Round-the-clock monitoring

By Robbie Frodsham

Wednesday 2 November 2022

In this next installment of my blog series on DevOps at EMIS [now Optum], I want to talk about the practice of proactive monitoring. This practice is all about preventing an incident before it occurs by watching the software and hardware that together make up a live service such as EMIS Web®.

Monitoring is often a hot topic for discussion, because most people only think about it when it fails to prevent an incident. However, on an average day, the EMIS technical teams catch and fix potential incidents before you would ever have known about them.

This is why it’s vital to provide round-the-clock monitoring. EMIS has dedicated IT operations teams who work 24 hours a day, 7 days a week, 365 days of the year, monitoring our live services to make sure they’re running for our users.

As well as the dedicated 24/7 operations teams, our engineering teams in the software development space also play a crucial role in monitoring our services and responding to issues that arise on them. We have dedicated technical specialists on call to support the operational teams, and other engineers working in the DevOps space, like myself, also take part in monitoring our services.

So you may be wondering, how do we monitor our live services?

Well, that’s done in lots of different ways. First, we must decide on a metric that we think is important for a given service. This may be how much disk space the server has, or how much memory it’s using.

Once you’ve defined the metric, you then need to establish what the value for that metric is under normal conditions. The answer may not be straightforward, depending upon the service. Take the online interface for Patient Access and the NHS App, for example: usage always peaks on Monday mornings, when new appointments are made available. At 4am, when everybody is in bed and virtually nobody is using the service, usage is much lower.
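To make the idea of "normal conditions" concrete, here is a minimal sketch of a baseline that depends on the day and hour. It is an illustration only, not how our monitoring tools actually work: the function names, the sample request counts and the `tolerance` parameter are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baseline(samples):
    """Group (weekday, hour, value) samples into {(weekday, hour): (mean, stdev)}."""
    buckets = defaultdict(list)
    for weekday, hour, value in samples:
        buckets[(weekday, hour)].append(value)
    return {slot: (mean(vals), pstdev(vals)) for slot, vals in buckets.items()}


def is_unusual(baseline, weekday, hour, value, tolerance=3.0):
    """True when value is more than `tolerance` standard deviations above the slot's mean."""
    avg, spread = baseline[(weekday, hour)]
    # Floor the spread at 1.0 so a perfectly flat history does not flag tiny changes.
    return value > avg + tolerance * max(spread, 1.0)


# Hypothetical request counts: Monday (weekday 0) at 08:00 is busy, Monday at 04:00 is quiet.
samples = [
    (0, 8, 9000), (0, 8, 9400), (0, 8, 9200),
    (0, 4, 40), (0, 4, 55), (0, 4, 45),
]
baseline = build_baseline(samples)

busy_slot_flagged = is_unusual(baseline, 0, 8, 9300)   # normal for a Monday morning
quiet_slot_flagged = is_unusual(baseline, 0, 4, 9300)  # very unusual at 4am
```

The same reading is treated as normal in the busy slot and as unusual in the quiet one, which is the point of establishing the baseline per time period rather than using one number for the whole week.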

Once you’ve established what the metric is, and what values to be worried about, you then need to tell somebody when it goes wrong. That is done by sending an alert to the technical teams through something like an instant messaging tool. The threshold that triggers an alert has to be carefully considered. It would be pointless, for example, to notify the technical teams when memory usage rises above 60% if this happens every Monday morning during the busy period. Engineers would become used to seeing alerts every Monday morning and could fail to spot a genuine alert mixed in with the ones that happen all the time and don’t indicate anything actually going wrong. This is what we call alert fatigue.
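One way to sanity-check a threshold before switching it on is to replay it against past readings and count how often it would have fired. The sketch below assumes a made-up history of peak memory usage for the Monday-morning busy period; the figures, the `breaches` helper and the 85% alternative are hypothetical and only there to illustrate the 60% example.

```python
def breaches(history, threshold):
    """Return the labels of the readings that exceed the threshold."""
    return [label for label, value in history if value > threshold]


# Hypothetical peak memory usage (%) during the Monday-morning busy period.
history = [
    ("week 1", 64),
    ("week 2", 66),
    ("week 3", 63),
    ("week 4", 91),
    ("week 5", 65),
]

noisy = breaches(history, 60)   # fires every Monday, so it would soon be ignored
useful = breaches(history, 85)  # fires only on the week that stands out
```

A threshold that would have fired on every reading in the history is a likely source of alert fatigue, while one that picks out only the unusual week is more likely to be acted on.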

Equally, you don’t want to set the threshold so that the alert arrives too late; it should give you enough time to react and prevent an incident. For example, with disk space, there is no point setting the threshold at the point you run out of disk space. That would ensure you never get a false positive, but by then it would be too late to prevent the failure. You want to set the alert at a level that allows you to respond proactively and resolve the problem before the space runs out and the service suffers a fault.
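As a rough sketch of "enough time to react", the disk space alert can be expressed in terms of how long is left rather than how much space is left. This assumes you can estimate how quickly the disk is filling, which is not something described above; the growth rate, the 24-hour lead time and the function names are all assumptions for illustration. `shutil.disk_usage` is the standard library call for reading free space.

```python
import shutil

GIB = 1024 ** 3


def current_free_bytes(path="/"):
    """Read the free space on the volume that holds `path`."""
    return shutil.disk_usage(path).free


def hours_until_full(free_bytes, growth_bytes_per_hour):
    """Estimate the hours left before the disk fills at the given growth rate."""
    if growth_bytes_per_hour <= 0:
        return float("inf")  # not growing, so it never fills at this rate
    return free_bytes / growth_bytes_per_hour


def should_alert(free_bytes, growth_bytes_per_hour, lead_time_hours=24):
    """Alert while there is still `lead_time_hours` or less left to respond."""
    return hours_until_full(free_bytes, growth_bytes_per_hour) <= lead_time_hours


# Hypothetical figures: 10 GiB free and filling at 1 GiB per hour leaves 10 hours.
alert_now = should_alert(10 * GIB, 1 * GIB)
# 100 GiB free at the same rate leaves 100 hours, so there is no need to alert yet.
alert_later = should_alert(100 * GIB, 1 * GIB)
```

In practice you would pass `current_free_bytes()` as the first argument. Setting the lead time to zero reproduces the "alert when the disk is already full" case, which never gives a false positive but also never gives anyone time to respond.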

This process is an iterative one. Teams are constantly reviewing and evolving their monitoring to ensure it strikes the best balance between creating too many alerts and not enough. Every time a service is added or changed, the monitoring has to be reviewed. In addition, we’re always looking at the tools we use to do our monitoring to see where we can improve.

About the author

Robbie Frodsham

Senior site reliability engineer

Robbie has worked in Healthcare IT with Optum for 15 years in a number of different roles and has extensive knowledge of IT infrastructure. In his current position, he works with our engineering teams as a DevOps engineer on both new and existing products. He's passionate about DevOps and sharing his operational expertise with others to improve the way Optum delivers solutions to its customers.

© 2026 Optum. All rights reserved.