
Notification for slow response time #52

It would be useful to detect when a site responds too slowly.

All that is needed is a simple time value for the delay threshold. If the site responded after that delay, a warning would be sent.

This would allow us to detect an anomaly on a server and distinguish between a slow server and a down one.

2 years ago
Changed the status to Idea
2 years ago

Hello, thanks for the suggestion. We will consider adding it. Thanks!

2 years ago

Yes, we need this too. For us the monitoring is nearly useless without this feature.

2 years ago

Degradation of service is generally a great indicator that something more serious is about to happen. I would really like to see this implemented too.

2 years ago

This is a neat feature but how would you handle the fact that the response time differs based on the location from where it is called?

2 years ago

Hi Niels, you could set it to something like 50-100% above a normal page load time. For me, 400ms is typical; if it goes above 600ms, I know there could be an issue. Another option is to get an alert if it goes 100% above the current 30-day average.
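
As a rough illustration of the two rules suggested above (a fixed margin over a normal load time, or an alert at 100% above a rolling average), here is a minimal Python sketch. The function names and sample values are hypothetical and not part of any UptimeRobot API; the 30-day average is represented simply as a list of past response times.

```python
from statistics import mean


def threshold_from_baseline(baseline_ms: float, margin: float = 0.5) -> float:
    """Warning threshold = normal response time plus a fractional margin.

    margin=0.5 means "50% above normal"; margin=1.0 means "100% above".
    """
    return baseline_ms * (1.0 + margin)


def exceeds_average(current_ms: float, history_ms: list[float],
                    factor: float = 2.0) -> bool:
    """True when the current response time is above `factor` times the
    mean of the recorded history (factor=2.0 is "100% above average")."""
    return current_ms > factor * mean(history_ms)


# Hypothetical numbers taken from the comment: 400ms typical, 50% margin.
print(threshold_from_baseline(400, margin=0.5))
print(exceeds_average(900, [380, 400, 420]))
```

Because the threshold is derived from each monitor's own baseline, a check that is normally slower from a distant location would get a proportionally higher threshold rather than sharing one fixed value.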

2 years ago

I would also benefit from this feature. Reporting a site as down is not helpful when the latency is too high.

a year ago

A timeout means giving up: if there is no response within X time, the thing being monitored is assumed to be DOWN.

We want a warning, while it is still up, that things are getting slow. We do not want to mark things as down (dead in the water).

I’ll give you a few examples:

A. For us, we have a backup link that uses the mobile phone network. Unfortunately due to local conditions, service is not great. We have a booster for the signal.

When the booster becomes problematic, or a remote tower is selected as the best one, our link stays up and our monitor continues to work, but the latency goes from ~250ms to between 2000 and 7000ms. That jump is the indicator.

Because the service is still fully operational, this is considered OK, it is indeed “up”, as a response will be received - but it is an early warning, and we can resolve this before an actual outage occurs. They are two distinct states - slow is not the same as down.

B. If you’re running a website and it is in the midst of being hit hard, but still functional, it may be slower to respond. By having an early warning, we can respond to this and get on top of it before it is down.

Again, this is a distinct state - and if we catch it early then that gives us time to respond. It is distinct from being down - it’s a great warning indicator to have. And we don’t want to consider the site down just because the latency / response time has gotten slower. It isn’t down, and down may be an emergency, whereas slow may be more of an “OK, no panic, but let’s get on top of this before it creates a problem.” We may need to add more capacity, for example.

C. Some applications, like database replication are sensitive to latency. If we start to see higher latency to a remote site, we may wish to take early action to either resolve it or take the nodes out of the cluster, and then work to resolve the issue (or work with our providers to do that).

Again, here, the early warning is extremely useful - and the site is not down, so we don’t want to call it down (which a very short timeout could do). We want to know when we’re exceeding a threshold so we are warned about it - but not panicked by it. That allows us to take preventative action before the problem gets worse.
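
To make the distinction in the examples above concrete, here is a minimal Python sketch of a three-state check: UP, SLOW (warn, but still up), and DOWN (timed out or no response). The `State` enum, the `classify` function, and the threshold values are assumptions for illustration only; they are not how UptimeRobot models monitor state.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    UP = "up"
    SLOW = "slow"   # responding, but above the warning threshold
    DOWN = "down"   # no response within the timeout


def classify(response_ms: Optional[float], warn_ms: float,
             timeout_ms: float) -> State:
    """Classify one check result.

    response_ms is None when no response arrived at all.
    """
    if response_ms is None or response_ms >= timeout_ms:
        return State.DOWN
    if response_ms > warn_ms:
        return State.SLOW
    return State.UP


# Hypothetical values based on example A: ~250ms is normal,
# 2000-7000ms means the backup link is degraded but still up.
for sample in (250, 2000, 7000, None):
    print(sample, classify(sample, warn_ms=1000, timeout_ms=30000).value)
```

The point of keeping SLOW separate is that it can be routed to a low-urgency channel, while DOWN keeps its existing emergency notification.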

Hope that helps clarify, Tom.

Geoff

8 months ago

Thanks Geoff.

What about adding two monitors for the same website, with different Monitor Timeout settings and different notification settings? You could set one monitor to a 5-second Monitor Timeout, but have it notify an email address that just updates a Slack channel (early warning)… and the other monitor to a full 1-minute Monitor Timeout, but have that one notify phones (alert, we’re down).

I just tried this and it seems to work; I have two monitors for the site listed now, one named “CUSTOMER” and the other “CUSTOMER SLOW,” with different Monitor Timeouts, and email notification only for “CUSTOMER SLOW.”

This does cost me an additional monitor, and I agree both the extra work and extra expense would be avoided in an ideal world.

I do think it’s a good idea to do some deliberate testing of Monitor Timeout to make sure the feature always works as advertised. I may set up a page that intentionally takes 10 seconds to respond, just to confirm a 5-second Monitor Timeout will fail as it should.
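
For the deliberate test described above, a page that intentionally responds slowly can be sketched with Python's standard library alone. This is only an illustration: the helper names and the delay are hypothetical, the server binds to 127.0.0.1, and it would need to be exposed on a publicly reachable address before an external monitor could check it.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_slow_handler(delay_s: float):
    """Build a request handler that waits delay_s seconds before replying."""

    class SlowHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            time.sleep(delay_s)  # simulate a slow backend
            body = b"ok\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the console quiet

    return SlowHandler


def start_slow_server(delay_s: float, port: int = 0) -> ThreadingHTTPServer:
    """Start the slow server in a background thread; port=0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_slow_handler(delay_s))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    srv = start_slow_server(delay_s=10)  # 10-second delay, as in the comment
    print("Slow page on port", srv.server_address[1])
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        srv.shutdown()
```

Pointing a monitor with a 5-second Monitor Timeout at such a page should report a failure, while a monitor with a 1-minute timeout should still see it as up.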

8 months ago

Hi Tom,

It’s a good idea, and it may work for your circumstances, but here are the challenges with that approach that I see (for our use cases, anyway):

(1) You immediately double your monitors for anywhere you care about latency - which isn’t necessarily the biggest deal, but it’s a little annoying

(2) More importantly, it doesn’t really serve the use cases I mentioned - minimum timeout is 1 second on a monitor.

Because we’re talking about latency, a jump of even 50ms is significant in some cases (like the DB replication case). With synchronous replication, every write/commit transaction is slowed by an extra 50ms. Make that 500ms and it blows out everything, yet your monitor is still quite happy: response in < 1000ms, monitor OK, BIG warning sign completely missed.

The same applies to our WAN link: anything under a 1-second ping response produces no warning. For a lot of cases that would work for us, but not all. Over 1 Gbps/1 Gbps fibre you have a fairly consistent ping, but a jump from 20ms to 999ms would still be an indicator we’re in trouble, and we’d miss it until we were in potentially even more trouble.

For a website, a response time over 1 second would be an OK indicator that things have gone awry, so the workaround might do for that case (for us).

(3) It’s plausible (although not overly likely) that two monitors firing at the exact same time, for the same service, to test the same thing could introduce a smidgeon of latency, but with network links as they are… unlikely to cause real grief :) So that’s more of a theoretical concern than a practical one.

(4) Not a challenge, but a point on why _to_ implement the warning on latency exceeding a threshold (which the UptimeRobot team have committed to above, as we have seen, which is great!): we already have recorded latency information in UptimeRobot. Every monitor has it; it’s already built into the system. So what we’re asking for here, a warning before DOWN, uses existing data that is already available. That should hopefully simplify implementation of the feature (we’re just reading existing data and seeing whether we’ve exceeded tolerance or not) and reduce overheads on us, as we can tie the warning in to existing monitors. There is no doubling up of effort, and we can set sub-second thresholds, which is typical when we’re concerned about latency (response time) rather than availability of a given service.

Geoff

8 months ago

That all makes sense.

7 months ago
Changed the status to In Progress
2 months ago

Amazing! Can’t wait

2 months ago

Be careful

a month ago

PLEASE tell me this is coming soon! What a game-changer this will be…

22 days ago