Hello devs,

We recently had an incident where the master was overloaded by the
scheduler's ACKNOWLEDGE requests, causing HTTP API latencies to spike.
I have two questions:
- What is the best way to instrument the HTTP API to emit latency metrics?
- What is the best way to monitor the master's load, in addition to the API
latencies?

Apparently, monitoring CPU doesn't help much, as the master will never
saturate a machine with more than 2 CPUs. Any guidance on this would be
much appreciated.

Thanks!
Eric
