Re: How do you measure the stability of storm topology in production environment？

Stephen Powis Thu, 27 Oct 2016 06:18:08 -0700

We use a StatsD metrics reporter feeding a Graphite cluster, and have built
out extensive graphs in Grafana.  On top of that we use Seyren for
alerting.  Right now we have alerts on the following:


- Spout lag greater than our defined SLAs
- Null reported spout lag, i.e. the topology stops reporting metrics (or
just isn't deployed) for a period of time.
- Failed tuple percentage, if it exceeds a threshold
- Throughput / number of executes - Our topologies should always be doing
something; they're never completely idle.  If we see throughput drop below
a threshold we'll be alerted.
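
To make the four checks concrete, here is a minimal Python sketch of the
decision logic only.  It is not our actual Seyren configuration: the
threshold values, function names and alert labels are all hypothetical, and
the inputs are assumed to be numbers already pulled from your metrics store
for one reporting window.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values; substitute numbers that match your own SLAs.
    max_spout_lag: float = 10_000.0   # how far behind the spout may fall
    max_failed_pct: float = 1.0       # percent of tuples allowed to fail
    min_executes: float = 100.0       # minimum executes per window


def failed_percentage(acked: int, failed: int) -> float:
    """Failed tuples as a percentage of all completed (acked + failed) tuples."""
    total = acked + failed
    return 0.0 if total == 0 else 100.0 * failed / total


def evaluate(spout_lag: Optional[float], acked: int, failed: int,
             executes: float, t: Thresholds = Thresholds()) -> List[str]:
    """Return the names of the alerts that should fire for one window."""
    alerts = []
    if spout_lag is None:
        # No data at all: topology stopped reporting or is not deployed.
        alerts.append("spout_lag_missing")
    elif spout_lag > t.max_spout_lag:
        alerts.append("spout_lag_high")
    if failed_percentage(acked, failed) > t.max_failed_pct:
        alerts.append("failed_pct_high")
    if executes < t.min_executes:
        # The topology should never be idle, so low throughput is an alert.
        alerts.append("throughput_low")
    return alerts


if __name__ == "__main__":
    print(evaluate(spout_lag=None, acked=0, failed=0, executes=0))
```

Treating a missing lag value as its own alert, rather than as zero lag, is
what catches a topology that has silently stopped reporting.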

Hope this helps!  Curious what others monitor/alert on.

Stephen

On Thu, Oct 27, 2016 at 2:49 AM, Chen Junfeng <[email protected]> wrote:

> What specifications will you use to measure it ?
>
>
>
>
>
> Regard
>
> Junfeng Chen
>
