Hi,

This gives a great explanation. Thank you very much.

I am using Storm version 1.1.0. Like you mention, I shall change the value of
topology.stats.sample.rate and analyze.

Thanks,
Preethini

On Tue, Jul 25, 2017 at 4:34 PM, Bobby Evans <[email protected]> wrote:

> Latency calculations in Storm are a bit hairy. If you really want them to
> be accurate (at the cost of performance), you need to set the
> topology.stats.sample.rate config to 1.0. Otherwise, by default, we will
> randomly sub-sample 5% of the tuples and scale the result up accordingly.
> The complete latency is calculated using the class
>
> https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/metric/internal/MultiLatencyStatAndMetric.java
>
> on a per-spout basis. I don't remember exactly how they are combined into
> the final number that appears on the UI, but I think it is just a simple
> average. When a tuple is emitted, a timestamp is attached to the tuple:
>
> https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/executor/spout/SpoutOutputCollectorImpl.java#L128
>
> When the tuple is fully acked, or marked as failed, the time taken is
> calculated and recorded.
>
> In older versions of Storm this was done in the spout, to be sure that the
> same clock was used everywhere. In newer versions of Storm the delta is
> calculated by the acker, from the time it gets the first message to the
> time it gets the last message.
>
> Without knowing which version of Storm you are using, it is hard to tell
> why these numbers might be off.
>
> - Bobby
>
>
> On Tuesday, July 25, 2017, 5:23:33 AM CDT, preethini v
> <[email protected]> wrote:
>
> Also,
>
> Any hints on how the Storm metrics calculate the "complete latency"?
>
> Thanks,
> Preethini
>
> On Tue, Jul 25, 2017 at 9:48 AM, preethini v <[email protected]> wrote:
>
> Hi Bobby,
>
> I am running a simple word count topology. I have 2 worker nodes and a
> nimbus/zookeeper node. The latency between the nodes is < 1 ms.
>
> I have synced the clocks of all 3 nodes using NTP. Is this not sufficient?
>
> Thanks,
> Preethini
>
> On Mon, Jul 24, 2017 at 5:22 PM, Bobby Evans <[email protected]> wrote:
>
> It is really hard to tell without more information. Off the top of my
> head, it might have something to do with the system time on different
> hosts. Getting the current time in milliseconds is full of issues,
> especially with leap seconds etc., but it is even more problematic between
> machines because the time is not guaranteed to be synced very closely.
> That would be my first guess. If they are all on the same machine (you are
> not switching hosts), then my next guess would be a bug in the code
> somewhere, or a misinterpretation of the results.
>
> Do you have a reproducible use case that you can share?
>
> - Bobby
>
>
> On Monday, July 24, 2017, 10:13:59 AM CDT, preethini v
> <[email protected]> wrote:
>
> Hi,
>
> I measure the latency of a Storm topology in the two ways below, and I see
> a huge difference in the values.
>
> *Approach 1*: Attach a start time to every tuple. Note the end time for
> that tuple in ack(). Calculate the delta between the start and end times.
>
> Latency value is ~104 ms.
>
> *Approach 2*: Use the Storm UI parameter "complete latency" to measure
> latency.
>
> Latency value is ~2-3 ms.
>
> Could someone please explain why there is such a huge difference in the
> latency calculations?
> If not on a timestamp basis, how does Storm's internal metrics system
> calculate the complete latency?
>
> Thanks,
> Preethini
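[Editor's note] For readers following the thread, here is a minimal sketch of
"Approach 1" from the original question: a spout that remembers when each
tuple was emitted and measures the wall-clock delta when ack() is called. The
class name, message-id bookkeeping, and word list are hypothetical; only the
Storm spout API calls are taken from the library itself.

import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

// Hypothetical spout illustrating Approach 1: remember when each tuple was
// emitted, then measure the wall-clock delta when ack() (or fail()) arrives.
public class TimedWordSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private final Map<String, Long> emitTimes = new ConcurrentHashMap<>();
    private final String[] words = {"the", "quick", "brown", "fox"};
    private int index = 0;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        String msgId = UUID.randomUUID().toString();
        emitTimes.put(msgId, System.currentTimeMillis());
        // Emitting with a message id anchors the tuple, so ack()/fail() is
        // called once the whole tuple tree completes.
        collector.emit(new Values(words[index++ % words.length]), msgId);
    }

    @Override
    public void ack(Object msgId) {
        Long start = emitTimes.remove(msgId);
        if (start != null) {
            long latencyMs = System.currentTimeMillis() - start;
            System.out.println("tuple completed in " + latencyMs + " ms");
        }
    }

    @Override
    public void fail(Object msgId) {
        emitTimes.remove(msgId);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}

As Bobby explains above, the UI's "complete latency" is computed differently
(sub-sampled by default, and in newer versions measured by the acker), so a
gap between this per-tuple number and the UI value is not by itself a bug.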
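[Editor's note] And, tying back to the change mentioned at the top of the
thread, a sketch of forcing the stats sample rate to 1.0 when submitting a
topology. The component and topology names are illustrative, and the spout
reused here is the hypothetical TimedWordSpout from the sketch above; a real
word count topology would also wire in its bolts.

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class SubmitWithFullSampling {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // Illustrative wiring only; split/count bolts omitted.
        builder.setSpout("timed-word-spout", new TimedWordSpout(), 1);

        Config conf = new Config();
        // topology.stats.sample.rate = 1.0: record every tuple instead of the
        // default 5% sub-sample, trading some throughput for accurate UI latencies.
        conf.put(Config.TOPOLOGY_STATS_SAMPLE_RATE, 1.0);

        StormSubmitter.submitTopology("word-count", conf, builder.createTopology());
    }
}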
