IMO, avoiding any dependence on clock differences between machines makes
total sense. But I feel that this is a tricky question.


Couple more thoughts:

1) As per
http://storm.apache.org/releases/current/Guaranteeing-message-processing.html

"Storm can detect when the tree of tuples is fully processed and can ack
or fail the spout tuple appropriately."

That seems to indicate that once the ACKer has received all the necessary
acks, it considers the tuple fully processed. If we go by that, and we
define complete latency as the time taken to fully process a tuple, then
it is not necessary to include the time it takes for the ACK to be
delivered to the spout.


2) If you include the time it takes to deliver the ACK to the spout, then
we also need to wonder whether we should include the time that the spout
takes to process the ack() call. I am unclear what it means for the idea
of 'fully processed' if spout.ack() throws an exception. Here you can
compute the delta either immediately before OR immediately after ack() is
invoked on the spout (see the sketch below).
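
To make the two measurement points concrete, here is a rough sketch. It is
not Storm's actual executor code; the method and parameter names are made
up, and only ISpout.ack(Object) is taken from the real API:

// Hypothetical sketch, inside the spout executor, of where the
// complete-latency delta could be computed around spout.ack().
void onAckFromAcker(org.apache.storm.spout.ISpout spout, Object msgId,
                    long rootEmitTimeMs) {
    long deltaBeforeMs = System.currentTimeMillis() - rootEmitTimeMs; // before ack()
    spout.ack(msgId);                                                 // user ack() logic runs here
    long deltaAfterMs  = System.currentTimeMillis() - rootEmitTimeMs; // after ack()
    // (deltaAfterMs - deltaBeforeMs) is exactly the time spent inside
    // spout.ack(), i.e. the portion that gets folded into complete latency
    // only if we measure after ack() returns.
}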


The benefit of including the spout's ack() processing time is that any
optimizations/inefficiencies in the spout's ack() implementation will be
detectable.

I wonder if we should split this into two different metrics (a rough
sketch follows the list)...

- "delivery latency" (which ends once the ACKer receives the last ACK
  from a bolt) and
- "complete latency" (which also includes the ack() processing time in
  the spout)


 -roshan



On 4/28/16, 8:59 AM, "Jungtaek Lim" <[email protected]> wrote:

>Hi devs,
>
>While thinking about metrics improvements, I doubt how many users know
>what 'complete latency' exactly is. In fact, it's somewhat complicated,
>because additional waiting time can be added to complete latency due to
>the single-threaded event loop model of the spout.
>
>A long-running nextTuple() / ack() / fail() can affect complete latency,
>but it happens behind the scenes. No latency information is provided for
>it, and some users don't even know about this characteristic. Moreover,
>calling nextTuple() can be skipped due to max spout pending, which makes
>it harder to reason about even if an avg. latency of nextTuple() were
>provided.
>
>I think separating the threads (moving the tuple handler to a separate
>thread, as JStorm does) would close the gap, but it requires our spout
>logic to be thread-safe, so I'd like to find a workaround first.
>
>My sketched idea is to let the Acker decide the end time for the root tuple.
>There are two possible ways to decide the start time for the root tuple:
>
>1. When the Spout is about to emit ACK_INIT to the Acker (in other words,
>keep it as it is)
>  - The Acker sends the ack / fail message to the Spout with a timestamp,
>and the Spout calculates the time delta
>  - pros: It's the most accurate way, since it respects the definition of
>'complete latency'.
>  - cons: Clock sync between machines becomes very important;
>millisecond precision would be required.
>2. When the Acker receives ACK_INIT from the Spout
>  - The Acker calculates the time delta itself, and sends the ack / fail
>message to the Spout with the time delta
>  - pros: No requirement to sync the clocks between servers so strictly.
>  - cons: It doesn't include the latency to send / receive ACK_INIT
>between the Spout and the Acker.
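
[For illustration, option 2 above could look roughly like this on the
Acker side. The class and method names are made up and this is not Storm's
actual acker code, only a sketch of the idea: the Acker timestamps the root
tuple when ACK_INIT arrives and computes the delta itself, so Spout and
Acker clocks never need to agree.]

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of option 2; not actual Storm internals.
class AckerSketch {
    static class RootTupleState {
        long ackInitReceivedMs;  // Acker-local clock, set when ACK_INIT arrives
        long ackVal;             // XOR of all anchor/ack ids seen so far
    }

    private final Map<Long, RootTupleState> pending = new HashMap<>();

    void onAckInit(long rootId, long initialAckVal) {
        RootTupleState state = new RootTupleState();
        state.ackInitReceivedMs = System.currentTimeMillis();
        state.ackVal = initialAckVal;
        pending.put(rootId, state);
    }

    void onAck(long rootId, long ackVal) {
        RootTupleState state = pending.get(rootId);
        state.ackVal ^= ackVal;
        if (state.ackVal == 0) {                   // tuple tree fully acked
            long deltaMs = System.currentTimeMillis() - state.ackInitReceivedMs;
            pending.remove(rootId);
            sendAckToSpout(rootId, deltaMs);       // Spout just records deltaMs
        }
    }

    // Hypothetical stand-in for sending the ack message (with the delta)
    // back to the Spout's task.
    void sendAckToSpout(long rootId, long deltaMs) { /* transport omitted */ }
}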
>
>Sure, we could leave it as-is if we decide it doesn't hurt much.
>
>What do you think?
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)
