I am also happy with max/min/99/90. And I buy your naming argument about staleness vs latency.
-Matthias

On 5/15/20 12:24 PM, Boyang Chen wrote:
> Hey Sophie,
>
> 90/99/min/max make sense to me.
>
> On Fri, May 15, 2020 at 12:20 PM Sophie Blee-Goldman <sop...@confluent.io> wrote:
>
>> @Matthias
>> Regarding tracking the 50th percentile, I'll refer you to the 4:53 mark of the video *you* linked: https://youtu.be/lJ8ydIuPFeU?t=293
>>
>> And that's what he says about the 95th percentile! Imagine what he would say about the 50th :P
>>
>> But seriously, since we can't seem to agree that the mean or 50th percentile is actually useful, I'm inclined to resurrect my original proposal: include neither. But I think that's a good argument against the 75th, which I admittedly chose somewhat arbitrarily as an intermediate between the 50th and the higher percentiles. How about:
>>
>> -max
>> -p99
>> -p90
>> -min
>>
>> with p50/mean still up for debate if anyone feels strongly for either of them.
>>
>> Regarding the name, I've already flip-flopped on this so I'm definitely still open to further arguments. But the reason for changing it from end-to-end-latency (which is similar to what you propose) is that this metric technically reflects how old (i.e. how "stale") the record is when it's *received* by the operator, not when it's processed by the operator. It seemed like there was the potential for confusion that "end-to-end-latency" might represent the latency from the event creation to the time the processor is done processing it.
>>
>> @John
>> I'd rather err on the side of "not-enough" metrics, as we can always add this metric to the stateless operators later on. If we decide to measure the time at every node, don't find any evidence of a serious performance impact, and users indicate they would like to see this metric at all nodes, then we can easily start reporting it there as well. WDYT?
>>
>> That said, sink nodes seem like a reasonable exception to the rule. Obviously users should be able to detect the time when the record reaches the output topic, but that still leaves a gap in understanding how long the production latency was. This mirrors the consumption latency that is exposed by the task-level metrics, which are measured at the source node. For good symmetry, what if we actually expose both the source and sink latency at the task level? I.e. report both sets of statistical measurements with the additional tag -source/-sink.
>>
>> @Bill
>> Thanks for the comment regarding the min! I hadn't considered that, and it's quite useful to think about how and what is useful from a user's point of view.
>>
>> Regarding your second point, I'm inclined to leave that as an implementation detail, but my take would be that the user should be allowed to control the record timestamp used for this with the timestamp extractor. My impression is that users may often embed the actual event time in the payload for whatever reason, and this represents the "true" timestamp as far as the Streams topology is concerned.
>>
>> On Fri, May 15, 2020 at 11:05 AM Bill Bejeck <bbej...@gmail.com> wrote:
>>
>>> Thanks for the KIP, Sophie, this will be a useful metric to add.
>>>
>>> Regarding tracking min, I think it could be valuable for users to discern which parts of their topologies are more efficient, since this is a task-level metric.
>>> I realize everyone seems to be on board with including min anyway, but I wanted to add my 2 cents on this topic in case we decide to revisit adding min later.
>>>
>>> I do have a question regarding the calculation of staleness. Is there going to be a consideration for timestamp extractors? Users could prefer to use a timestamp embedded in the payload, and it could skew the measurements. I was wondering if we should specify in the KIP whether the arrival time is always going to come from the record timestamp, or is this an implementation detail we can cover in the PR?
>>>
>>> Thanks!
>>> Bill
>>>
>>> On Fri, May 15, 2020 at 1:11 AM Matthias J. Sax <mj...@apache.org> wrote:
>>>
>>>> Thanks for the KIP Sophie.
>>>>
>>>> I think it's not useful to record the avg/mean; it's sensitive to outliers. We should rather track the median (50th percentile).
>>>>
>>>> Not sure if tracking min is useful, but I am also ok to track it.
>>>>
>>>> However, I find it odd to track the 75th percentile. Standard measures would be the 90th or 95th -- I guess we don't need both, so maybe picking the 90th might be more useful?
>>>>
>>>> About the name: "staleness" sounds really odd, and in fact the metric does capture "latency", so we should call it "latency". I understand the issue that we already have a latency metric. So maybe we could call it `record-e2e-latency-*`?
>>>>
>>>> While I agree that we should include out-of-order data (the KIP should talk about `out-of-order` data, not `late` data; data is only `late` if it's out-of-order and if it's dropped), I don't really understand why the new metric would help to configure grace period or retention time. As you mention in the KIP, both are defined as the max difference of `event-time - stream-time`, and thus the new metric, which takes system-/wallclock-time into account, does not seem to help at all.
>>>>
>>>> Btw: there is a great talk about "How NOT to Measure Latency" by Gil Tene: https://www.youtube.com/watch?v=lJ8ydIuPFeU
>>>>
>>>> -Matthias
>>>>
>>>> On 5/14/20 7:17 PM, John Roesler wrote:
>>>>> Hi Sophie,
>>>>>
>>>>> It seems like there would still be plenty of use cases for recording this metric at all processors and not just stateful ones, but I'm happy to suspend my arguments for now. Since you're proposing to keep them at the processor-node level, it will be seamless later to add in the stateless processors if we want. As a wise man once said, "Adding is always easier than removing."
>>>>>
>>>>> Regarding the time measurement, it's an implementation detail we don't need to consider in the KIP. Nevertheless, I'd greatly prefer to measure the system time again when recording the metric. I don't think we've seen any evidence that proves this would harm performance, and the amount of inaccuracy that using the cached system time could incur is actually substantial. But, if you want to just "not mention this" in the KIP, we can defer to the actual PR discussion, at which time we're in a better position to use benchmarks, etc., to make the call.
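(For reference, a minimal sketch of the trade-off discussed above -- this is not the actual Streams implementation; the class and sensor wiring here are hypothetical:)

import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.utils.Time;

// Hypothetical recording hook, assuming a pre-built staleness sensor.
class StalenessRecorder {
    private final Sensor stalenessSensor;
    private final Time time;

    StalenessRecorder(final Sensor stalenessSensor, final Time time) {
        this.stalenessSensor = stalenessSensor;
        this.time = time;
    }

    void record(final long recordTimestampMs) {
        // A fresh wallclock read per record is the accurate option argued for
        // here; the cheaper alternative would pass in a system time cached once
        // per task, which can lag behind by the task's whole processing time.
        final long nowMs = time.milliseconds();
        stalenessSensor.record(nowMs - recordTimestampMs);
    }
}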
>>>>> Along the lines of the measurement accuracy discussion, one minor thought I had is that maybe we should consider measuring the task staleness metric at the sink, rather than the source, so that it includes the processing latency of the task itself, not just the latency of everything up to, but not including, the task (which seems confusing for users). I guess this could also be an implementation detail, though.
>>>>>
>>>>> Thanks for the update,
>>>>> -John
>>>>>
>>>>> On Thu, May 14, 2020, at 13:31, Sophie Blee-Goldman wrote:
>>>>>> Hey all,
>>>>>>
>>>>>> After discussing with Bruno I'd like to propose a small amendment, which is to record the processor-node-level metrics only for *stateful operators*. They would still be considered a "processor-node-level" metric and not a "state-store-level" metric, as the staleness is still a property of the node rather than of the state itself. However, it seems that this information is primarily useful for stateful operators that might be exposing state via IQ or are otherwise dependent on the record time, unlike a stateless operator.
>>>>>>
>>>>>> It's worth calling out that recent performance improvements to the metrics framework mean that we no longer fetch the system time at the operator level, but only once per task. In other words, the system time is not updated between each process as a record flows through the subtopology, so debugging the processor-level latency via the staleness will not be possible. Note that this doesn't mean the operator-level metrics are not *useful* relative to the task-level metric. Upstream caching and/or suppression can still cause a record's staleness at some downstream stateful operator to deviate from the task-level staleness (recorded at the source node).
>>>>>>
>>>>>> Please let me know if you have any concerns about this change. The KIP has been updated with the new proposal.
>>>>>>
>>>>>> On Thu, May 14, 2020 at 3:04 AM Bruno Cadonna <br...@confluent.io> wrote:
>>>>>>
>>>>>>> Hi Sophie,
>>>>>>>
>>>>>>> Thank you for the KIP.
>>>>>>>
>>>>>>> The KIP looks good to me.
>>>>>>>
>>>>>>> 50th percentile:
>>>>>>> I think we do not need it now. If we need it, we can add it. Here the old truism applies: Adding is always easier than removing.
>>>>>>>
>>>>>>> processor-node-level metrics:
>>>>>>> I think it is good to have the staleness metrics also on the processor-node level. If we do not want to record them on all processor nodes, you could restrict the recording to stateful processor nodes, since those are the ones that would benefit most from the staleness metrics.
>>>>>>>
>>>>>>> Best,
>>>>>>> Bruno
>>>>>>>
>>>>>>> On Thu, May 14, 2020 at 4:15 AM Sophie Blee-Goldman <sop...@confluent.io> wrote:
>>>>>>>>
>>>>>>>> Yeah, the specific reason was just to align with the current metrics.
>>>>>>>>
>>>>>>>> Is it better to conform than to be right? History has a lot to say on that matter, but I'm not sure how much of it applies to the fine details of metrics naming :P
>>>>>>>>
>>>>>>>> More seriously, I figured if people are looking at this metric they're likely to be looking at all the others.
>>>>>>>> Then naming this one "-mean" would probably lead some to conclude that the "-avg" suffix in the other metrics has a different meaning.
>>>>>>>>
>>>>>>>> As for the percentiles, I actually like p99 (and p75) better. I'll swap that out.
>>>>>>>>
>>>>>>>> On Wed, May 13, 2020 at 7:07 PM John Roesler <vvcep...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Thanks Sophie,
>>>>>>>>>
>>>>>>>>> I hope this isn't too nit-picky, but is there a reason to choose "avg" instead of "mean"? Maybe this is too paranoid, and I might be oversensitive because of the mistake I just made earlier, but it strikes me that "avg" is actually ambiguous, as it refers to a family of statistics, whereas "mean" is specific. I see other Kafka metrics with "avg", but none with "mean"; was that the reason? If so, I'm +1.
>>>>>>>>>
>>>>>>>>> Regarding the names of the percentiles, I actually couldn't find _any_ other metrics that use percentiles. Was there a reason to choose "99th" as opposed to "p99" or any other scheme? This is not a criticism, I'm just primarily asking for consistency's sake.
>>>>>>>>>
>>>>>>>>> Thanks again,
>>>>>>>>> -John
>>>>>>>>>
>>>>>>>>> On Wed, May 13, 2020, at 19:19, Sophie Blee-Goldman wrote:
>>>>>>>>>> Alright, I can get behind adding the min metric for the sake of pretty graphs (and trivial computation).
>>>>>>>>>>
>>>>>>>>>> I'm still on the fence regarding the mean (or 50th percentile), but I can see how users might expect it and find it a bit disorienting not to have. So the updated proposed metrics are:
>>>>>>>>>>
>>>>>>>>>> - record-staleness-max [ms]
>>>>>>>>>> - record-staleness-99th [ms] *(99th percentile)*
>>>>>>>>>> - record-staleness-75th [ms] *(75th percentile)*
>>>>>>>>>> - record-staleness-avg [ms] *(mean)*
>>>>>>>>>> - record-staleness-min [ms]
>>>>>>>>>>
>>>>>>>>>> On Wed, May 13, 2020 at 4:42 PM John Roesler <vvcep...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Oh boy, I never miss an opportunity to embarrass myself. I guess the mean seems more interesting to me than the median, but neither is as interesting as the higher percentiles (99th and max).
>>>>>>>>>>>
>>>>>>>>>>> Min isn't really important for any SLAs, but it does round out the mental picture of the distribution. I've always graphed min along with the other metrics to help me understand how fast the system can be, which helps in optimization decisions. It's also a relatively inexpensive metric to compute, so it might be nice to just throw it in.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 13, 2020, at 18:18, Sophie Blee-Goldman wrote:
>>>>>>>>>>>> G1:
>>>>>>>>>>>> I was considering it as the "end-to-end latency *up* to the specific task", but I'm happy with "record-staleness" if that drives the point home better.
>>>>>>>>>>>> So it's the "staleness of the record when it is received by that task" -- will update the KIP.
>>>>>>>>>>>>
>>>>>>>>>>>> B1/J:
>>>>>>>>>>>> I'm struggling to imagine a case where the min would actually be useful, rather than just intellectually interesting. I don't feel strongly that we shouldn't add it, but that's why I didn't include it from the start. Can you enlighten me with an example?
>>>>>>>>>>>>
>>>>>>>>>>>> I was also vaguely concerned about the overhead of adding multiple percentile metrics. Do we have any data to indicate what kind of performance hit we take on metrics computation?
>>>>>>>>>>>>
>>>>>>>>>>>> Also, not to be too pedantic, but the 50th percentile would be the median, not the mean. Would you propose to add the mean *and* the 50th percentile, or just one of the two?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks all!
>>>>>>>>>>>> Sophie
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 13, 2020 at 3:34 PM John Roesler <vvcep...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hello all, and thanks for the KIP, Sophie,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just some comments on the discussion so far:
>>>>>>>>>>>>>
>>>>>>>>>>>>> B2/G1:
>>>>>>>>>>>>> In principle, it shouldn't matter whether we report "spans" or "end-to-end" latency. But in practice, some of the spans are pretty difficult to really measure (like time spent waiting in the topics, or time from the event happening to the ETL producer choosing to send it, or time spent in send/receive buffers, etc., etc.).
>>>>>>>>>>>>>
>>>>>>>>>>>>> In other words, it's practically easier to compute spans by subtracting e2e latencies than it is to compute e2e latencies by adding spans. You can even just consider that the span computation from e2e always just involves subtracting two numbers, whereas computing e2e latency from spans involves adding _all_ the spans leading up to the end you care about.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It seems like people really prefer to have spans when they are debugging latency problems, whereas e2e latency is a more general measurement that basically every person/application cares about and should be monitoring.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Altogether, it really seems to provide more value to more people if we report e2e latencies. Regarding "record-staleness" as a name, I think I have no preference; I'd defer to other people's intuition.
>>>>>>>>>>>>>
>>>>>>>>>>>>> G2:
>>>>>>>>>>>>> I think the processor-node metric is nice, since the inside of a task can introduce a significant amount of latency in some cases.
>>>>>>>>>>>>> Plus, it's a more direct measurement, if you really wanted to know (for the purposes of IQ or something) how long it takes source events to "show up" at the store.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think actually recording it at every processor could be expensive, but we already record a bunch of metrics at the node level.
>>>>>>>>>>>>>
>>>>>>>>>>>>> B1:
>>>>>>>>>>>>> I think 50% could be reasonable to record also. Even if it's a poor metric for operational purposes, a lot of people might expect to see "mean". Actually, I was surprised not to see "min". Is there a reason to leave it off?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I might suggest:
>>>>>>>>>>>>> min, mean (50th), 75th, 99th, max
>>>>>>>>>>>>>
>>>>>>>>>>>>> B3:
>>>>>>>>>>>>> I agree we should include late records (though not the ones we drop). It may be spiky, but only when there are legitimately some records with a high end-to-end latency, which is the whole point of these metrics.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That's it! I don't think I have any other feedback, other than a request to also report "min".
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -John
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 13, 2020, at 16:58, Guozhang Wang wrote:
>>>>>>>>>>>>>> Thanks Sophie for the KIP, a few quick thoughts:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1) The end-to-end latency includes both the processing latency of the task and the latency spent sitting in intermediate topics. I have a similar feeling as Boyang mentioned above that the latency metric of a task A actually measures the latency of the sub-topology up to, but not including, the processing of A, which is a bit weird.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe my feeling comes from the name "latency" itself, since today we already have several "latency" metrics which are measuring elapsed system-time for processing a record / etc., while here we are comparing the system wallclock time with the record timestamp.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe we can consider renaming it as "record-staleness" (note we already have a "record-lateness" metric), in which case recording at the system-time before we start processing the record sounds more natural.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2) With that in mind, I'm wondering if the processor-node-level DEBUG metric is worth adding, given that we already have a task-level processing latency metric. Basically, a specific node's e2e latency is similar to the task-level e2e latency + task-level processing latency. Personally I think having a task-level record-staleness metric is sufficient.
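(To make the decomposition and span arithmetic discussed above concrete -- illustrative numbers only:)

// A node's e2e staleness is roughly the task-level staleness (measured at the
// source node) plus the processing latency inside the task, and a "span" is
// just the difference of two e2e measurements, per the points above.
long taskStalenessMs = 150;          // e2e staleness at the task's source node
long taskProcessingLatencyMs = 20;   // existing task-level latency metric
long lastNodeStalenessMs = taskStalenessMs + taskProcessingLatencyMs; // ~170 ms

long e2eAtTaskA = 150, e2eAtTaskB = 400;  // ms, with A upstream of B
long spanAtoB = e2eAtTaskB - e2eAtTaskA;  // 250 ms spent in A plus the topic between them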
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, May 13, 2020 at 11:46 AM Sophie Blee-Goldman <sop...@confluent.io> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. I felt that 50% was not a particularly useful gauge for this specific metric, as it's presumably most useful at putting an *upper* bound on the latency you can reasonably expect to see. I chose percentiles that would hopefully give a good sense of what *most* records will experience, and what *close to all* records will.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, I'm not married to these specific numbers and could be convinced. Would be especially interested in hearing from users on this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. I'm inclined to not include the "hop-to-hop latency" in this KIP since users can always compute it themselves by subtracting the previous node's end-to-end latency. I guess we could do it either way since you can always compute one from the other, but I think the end-to-end latency feels more valuable, as its main motivation is not to debug bottlenecks in the topology but to give users a sense of how long it takes a record to be reflected in certain parts of the topology. For example this might be useful for users who are wondering roughly when a record that was just produced will be included in their IQ results. Debugging is just a nice side effect -- but maybe I didn't make that clear enough in the KIP's motivation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. Good question, I should address this in the KIP. The short answer is "yes", we will include late records. I added a paragraph to the end of the Proposed Changes section explaining the reasoning here, please let me know if you have any concerns.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 4. Assuming you're referring to the existing metric "process-latency", that metric reflects the time for the literal Node#process method to run, whereas this metric would always be measured relative to the event timestamp.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That said, the naming collision there is pretty confusing, so I've renamed the metrics in this KIP to "end-to-end-latency", which I feel better reflects the nature of the metric anyway.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the feedback!
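(Since the metric is measured relative to the event timestamp, the timestamp-extractor question raised up-thread is directly relevant: staleness would be computed against whatever the configured extractor returns. A sketch of a payload-based extractor follows; the EventWithTimestamp type is hypothetical, invented for illustration:)

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

// Hypothetical payload type carrying an embedded event time.
interface EventWithTimestamp {
    long eventTimeMs();
}

public class PayloadTimestampExtractor implements TimestampExtractor {
    @Override
    public long extract(final ConsumerRecord<Object, Object> record, final long partitionTime) {
        final Object value = record.value();
        if (value instanceof EventWithTimestamp) {
            // The "true" event time embedded in the payload becomes the stream
            // time, and hence the basis for the e2e latency/staleness measurement.
            return ((EventWithTimestamp) value).eventTimeMs();
        }
        return record.timestamp(); // fall back to the producer/broker timestamp
    }
}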
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, May 13, 2020 at 10:21 AM Boyang Chen <reluctanthero...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for the KIP Sophie. Getting the E2E latency is important for understanding the bottleneck of the application.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A couple of questions and ideas:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Could you clarify the rationale for picking the 75th and 99th percentiles and max? Normally I see cases where we use the 50th and 90th percentiles as well in production systems.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. The current latency being computed is cumulative, i.e. if a record goes through A -> B -> C, then P(C) = T(B->C) + P(B) = T(B->C) + T(A->B) + T(A) and so on, where P() represents the captured latency, and T() represents the time for transiting the records between two nodes, including processing time. For monitoring purposes, maybe having T(B->C) and T(A->B) is more natural to view as "hop-to-hop latency"; otherwise, if there is a spike in T(A->B), both P(B) and P(C) are affected at the same time. In the same spirit, the E2E latency is meaningful only when the record exits from the sink, as this marks the whole time this record spent inside the funnel. Do you think we could have separate treatment for sink nodes and other nodes, so that other nodes only count the time receiving the record from the last hop? I'm not proposing a solution here, just want to discuss this alternative to see if it is reasonable.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3. As we are going to monitor late arrival records as well, they would create some really spiky graphs when the out-of-order records are interleaving with on-time records. Should we also supply a smoothed version of the latency metrics, or should users just take care of it by themselves?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 4. Regarding this new metric, we haven't discussed its relation with our existing processing latency metrics. Could you add some context on the comparison and a simple `when to use which` tutorial for the best?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Boyang
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, May 12, 2020 at 7:28 PM Sophie Blee-Goldman <sop...@confluent.io> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hey all,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'd like to kick off discussion on KIP-613, which aims to add end-to-end latency metrics to Streams.
>>>>>>>>>>>>>>>>> Please take a look:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-613%3A+Add+end-to-end+latency+metrics+to+Streams
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Sophie
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> -- Guozhang
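(For reference on the percentile-overhead question raised in this thread: Kafka's metrics library already provides a fixed-size histogram stat, so registering the agreed max/p99/p90/min set might look roughly like the sketch below. The group and sensor names are illustrative, not the final KIP names.)

import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Max;
import org.apache.kafka.common.metrics.stats.Min;
import org.apache.kafka.common.metrics.stats.Percentile;
import org.apache.kafka.common.metrics.stats.Percentiles;
import org.apache.kafka.common.metrics.stats.Percentiles.BucketSizing;

public class StalenessMetricsSketch {
    public static void main(final String[] args) {
        final Metrics metrics = new Metrics();
        final Sensor sensor = metrics.sensor("record-staleness");
        final String group = "stream-task-metrics"; // illustrative group name

        sensor.add(metrics.metricName("record-staleness-min", group), new Min());
        sensor.add(metrics.metricName("record-staleness-max", group), new Max());
        // Percentiles maintains a fixed-size bucketed histogram, so the memory
        // cost is bounded by sizeInBytes; values above `max` land in the last bucket.
        sensor.add(new Percentiles(
            4_000,            // sizeInBytes: memory budget for the histogram
            10 * 60 * 1000d,  // max expected staleness, in ms
            BucketSizing.LINEAR,
            new Percentile(metrics.metricName("record-staleness-p90", group), 90),
            new Percentile(metrics.metricName("record-staleness-p99", group), 99)));

        sensor.record(42.0); // one staleness observation, in ms
    }
}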