Re: [DISCUSS] KIP-489 Kafka Consumer Record Latency Metric

Habib Nahas Thu, 12 Dec 2019 03:18:56 -0800

Hi Sean,

Thanks for the KIP.


As I understand it, users are free to set their own timestamp on ProducerRecord. 
What is the recommendation for the proposed metric in a scenario where the user 
sets this timestamp in timezone A and consumes the record in timezone B? It's 
not clear to me if a custom implementation of LatencyTime will help here.
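
To make the scenario concrete, here is a minimal sketch. It assumes the
custom timestamp is an epoch-millisecond value (which is what the
ProducerRecord timestamp is documented to be) and that latency is simply
"consumer wall clock minus record timestamp". The class and helper names
(LatencySketch, latencyMillis, epochMillis) are hypothetical and not part of
the KIP, and clock skew between hosts is ignored.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

public class LatencySketch {
    // Hypothetical helper: latency as consumer "now" minus the record
    // timestamp, both expressed in milliseconds since the epoch.
    public static long latencyMillis(long recordTimestampMs, long nowMs) {
        return nowMs - recordTimestampMs;
    }

    // Converts a local wall-clock time in the given zone to epoch millis.
    public static long epochMillis(LocalDateTime local, String zone) {
        return local.atZone(ZoneId.of(zone)).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        // Producer stamps the record at 09:00:00 local time in New York.
        long produced = epochMillis(
                LocalDateTime.of(2019, 12, 12, 9, 0, 0), "America/New_York");
        // Consumer reads it at 15:00:05 local time in Berlin, same day.
        long consumed = epochMillis(
                LocalDateTime.of(2019, 12, 12, 15, 0, 5), "Europe/Berlin");
        // Both values are offsets from the same epoch, so the local zones
        // drop out of the subtraction.
        System.out.println(latencyMillis(produced, consumed)); // prints 5000
    }
}
```

Under that assumption the zones cancel out. The case I am unsure about is a
producer that writes something other than epoch milliseconds into the
timestamp field.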

Thanks,
Habib

On Wed, Dec 11, 2019, at 4:52 PM, Sean Glover wrote:
> Hello again,
> 
> There has been some interest in this KIP recently. I'm bumping the thread
> to encourage feedback on the design.
> 
> Regards,
> Sean
> 
> On Mon, Jul 15, 2019 at 9:01 AM Sean Glover wrote:
> 
> > To hopefully spark some discussion I've copied the motivation section from
> > the KIP:
> >
> > Consumer lag is a useful metric to monitor how many records are queued to
> > be processed. We can look at individual lag per partition or we may
> > aggregate metrics. For example, we may want to monitor the maximum lag
> > of any partition in our consumer subscription so we can identify hot
> > partitions caused by an inadequate producer partitioning strategy.
> > We may want to monitor a sum of lag across all partitions so we have a
> > sense as to our total backlog of messages to consume. Lag in offsets is
> > useful when you have a good understanding of your messages and processing
> > characteristics, but it doesn’t tell us how far behind *in time* we are.
> > This is known as wait time in queueing theory, or more informally it’s
> > referred to as latency.
> >
> > The latency of a message can be defined as the difference between when
> > that message was first produced and when the message is received by a
> > consumer. The latency of records in a partition correlates with lag, but a
> > larger lag doesn’t necessarily mean a larger latency. For example, a topic
> > consumed by two separate application consumer groups A and B may have
> > similar lag, but different latency per partition. Application A is a
> > consumer which performs CPU intensive business logic on each message it
> > receives. It’s distributed across many consumer group members to handle the
> > load quickly enough, but since its processing time is slower, it takes
> > longer to process each message per partition. Meanwhile, Application B is
> > a consumer which performs a simple ETL operation to land streaming data in
> > another system, such as HDFS. It may have similar lag to Application A, but
> > because it has a faster processing time its latency per partition is
> > significantly less.
> >
> > If the Kafka Consumer reported a latency metric it would be easier to
> > build Service Level Agreements (SLAs) based on non-functional requirements
> > of the streaming system. For example, the system must never have a latency
> > greater than 10 minutes. This SLA could be used in monitoring alerts or
> > as input to automatic scaling solutions.
> >
> > On Thu, Jul 11, 2019 at 12:36 PM Sean Glover wrote:
> >
> >> Hi kafka-dev,
> >>
> >> I've created KIP-489 as a proposal for adding latency metrics to the
> >> Kafka Consumer in a similar way as record-lag metrics are implemented.
> >>
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/489%3A+Kafka+Consumer+Record+Latency+Metric
> >>
> >> Regards,
> >> Sean
> >>
> >> --
> >> Principal Engineer, Lightbend, Inc.
> >>
> >> <http://lightbend.com>
> >>
> >> @seg1o <https://twitter.com/seg1o>, in/seanaglover
> >> <https://www.linkedin.com/in/seanaglover/>
> >>
> >
>
