Hi Ryanne,

These are not "arguments for observability in general" but descriptions of
specific issues that come up due to Kafka's lack of support for collecting
client metrics. Magnus pointed out that configuring client metrics usually
involves setting up a separate metrics collection infrastructure. Even if this
is easy and straightforward to do (which is not the case for most
organizations), it still requires reconfiguring and restarting the application,
which is disruptive. Correlating client metrics with server metrics is also
often hard. These issues are all mitigated by centralizing metrics collection
on the broker.

best,
Colin


On Wed, Jun 16, 2021, at 19:03, Ryanne Dolan wrote:
> Magnus, I think these are arguments for observability in general, but not
> why Kafka should sit between a client and a metrics collector.
> 
> Ryanne
> 
> On Wed, Jun 16, 2021, 10:27 AM Magnus Edenhill <mag...@edenhill.se> wrote:
> 
> > Hi Ryanne,
> >
> > this proposal stems from a need to improve troubleshooting Kafka issues.
> >
> > As it currently stands, when an application team is experiencing Kafka
> > service degradation, or the Kafka operator is seeing misbehaving clients,
> > there are plenty of steps that need to be taken before any client-side
> > metrics can be observed, if at all:
> >  - Is the application even collecting client metrics? If not, it needs to
> >    be reconfigured or implemented, and restarted; a restart may have
> >    business impact, and may also (temporarily?) remedy the problem without
> >    giving any further insight into what was wrong.
> >  - Are the desired metrics collected? Where are they stored? For how long?
> >    Is there enough correlating information to map them to cluster-side
> >    metrics and events? Does the application on-call know how to find the
> >    collected metrics?
> >  - Export and send these metrics to whoever knows how to interpret them. In
> >    what format? Are all relevant metadata fields provided?
> >
> > The KIP aims to remove all these obstacles by giving the Kafka operator the
> > tools to collect this information.
> >
> > Regards,
> > Magnus
> >
> >
> > On Tue, Jun 15, 2021 at 02:37, Ryanne Dolan <ryannedo...@gmail.com> wrote:
> >
> > > Magnus, I think such a substantial change requires more motivation than is
> > > currently provided. As I read it, the motivation boils down to this: you
> > > want your clients to phone home unless they opt out. As stated in the KIP,
> > > "there are plenty of existing solutions [...] to send metrics [...] to a
> > > collector", so the opt-out appears to be the only motivation. Am I missing
> > > something?
> > >
> > > Ryanne
> > >
> > > On Wed, Jun 2, 2021 at 7:46 AM Magnus Edenhill <mag...@edenhill.se> wrote:
> > >
> > > > Hey all,
> > > >
> > > > I'm proposing KIP-714 to add remote Client metrics and observability.
> > > > This functionality will allow centralized monitoring and troubleshooting
> > > > of clients and their internals.
> > > >
> > > > Please see
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability
> > > >
> > > > Looking forward to your feedback!
> > > >
> > > > Regards,
> > > > Magnus
> > > >
> > >
> >
> 
