Re: Kafka streams application unaware of connection to broker lost

Guozhang Wang Fri, 22 Feb 2019 13:47:59 -0800

Javier,

Got it. The proposal from SO should work; the drawback is that you need one
more full-fledged consumer instance to do it.
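
For illustration, here is a minimal sketch of how that periodic probe could feed an HTTP healthcheck. It is not code from SO or from Kafka: the class name `BrokerProbeHealth`, the `maxSilenceMs` threshold and the injected `probe` are all assumptions. In a real application the probe would presumably wrap `consumer.listTopics(Duration.ofSeconds(5))` on the extra consumer, which, as described in this thread, raises an exception when the cluster cannot be reached.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Remembers when a broker probe last succeeded (hypothetical helper).
 * The probe is injected so the sketch has no Kafka dependency; an assumed
 * real probe would be: () -> consumer.listTopics(Duration.ofSeconds(5)).
 */
final class BrokerProbeHealth {
    private final Callable<?> probe;
    private final long maxSilenceMs;
    private final AtomicLong lastSuccessMs;

    BrokerProbeHealth(Callable<?> probe, long maxSilenceMs, long nowMs) {
        this.probe = probe;
        this.maxSilenceMs = maxSilenceMs;
        this.lastSuccessMs = new AtomicLong(nowMs);
    }

    /** Runs the probe once; call it from a single scheduler thread, e.g. every minute. */
    void runProbe(long nowMs) {
        try {
            probe.call();
            lastSuccessMs.set(nowMs);
        } catch (Exception e) {
            // Probe failed: keep the previous timestamp so the silence keeps growing.
        }
    }

    /** Healthy while the last successful probe is no older than maxSilenceMs. */
    boolean isHealthy(long nowMs) {
        return nowMs - lastSuccessMs.get() <= maxSilenceMs;
    }
}
```

The HTTP healthcheck handler would then just return the result of `isHealthy(System.currentTimeMillis())`. Reusing one consumer for the probe avoids reconnecting every minute, but since a consumer is not safe for multi-threaded access, the probe should always run on the same thread.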


If you'd like to go a bit deeper, you can turn on DEBUG-level logging for
the `o.a.k.clients.NetworkClient` class, which prints the following upon
node disconnects:

```
Node {} disconnected.
```

Then you can have a very simple grep program that looks for this line and
fires healthcheck actions whenever the count exceeds a limit within a
sliding window, e.g.
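
One possible shape for such a program is sketched below. It is only an illustration under stated assumptions: the class name `DisconnectWindow`, the window length and the limit are made up, and the regular expression simply mirrors the `Node {} disconnected.` message shown above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Pattern;

/** Counts "Node ... disconnected." log lines inside a sliding time window (hypothetical helper). */
final class DisconnectWindow {
    // Mirrors the NetworkClient DEBUG message, e.g. "Node 1 disconnected."
    private static final Pattern DISCONNECT = Pattern.compile("Node \\S+ disconnected\\.");

    private final Deque<Long> hits = new ArrayDeque<>();
    private final long windowMs;
    private final int limit;

    DisconnectWindow(long windowMs, int limit) {
        this.windowMs = windowMs;
        this.limit = limit;
    }

    /** Feeds one log line; returns true when more than `limit` disconnects fall inside the window. */
    boolean onLogLine(String line, long nowMs) {
        // Drop timestamps that have slid out of the window.
        while (!hits.isEmpty() && nowMs - hits.peekFirst() >= windowMs) {
            hits.removeFirst();
        }
        if (DISCONNECT.matcher(line).find()) {
            hits.addLast(nowMs);
        }
        return hits.size() > limit;
    }
}
```

Each line read from the log (or from a custom log appender) would be passed to `onLogLine` together with the current time, and the healthcheck would be marked unhealthy whenever it returns true.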


Guozhang


On Thu, Feb 21, 2019 at 11:23 PM Javier Arias Losada <
javier.ari...@gmail.com> wrote:

> Thank you for your responses!
>
> Guozhang, what you propose seems like a very good way to monitor the health
> of consumers externally: with this combination of metrics (offset advance +
> bytes-in/out), one can deduce when a consumer is not working.
>
> What we are trying to accomplish is to detect this very same situation, but
> from inside the consumer process. The reason is our consumer is running as
> a container task in AWS-ECS; and we have an HTTP healthcheck in the process
> so that whenever the process returns 'unhealthy', the cluster scheduler
> stops that instance.
>
> So our idea is to find the best way to realize from inside the consumer
> that we lost connection to the broker so that we can mark the instance as
> unhealthy.
>
> We found a way to do it on Stack Overflow: keep a consumer and periodically
> call listTopics(timeout); whenever the connection to the cluster is lost,
> the call raises an exception. What do you think? Are there any drawbacks
> to this approach other than the one extra consumer? Is it better to reuse
> the same consumer or to create a new one every time? The check would run
> about every minute, which is the healthcheck period in our cluster.
>
> Again, thanks.
>
>
>
> On Wed, Feb 20, 2019 at 18:54, Guozhang Wang (<wangg...@gmail.com>)
> wrote:
>
> > Hello Javier,
> >
> > Matthias is right, it is a known issue, not only in Streams but also in
> > the underlying producer / consumer clients.
> >
> > For your own healthcheck monitoring, I'd suggest considering the following
> > alternatives:
> >
> > 1) Monitor the consumer offsets, and alert when they have not changed for
> > a long time.
> >
> > 2) Obviously, not every occurrence of 1) above is caused by a lost
> > connection, so in addition you can also monitor the embedded consumer /
> > producer's bytes-in / bytes-out rates, and alert when they drop to zero
> > for some time.
> >
> > When both 1) and 2) happen together, it usually indicates a lost
> > connection.
> >
> >
> > Guozhang
> >
> >
> > On Wed, Feb 20, 2019 at 9:39 AM Matthias J. Sax <matth...@confluent.io>
> > wrote:
> >
> > > It's a known issue: https://issues.apache.org/jira/browse/KAFKA-6520
> > >
> > >
> > > On 2/20/19 3:25 AM, Javier Arias Losada wrote:
> > > > Hello Kafka users,
> > > >
> > > > > We are working on a stateless Kafka Streams application and want to
> > > > > implement some healthchecks so that whenever the connection to Kafka
> > > > > is lost for longer than a threshold, we mark the instance as
> > > > > unhealthy, and our cluster manager (could be K8S or AWS-ECS) kills
> > > > > that instance and starts a new one.
> > > >
> > > > > We have noticed that when the consumer is running and the connection
> > > > > is lost, it tries to reconnect and writes some logs, but we didn't
> > > > > find a way to programmatically check or subscribe to the connection
> > > > > status.
> > > >
> > > > Am I missing something?
> > > > Is this an intended feature? Why?
> > > > > What are the best practices for health-checking Kafka Streams
> > > > > applications?
> > > >
> > > > > I also found that with a plain Kafka consumer, no exception is raised
> > > > > on lost connectivity. How could we check the connection status? How
> > > > > are other people solving this issue?
> > > >
> > > > Thank you very much.
> > > >
> > >
> > >
> >
> > --
> > -- Guozhang
> >
>


-- 
-- Guozhang
