We observed the exact same error. We are not sure of the root cause, although
it appears to be related to the leastLoadedNode implementation. Interestingly,
the problem went away after increasing reconnect.backoff.ms to 1000ms.
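
For reference, a minimal sketch of how we applied that workaround on the new
Java producer; the broker list, topic name, and serializers below are
placeholders for our actual settings:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class ReconnectBackoffWorkaround {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder broker list; one of these nodes was the dead one in our case.
            props.put("bootstrap.servers", "k1:9092,k2:9092,k3:9092");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            // Workaround: wait longer between reconnect attempts to an unreachable node.
            // With the default backoff the client kept retrying the dead node; at 1000ms
            // it eventually picked a live node for the metadata refresh.
            props.put("reconnect.backoff.ms", "1000");

            KafkaProducer<String, String> producer = new KafkaProducer<>(props);
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
            producer.close();
        }
    }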
On 29 Apr 2015 00:32, "Ewen Cheslack-Postava" <e...@confluent.io> wrote:

> Ok, all of that makes sense. The only way to recover from that state is
> either for K2 to come back up, allowing the metadata refresh to eventually
> succeed, or for the client to eventually try some other node in the cluster.
> Reusing the bootstrap nodes is one possibility. Another would be for the
> client to fetch more metadata than it strictly needs for its topics, so that
> it has more nodes available as options when looking for a node to fetch
> metadata from. I added your description to KAFKA-1843, although it might
> also make sense as a separate bug, since fixing it could be considered
> incremental progress towards resolving KAFKA-1843.
>
> On Tue, Apr 28, 2015 at 9:18 AM, Manikumar Reddy <ku...@nmsworks.co.in>
> wrote:
>
> > Hi Ewen,
> >
> >  Thanks for the response. I agree with you; in some cases the client
> > should use the bootstrap servers.
> >
> >
> > >
> > > If you have logs at debug level, are you seeing this message in between
> > the
> > > connection attempts:
> > >
> > > Give up sending metadata request since no node is available
> > >
> >
> >  Yes, this log message appeared a couple of times.
> >
> >
> > >
> > > Also, if you let it continue running, does it recover after the
> > > metadata.max.age.ms timeout?
> > >
> >
> >  It does not recover. It keeps trying to connect to the dead
> > node.
> >
> >
> > -Manikumar
> >
>
>
>
> --
> Thanks,
> Ewen
>
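
To make the "reusing the bootstrap nodes" suggestion quoted above concrete,
here is a rough sketch of how a client could pick a node for metadata
requests. This is not the actual NetworkClient/leastLoadedNode code; the
class and method names below are made up purely for illustration:

    import java.util.List;

    // Illustrative only: a simplified node selector that prefers known cluster
    // nodes but falls back to the original bootstrap list when every known node
    // is currently unreachable (e.g. the only node we learned about is down).
    public class MetadataNodeSelector {

        // Hypothetical view of a broker node; not a Kafka client type.
        public interface NodeInfo {
            String host();
            int port();
            boolean isConnectable();   // not blacked out / not in reconnect backoff
            int inFlightRequests();    // pending requests on this node
        }

        private final List<NodeInfo> bootstrapNodes;

        public MetadataNodeSelector(List<NodeInfo> bootstrapNodes) {
            this.bootstrapNodes = bootstrapNodes;
        }

        // Prefer the least-loaded connectable node from the current metadata.
        // If none is usable, fall back to the bootstrap list instead of
        // retrying the same dead node forever.
        public NodeInfo selectNodeForMetadata(List<NodeInfo> knownNodes) {
            NodeInfo best = leastLoaded(knownNodes);
            if (best != null) {
                return best;
            }
            return leastLoaded(bootstrapNodes);
        }

        private NodeInfo leastLoaded(List<NodeInfo> nodes) {
            NodeInfo best = null;
            for (NodeInfo node : nodes) {
                if (!node.isConnectable()) {
                    continue;
                }
                if (best == null || node.inFlightRequests() < best.inFlightRequests()) {
                    best = node;
                }
            }
            return best;
        }
    }

The actual fix tracked in KAFKA-1843 would of course live inside the client
itself; this only illustrates the fallback ordering being discussed.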
