Hello guys,

Below is the link to the Kafka logs, captured with TRACE logging enabled.

https://drive.google.com/file/d/0B-nANlrsm5ogQkh1NUR2UHYtbkU/view?usp=sharing

I have truncated the log because it was very large, but it still covers the
whole period of the problem.

Scenario:
1) Three Kafka brokers were running: kafka1 (probably the controller),
kafka2 and kafka3. A Go Sarama producer was producing to the cluster.

2) kafka3 is killed. The log timestamp at that point is 00:47:26.

3) At timestamp 00:47:34, new leaders are chosen for the partitions that
went down.

4) However, before 00:48:58, whenever the client sends a metadata fetch
request, kafka1 responds with stale metadata, even though internally it
uses the correct metadata.
At 00:48:58, Kafka receives some trigger, after which it starts returning
correct metadata to the client.


Kindly go through the log and let me know if I am missing anything.



On Fri, Jun 3, 2016 at 6:36 AM, Christian <engr...@gmail.com> wrote:

> Hi Gerard,
>
> When trying to reproduce this, did you use the go sarama client Safique
> mentioned?
>
>
> On Fri, Jun 3, 2016 at 5:10 AM, Gerard Klijs <gerard.kl...@dizzit.com>
> wrote:
>
> > I assume you use a replication factor of 3 for the topics? When I ran some
> > tests with producers/consumers in a dockerized setup, there were only a few
> > failures before the producer switched to the correct new broker again. I
> > don't know the exact time, but it seemed like a few seconds at most; this
> > was with 0.9.0.0.
> >
> > On Fri, Jun 3, 2016 at 9:00 AM safique ahemad <saf.jnu...@gmail.com>
> > wrote:
> >
> > > Hi Steve,
> > >
> > > There is no way to access that from the public side, so I won't be able
> > > to do that. Sorry for that.
> > > But the steps are quite simple. The only difference is that we have
> > > deployed the Kafka cluster using a Mesos URL.
> > >
> > > 1) Launch a 3-broker Kafka cluster and create a topic with multiple
> > > partitions (at least 3) so that at least one partition lands on each
> > > broker.
> > > 2) Launch the consumer/producer client.
> > > 3) Kill a broker.
> > > 4) Observe the behavior of the producer client.
> > >
> > >
> > >
> > > On Thu, Jun 2, 2016 at 8:15 PM, Steve Tian <steve.cs.t...@gmail.com>
> > > wrote:
> > >
> > > > I see. I'm not sure if this is a known issue. Do you mind sharing the
> > > > brokers/topics setup and the steps to reproduce this issue?
> > > >
> > > > Cheers, Steve
> > > >
> > > > On Fri, Jun 3, 2016, 9:45 AM safique ahemad <saf.jnu...@gmail.com>
> > > > wrote:
> > > >
> > > > > you got it right...
> > > > >
> > > > > But DialTimeout is not the concern here. The client tries fetching
> > > > > metadata from the Kafka brokers, but Kafka gives it stale metadata
> > > > > for nearly 30-40 seconds.
> > > > > It tries to fetch 3-4 times in that interval until it gets updated
> > > > > metadata.
> > > > > This is a completely different problem from
> > > > > https://github.com/Shopify/sarama/issues/661
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jun 2, 2016 at 6:05 PM, Steve Tian <steve.cs.t...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > So you are coming from
> > > > > > https://github.com/Shopify/sarama/issues/661, right? I'm not sure
> > > > > > if anything on the broker side can help, but it looks like you
> > > > > > already found that DialTimeout on the client side can help?
> > > > > >
> > > > > > Cheers, Steve
> > > > > >
> > > > > > On Fri, Jun 3, 2016, 8:33 AM safique ahemad <saf.jnu...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > kafka version:0.9.0.0
> > > > > > > go sarama client version: 1.8
> > > > > > >
> > > > > > > On Thu, Jun 2, 2016 at 5:14 PM, Steve Tian <steve.cs.t...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Client version?
> > > > > > > >
> > > > > > > > On Fri, Jun 3, 2016, 4:44 AM safique ahemad <saf.jnu...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > We are using a Kafka broker cluster in our data center.
> > > > > > > > > Recently, we noticed that when a Kafka broker goes down, the
> > > > > > > > > client tries to refresh the metadata but gets stale metadata
> > > > > > > > > for up to nearly 30 seconds.
> > > > > > > > >
> > > > > > > > > After about 30-35 seconds, the client obtains updated
> > > > > > > > > metadata. This is a really long time, because the client
> > > > > > > > > continuously gets send failures for that long.
> > > > > > > > >
> > > > > > > > > Kindly reply if any configuration may help here, or if
> > > > > > > > > something else is required.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Safique Ahemad
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Regards,
> > > > > > > Safique Ahemad
> > > > > > > GlobalLogic | Leaders in software R&D services
> > > > > > > P :+91 120 4342000-2990 | M:+91 9953533367
> > > > > > > www.globallogic.com
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Regards,
> > > > > Safique Ahemad
> > > > > GlobalLogic | Leaders in software R&D services
> > > > > P :+91 120 4342000-2990 | M:+91 9953533367
> > > > > www.globallogic.com
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Regards,
> > > Safique Ahemad
> > > GlobalLogic | Leaders in software R&D services
> > > P :+91 120 4342000-2990 | M:+91 9953533367
> > > www.globallogic.com
> > >
> >
>



-- 

Regards,
Safique Ahemad
GlobalLogic | Leaders in software R&D services
P :+91 120 4342000-2990 | M:+91 9953533367
www.globallogic.com
