We are trying to see what might have caused it.

We have a few questions:
1) Is this reproducible? If so, we can dig deeper.


This looks like an interesting problem to solve, and you might have caught a
bug, but we need to verify the root cause before filing a ticket.
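
For reference, the out-of-range handling described in my reply quoted below
boils down to roughly the following. This is only a sketch of the logic; the
actual broker code lives in kafka.server.ReplicaFetcherThread (in Scala), and
the method and parameter names here are illustrative, not the real API:

    // Sketch only: how the follower picks a new fetch offset once the leader
    // reports OffsetOutOfRange. Not the actual broker implementation.
    long resetFetchOffset(long replicaLogEndOffset,
                          long leaderLogEndOffset,
                          long leaderLogStartOffset) {
        if (replicaLogEndOffset > leaderLogEndOffset) {
            // Case 1: the replica is ahead of the leader, so truncate and
            // restart fetching from the leader's log end offset.
            return leaderLogEndOffset;
        } else {
            // Case 2: the replica is behind and its offset no longer exists on
            // the leader (e.g. the segment was deleted), so restart from the
            // leader's log start offset.
            return leaderLogStartOffset;
        }
    }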

Thanks,

Mayuresh

On Tue, Mar 17, 2015 at 2:10 PM, Zakee <kzak...@netzero.net> wrote:

> > What version are you running ?
>
> Version 0.8.2.0
>
> > Your case is 2). But the only weird thing is that your replica (broker 3)
> > is requesting an offset that is greater than the leader's log end offset.
>
>
> So what could be the cause?
>
> Thanks
> Zakee
>
>
>
> > On Mar 17, 2015, at 11:45 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
> >
> > What version are you running ?
> >
> > The code for the latest version says that:
> >
> > 1) If the log end offset of the replica is greater than the leader's log
> > end offset, the replica's offset will be reset to the logEndOffset of the
> > leader.
> >
> > 2) Otherwise, if the log end offset of the replica is smaller than the
> > leader's log end offset and it is out of range, the replica's offset will
> > be reset to the logStartOffset of the leader.
> >
> > Your case is 2). But the only weird thing is that your replica (broker 3)
> > is requesting an offset that is greater than the leader's log end offset.
> >
> > Thanks,
> >
> > Mayuresh
> >
> >
> > On Tue, Mar 17, 2015 at 10:26 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com> wrote:
> >
> >> cool.
> >>
> >> On Tue, Mar 17, 2015 at 10:15 AM, Zakee <kzak...@netzero.net> wrote:
> >>
> >>> Hi Mayuresh,
> >>>
> >>> The logs are already attached; they are in reverse order, starting from
> >>> [2015-03-14 07:46:52,517] and going back to the time when the brokers
> >>> were started.
> >>>
> >>> Thanks
> >>> Zakee
> >>>
> >>>
> >>>
> >>>> On Mar 17, 2015, at 12:07 AM, Mayuresh Gharat <
> >>> gharatmayures...@gmail.com> wrote:
> >>>>
> >>>> Hi Zakee,
> >>>>
> >>>> Thanks for the logs. Can you paste earlier logs from broker-3 up to:
> >>>>
> >>>> [2015-03-14 07:46:52,517] ERROR [ReplicaFetcherThread-2-4], Current
> >>>> offset 1754769769 for partition [Topic22kv,5] out of range; reset
> >>>> offset to 1400864851 (kafka.server.ReplicaFetcherThread)
> >>>>
> >>>> That would help us figure out what was happening on this broker before it
> >>>> issued a replicaFetch request to broker-4.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Mayuresh
> >>>>
> >>>> On Mon, Mar 16, 2015 at 11:32 PM, Zakee <kzak...@netzero.net> wrote:
> >>>>
> >>>>> Hi Mayuresh,
> >>>>>
> >>>>> Here are the logs.
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>> Kazim Zakee
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Mar 16, 2015, at 10:48 AM, Mayuresh Gharat <
> >>>>> gharatmayures...@gmail.com> wrote:
> >>>>>>
> >>>>>> Can you provide more (complete) logs from Broker 3 up to:
> >>>>>>
> >>>>>> [2015-03-14 07:46:52,517] WARN [ReplicaFetcherThread-2-4], Replica 3
> >>>>>> for partition [Topic22kv,5] reset its fetch offset from 1400864851 to
> >>>>>> current leader 4's start offset 1400864851 (kafka.server.ReplicaFetcherThread)
> >>>>>>
> >>>>>> I would like to see logs from well before it sent the fetch request to
> >>>>>> Broker 4, up to the time above. I want to check whether Broker 3 was the
> >>>>>> leader at any point before Broker 4 took over.
> >>>>>>
> >>>>>> Additional logs will help.
> >>>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Mayuresh
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Sat, Mar 14, 2015 at 8:35 PM, Zakee <kzak...@netzero.net> wrote:
> >>>>>>
> >>>>>>> log.cleanup.policy is delete, not compact.
> >>>>>>> log.cleaner.enable=true
> >>>>>>> log.cleaner.threads=5
> >>>>>>> log.cleanup.policy=delete
> >>>>>>> log.flush.scheduler.interval.ms=3000
> >>>>>>> log.retention.minutes=1440
> >>>>>>> log.segment.bytes=1073741824  (1gb)
> >>>>>>>
> >>>>>>> Messages are keyed but not compressed; the producer is async and uses
> >>>>>>> the Kafka default partitioner.
> >>>>>>> String message = msg.getString();
> >>>>>>> String uniqKey = "" + rnd.nextInt();   // random key
> >>>>>>> String partKey = getPartitionKey();    // partition key
> >>>>>>> KeyedMessage<String, String> data = new KeyedMessage<String, String>(
> >>>>>>>     this.topicName, uniqKey, partKey, message);
> >>>>>>> producer.send(data);
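> >>>>>>>
> >>>>>>> (For reference, this producer setup corresponds roughly to the following
> >>>>>>> with the old 0.8 producer API; the broker list and host names here are
> >>>>>>> placeholders, not the real values.)
> >>>>>>>
> >>>>>>> import java.util.Properties;
> >>>>>>> import kafka.javaapi.producer.Producer;
> >>>>>>> import kafka.producer.KeyedMessage;
> >>>>>>> import kafka.producer.ProducerConfig;
> >>>>>>>
> >>>>>>> Properties props = new Properties();
> >>>>>>> props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // placeholder hosts
> >>>>>>> props.put("serializer.class", "kafka.serializer.StringEncoder"); // String messages
> >>>>>>> props.put("producer.type", "async");                             // async, as noted above
> >>>>>>> // no partitioner.class set, so Kafka's default partitioner is used
> >>>>>>> Producer<String, String> producer =
> >>>>>>>     new Producer<String, String>(new ProducerConfig(props));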
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Zakee
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Mar 14, 2015, at 4:23 PM, gharatmayures...@gmail.com wrote:
> >>>>>>>>
> >>>>>>>> Is your topic log compacted? If it is, are the messages keyed? Or are
> >>>>>>>> the messages compressed?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Mayuresh
> >>>>>>>>
> >>>>>>>> Sent from my iPhone
> >>>>>>>>
> >>>>>>>>> On Mar 14, 2015, at 2:02 PM, Zakee <kzak...@netzero.net> wrote:
> >>>>>>>>>
> >>>>>>>>> Thanks, Jiangjie, for helping resolve the Kafka controller migration
> >>>>>>>>> driven partition leader rebalance issue. The logs are much cleaner now.
> >>>>>>>>>
> >>>>>>>>> There are a few occurrences of out-of-range offsets even though there
> >>>>>>>>> are no consumers running, only producers and replica fetchers. I was
> >>>>>>>>> trying to find the cause; it looks like compaction (log segment
> >>>>>>>>> deletion) is causing this. Not sure whether this is expected behavior.
> >>>>>>>>>
> >>>>>>>>> Broker-4:
> >>>>>>>>> [2015-03-14 07:46:52,338] ERROR [Replica Manager on Broker 4]: Error
> >>>>>>>>> when processing fetch request for partition [Topic22kv,5] offset
> >>>>>>>>> 1754769769 from follower with correlation id 1645671. Possible cause:
> >>>>>>>>> Request for offset 1754769769 but we only have log segments in the
> >>>>>>>>> range 1400864851 to 1754769732. (kafka.server.ReplicaManager)
> >>>>>>>>>
> >>>>>>>>> Broker-3:
> >>>>>>>>> [2015-03-14 07:46:52,356] INFO The cleaning for partition [Topic22kv,5]
> >>>>>>>>> is aborted and paused (kafka.log.LogCleaner)
> >>>>>>>>> [2015-03-14 07:46:52,408] INFO Scheduling log segment 1400864851 for
> >>>>>>>>> log Topic22kv-5 for deletion. (kafka.log.Log)
> >>>>>>>>> …
> >>>>>>>>> [2015-03-14 07:46:52,421] INFO Compaction for partition [Topic22kv,5]
> >>>>>>>>> is resumed (kafka.log.LogCleaner)
> >>>>>>>>> [2015-03-14 07:46:52,517] ERROR [ReplicaFetcherThread-2-4], Current
> >>>>>>>>> offset 1754769769 for partition [Topic22kv,5] out of range; reset
> >>>>>>>>> offset to 1400864851 (kafka.server.ReplicaFetcherThread)
> >>>>>>>>> [2015-03-14 07:46:52,517] WARN [ReplicaFetcherThread-2-4], Replica 3
> >>>>>>>>> for partition [Topic22kv,5] reset its fetch offset from 1400864851 to
> >>>>>>>>> current leader 4's start offset 1400864851 (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>
> >>>>>>>>> <topic22kv_746a_314_logs.txt>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>> Zakee
> >>>>>>>>>
> >>>>>>>>>> On Mar 9, 2015, at 12:18 PM, Zakee <kzak...@netzero.net> wrote:
> >>>>>>>>>>
> >>>>>>>>>> No broker restarts.
> >>>>>>>>>>
> >>>>>>>>>> Created a Kafka issue:
> >>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-2011
> >>>>>>>>>>
> >>>>>>>>>>>> Logs for rebalance:
> >>>>>>>>>>>> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred
> >>>>>>>>>>>> replica election for partitions: (kafka.controller.KafkaController)
> >>>>>>>>>>>> [2015-03-07 16:52:48,969] INFO [Controller 2]: Partitions that
> >>>>>>>>>>>> completed preferred replica election: (kafka.controller.KafkaController)
> >>>>>>>>>>>> …
> >>>>>>>>>>>> [2015-03-07 12:07:06,783] INFO [Controller 4]: Resuming preferred
> >>>>>>>>>>>> replica election for partitions: (kafka.controller.KafkaController)
> >>>>>>>>>>>> ...
> >>>>>>>>>>>> [2015-03-07 09:10:41,850] INFO [Controller 3]: Resuming preferred
> >>>>>>>>>>>> replica election for partitions: (kafka.controller.KafkaController)
> >>>>>>>>>>>> ...
> >>>>>>>>>>>> [2015-03-07 08:26:56,396] INFO [Controller 1]: Starting preferred
> >>>>>>>>>>>> replica leader election for partitions (kafka.controller.KafkaController)
> >>>>>>>>>>>> ...
> >>>>>>>>>>>> [2015-03-06 16:52:59,506] INFO [Controller 2]: Partitions undergoing
> >>>>>>>>>>>> preferred replica election: (kafka.controller.KafkaController)
> >>>>>>>>>>>>
> >>>>>>>>>>>> Also, I still see lots of the errors below (~69k) in the logs since
> >>>>>>>>>>>> the restart. Is there any reason other than the rebalance for these
> >>>>>>>>>>>> errors?
> >>>>>>>>>>>>
> >>>>>>>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error
> >>>>>>>>>>>> for partition [Topic-11,7] to broker 5:class
> >>>>>>>>>>>> kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error
> >>>>>>>>>>>> for partition [Topic-2,25] to broker 5:class
> >>>>>>>>>>>> kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error
> >>>>>>>>>>>> for partition [Topic-2,21] to broker 5:class
> >>>>>>>>>>>> kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error
> >>>>>>>>>>>> for partition [Topic-22,9] to broker 5:class
> >>>>>>>>>>>> kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> Could you paste the related logs in controller.log?
> >>>>>>>>>> What specifically should I search for in the logs?
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Zakee
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Mar 9, 2015, at 11:35 AM, Jiangjie Qin <j...@linkedin.com.INVALID>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Is there anything wrong with the brokers around that time? E.g. a
> >>>>>>>>>>> broker restart?
> >>>>>>>>>>> The logs you pasted are actually from replica fetchers. Could you
> >>>>>>>>>>> paste the related logs from controller.log?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks.
> >>>>>>>>>>>
> >>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>
> >>>>>>>>>>>> On 3/9/15, 10:32 AM, "Zakee" <kzak...@netzero.net> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Correction: Actually, the rebalance kept happening until about 24
> >>>>>>>>>>>> hours after the start, and that's where the errors below were found.
> >>>>>>>>>>>> Ideally the rebalance should not have happened at all.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks
> >>>>>>>>>>>> Zakee
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mar 9, 2015, at 10:28 AM, Zakee <kzak...@netzero.net> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hmm, that sounds like a bug. Can you paste the log of leader
> >>>>>>>>>>>>>> rebalance here?
> >>>>>>>>>>>>> Thanks for your suggestions.
> >>>>>>>>>>>>> It looks like the rebalance actually happened only once, soon after I
> >>>>>>>>>>>>> started with a clean cluster and data was pushed; it didn't happen
> >>>>>>>>>>>>> again so far, and I see the partition leader counts on the brokers
> >>>>>>>>>>>>> have not changed since then. One of the brokers was constantly
> >>>>>>>>>>>>> showing 0 for the partition leader count. Is that normal?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Also, I still see lots of the errors below (~69k) in the logs since
> >>>>>>>>>>>>> the restart. Is there any reason other than the rebalance for these
> >>>>>>>>>>>>> errors?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error
> >>>>>>>>>>>>> for partition [Topic-11,7] to broker 5:class
> >>>>>>>>>>>>> kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error
> >>>>>>>>>>>>> for partition [Topic-2,25] to broker 5:class
> >>>>>>>>>>>>> kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error
> >>>>>>>>>>>>> for partition [Topic-2,21] to broker 5:class
> >>>>>>>>>>>>> kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error
> >>>>>>>>>>>>> for partition [Topic-22,9] to broker 5:class
> >>>>>>>>>>>>> kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Some other things to check are:
> >>>>>>>>>>>>>> 1. The actual property name is auto.leader.rebalance.enable, not
> >>>>>>>>>>>>>> auto.leader.rebalance. You’ve probably known this, just to double
> >>>>>>>>>>>>>> confirm.
> >>>>>>>>>>>>> Yes
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2. In zookeeper path, can you verify /admin/preferred_replica_election
> >>>>>>>>>>>>>> does not exist?
> >>>>>>>>>>>>> ls /admin
> >>>>>>>>>>>>> [delete_topics]
> >>>>>>>>>>>>> ls /admin/preferred_replica_election
> >>>>>>>>>>>>> Node does not exist: /admin/preferred_replica_election
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>> Zakee
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mar 7, 2015, at 10:49 PM, Jiangjie Qin <j...@linkedin.com.INVALID>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hmm, that sounds like a bug. Can you paste the log of leader
> >>>>>>>>>>>>>> rebalance here?
> >>>>>>>>>>>>>> Some other things to check are:
> >>>>>>>>>>>>>> 1. The actual property name is auto.leader.rebalance.enable, not
> >>>>>>>>>>>>>> auto.leader.rebalance. You’ve probably known this, just to double
> >>>>>>>>>>>>>> confirm.
> >>>>>>>>>>>>>> 2. In zookeeper path, can you verify /admin/preferred_replica_election
> >>>>>>>>>>>>>> does not exist?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 3/7/15, 10:24 PM, "Zakee" <kzak...@netzero.net> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I started with a clean cluster and started to push data. It still
> >>>>>>>>>>>>>>> does the rebalance at random intervals even though
> >>>>>>>>>>>>>>> auto.leader.rebalance is set to false.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>>> Zakee
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Mar 6, 2015, at 3:51 PM, Jiangjie Qin <j...@linkedin.com.INVALID>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Yes, the rebalance should not happen in that case. That is a little
> >>>>>>>>>>>>>>>> bit strange. Could you try to launch a clean Kafka cluster with
> >>>>>>>>>>>>>>>> auto.leader.election disabled and try pushing data?
> >>>>>>>>>>>>>>>> When leader migration occurs, a NotLeaderForPartition exception is
> >>>>>>>>>>>>>>>> expected.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On 3/6/15, 3:14 PM, "Zakee" <kzak...@netzero.net> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Yes, Jiangjie, I do see lots of these "Starting preferred replica
> >>>>>>>>>>>>>>>>> leader election for partitions" messages in the logs. I also see a
> >>>>>>>>>>>>>>>>> lot of produce request failure warnings with the NotLeader exception.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I tried switching auto.leader.rebalance off (set it to false). I am
> >>>>>>>>>>>>>>>>> still noticing the rebalance happening. My understanding was that
> >>>>>>>>>>>>>>>>> the rebalance would not happen when this is set to false.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks
> >>>>>>>>>>>>>>>>> Zakee
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Feb 25, 2015, at 5:17 PM, Jiangjie Qin
> >>>>>>>>>>>>>>>>>> <j...@linkedin.com.INVALID> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I don’t think num.replica.fetchers will help in this case.
> >>>>>>>>>>>>>>>>>> Increasing the number of fetcher threads will only help in cases
> >>>>>>>>>>>>>>>>>> where you have a large amount of data coming into a broker and more
> >>>>>>>>>>>>>>>>>> replica fetcher threads will help keep up. We usually only use 1-2
> >>>>>>>>>>>>>>>>>> for each broker. But in your case, it looks like leader migration
> >>>>>>>>>>>>>>>>>> caused the issue.
> >>>>>>>>>>>>>>>>>> Do you see anything else in the log? Like a preferred leader
> >>>>>>>>>>>>>>>>>> election?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 2/25/15, 5:02 PM, "Zakee" <kzak...@netzero.net> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks, Jiangjie.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Yes, I do see under-replicated partitions, usually spiking every
> >>>>>>>>>>>>>>>>>>> hour. Anything I could try to reduce that?
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> How does "num.replica.fetchers" affect the replica sync? Currently
> >>>>>>>>>>>>>>>>>>> I have it configured as 7 on each of the 5 brokers.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -Zakee
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Wed, Feb 25, 2015 at 4:17 PM, Jiangjie Qin
> >>>>>>>>>>>>>>>>>>> <j...@linkedin.com.invalid> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> These messages are usually caused by leader migration. I think as
> >>>>>>>>>>>>>>>>>>>> long as you don't see this lasting forever along with a bunch of
> >>>>>>>>>>>>>>>>>>>> under-replicated partitions, it should be fine.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On 2/25/15, 4:07 PM, "Zakee" <kzak...@netzero.net> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Need to know whether I should be worried about this or ignore
> >>>>>>>>>>>>>>>>>>>>> them.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I see tons of these exceptions/warnings in the broker logs; not
> >>>>>>>>>>>>>>>>>>>>> sure what causes them and what could be done to fix them.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> ERROR [ReplicaFetcherThread-3-5], Error for partition [TestTopic]
> >>>>>>>>>>>>>>>>>>>>> to broker 5:class kafka.common.NotLeaderForPartitionException
> >>>>>>>>>>>>>>>>>>>>> (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>>>>>>>>>>> [2015-02-25 11:01:41,785] ERROR [ReplicaFetcherThread-3-5], Error
> >>>>>>>>>>>>>>>>>>>>> for partition [TestTopic] to broker 5:class
> >>>>>>>>>>>>>>>>>>>>> kafka.common.NotLeaderForPartitionException
> >>>>>>>>>>>>>>>>>>>>> (kafka.server.ReplicaFetcherThread)
> >>>>>>>>>>>>>>>>>>>>> [2015-02-25 11:01:41,785] WARN [Replica Manager on Broker 2]: Fetch
> >>>>>>>>>>>>>>>>>>>>> request with correlation id 950084 from client ReplicaFetcherThread-1-2
> >>>>>>>>>>>>>>>>>>>>> on partition [TestTopic,2] failed due to Leader not local for
> >>>>>>>>>>>>>>>>>>>>> partition [TestTopic,2] on broker 2 (kafka.server.ReplicaManager)
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Any ideas?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> -Zakee
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thanks
> >>>>>>>>>> Zakee
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> -Regards,
> >>>>>> Mayuresh R. Gharat
> >>>>>> (862) 250-7125
> >>>>>
> >>>>> Thanks
> >>>>> Zakee
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> -Regards,
> >>>> Mayuresh R. Gharat
> >>>> (862) 250-7125
> >>>
> >>>
> >>
> >>
> >> --
> >> -Regards,
> >> Mayuresh R. Gharat
> >> (862) 250-7125
> >>
> >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125
