> Hmm, that sounds like a bug. Can you paste the log of leader rebalance
> here?

Thanks for your suggestions. It looks like the rebalance actually happened only once, soon after I started with a clean cluster and data was pushed. It hasn't happened again so far, and I see the partition leader counts on the brokers have not changed since then. One of the brokers has constantly shown 0 for its partition leader count. Is that normal?
Also, I still see a lot of the errors below (~69k) in the logs since the restart. Is there any reason other than a rebalance for these errors?

[2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-11,7] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
[2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-2,25] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
[2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-2,21] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
[2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-22,9] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)

> Some other things to check are:
> 1. The actual property name is auto.leader.rebalance.enable, not
> auto.leader.rebalance. You’ve probably known this, just to double confirm.

Yes.

> 2. In zookeeper path, can you verify /admin/preferred_replica_election
> does not exist?

ls /admin
[delete_topics]
ls /admin/preferred_replica_election
Node does not exist: /admin/preferred_replica_election

Thanks
Zakee

> On Mar 7, 2015, at 10:49 PM, Jiangjie Qin <j...@linkedin.com.INVALID> wrote:
>
> Hmm, that sounds like a bug. Can you paste the log of leader rebalance
> here?
> Some other things to check are:
> 1. The actual property name is auto.leader.rebalance.enable, not
> auto.leader.rebalance. You’ve probably known this, just to double confirm.
> 2. In zookeeper path, can you verify /admin/preferred_replica_election
> does not exist?
>
> Jiangjie (Becket) Qin
>
> On 3/7/15, 10:24 PM, "Zakee" <kzak...@netzero.net> wrote:
>
>> I started with a clean cluster and started to push data.
>> It still does the rebalance at random durations even though
>> auto.leader.rebalance.enable is set to false.
>>
>> Thanks
>> Zakee
>>
>>> On Mar 6, 2015, at 3:51 PM, Jiangjie Qin <j...@linkedin.com.INVALID>
>>> wrote:
>>>
>>> Yes, the rebalance should not happen in that case. That is a little bit
>>> strange. Could you try to launch a clean Kafka cluster with
>>> auto.leader.rebalance.enable disabled and try to push data?
>>> When leader migration occurs, a NotLeaderForPartition exception is
>>> expected.
>>>
>>> Jiangjie (Becket) Qin
>>>
>>> On 3/6/15, 3:14 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>
>>>> Yes, Jiangjie, I do see lots of these messages ("Starting preferred
>>>> replica leader election for partitions") in the logs. I also see a lot
>>>> of Produce request failure warnings with the NotLeader exception.
>>>>
>>>> I tried switching auto.leader.rebalance.enable to false. I am still
>>>> noticing the rebalance happening. My understanding was that the
>>>> rebalance would not happen when this is set to false.
>>>>
>>>> Thanks
>>>> Zakee
>>>>
>>>>> On Feb 25, 2015, at 5:17 PM, Jiangjie Qin <j...@linkedin.com.INVALID>
>>>>> wrote:
>>>>>
>>>>> I don't think num.replica.fetchers will help in this case. Increasing
>>>>> the number of fetcher threads only helps in cases where you have a
>>>>> large amount of data coming into a broker and more replica fetcher
>>>>> threads help keep up. We usually only use 1-2 for each broker. But in
>>>>> your case, it looks like leader migration is causing the issue.
>>>>> Do you see anything else in the log? Like preferred leader election?
>>>>>
>>>>> Jiangjie (Becket) Qin
>>>>>
>>>>> On 2/25/15, 5:02 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>
>>>>>> Thanks, Jiangjie.
>>>>>>
>>>>>> Yes, I do see under-replicated partitions, usually shooting up every
>>>>>> hour. Anything I could try to reduce it?
>>>>>>
>>>>>> How does "num.replica.fetchers" affect the replica sync?
>>>>>> Currently I have configured 7 on each of 5 brokers.
>>>>>>
>>>>>> -Zakee
>>>>>>
>>>>>> On Wed, Feb 25, 2015 at 4:17 PM, Jiangjie Qin
>>>>>> <j...@linkedin.com.invalid> wrote:
>>>>>>
>>>>>>> These messages are usually caused by leader migration. I think as
>>>>>>> long as you don't see this lasting forever together with a bunch of
>>>>>>> under-replicated partitions, it should be fine.
>>>>>>>
>>>>>>> Jiangjie (Becket) Qin
>>>>>>>
>>>>>>> On 2/25/15, 4:07 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>>>
>>>>>>>> Need to know whether I should be worried about these or can ignore
>>>>>>>> them.
>>>>>>>>
>>>>>>>> I see tons of these exceptions/warnings in the broker logs, and I'm
>>>>>>>> not sure what causes them or what could be done to fix them.
>>>>>>>>
>>>>>>>> ERROR [ReplicaFetcherThread-3-5], Error for partition [TestTopic]
>>>>>>>> to broker 5:class kafka.common.NotLeaderForPartitionException
>>>>>>>> (kafka.server.ReplicaFetcherThread)
>>>>>>>> [2015-02-25 11:01:41,785] ERROR [ReplicaFetcherThread-3-5], Error
>>>>>>>> for partition [TestTopic] to broker 5:class
>>>>>>>> kafka.common.NotLeaderForPartitionException
>>>>>>>> (kafka.server.ReplicaFetcherThread)
>>>>>>>> [2015-02-25 11:01:41,785] WARN [Replica Manager on Broker 2]: Fetch
>>>>>>>> request with correlation id 950084 from client
>>>>>>>> ReplicaFetcherThread-1-2 on partition [TestTopic,2] failed due to
>>>>>>>> Leader not local for partition [TestTopic,2] on broker 2
>>>>>>>> (kafka.server.ReplicaManager)
>>>>>>>>
>>>>>>>> Any ideas?
>>>>>>>> -Zakee
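The ~69k errors discussed above are easier to reason about once they are bucketed by time and broker: a burst confined to one window points at a one-off leader election, while a steady rate suggests fetchers repeatedly hitting stale leadership. A minimal sketch of such a tally, assuming Python and the exact log-line format quoted in the thread (the function name and hourly bucketing are illustrative choices, not anything from Kafka itself):

```python
import re
from collections import Counter

# Matches fetcher error lines like those quoted above, e.g.:
# [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for
# partition [Topic-11,7] to broker 5:class
# kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
LINE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2},\d+\] ERROR .* "
    r"Error for partition \[([^\]]+)\] to broker (\d+):"
    r"class kafka\.common\.NotLeaderForPartitionException"
)

def tally_not_leader_errors(lines):
    """Count NotLeaderForPartitionException errors per (hour, broker).

    Returns a Counter keyed by ("YYYY-MM-DD HH", broker_id), so the
    distribution over time and target broker is visible at a glance.
    """
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            hour, _partition, broker = m.groups()
            counts[(hour, int(broker))] += 1
    return counts
```

Fed the broker log (e.g. `tally_not_leader_errors(open("server.log"))`, with the filename being whatever your log4j setup writes), this shows whether the exceptions cluster around a single election or keep recurring, which is the distinction drawn in the replies above.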