> Hmm, that sounds like a bug. Can you paste the log of leader rebalance
> here?

Thanks for your suggestions. It looks like the rebalance actually happened only once, soon after I started with a clean cluster and data was pushed. It hasn't happened again so far, and I see the partition leader counts on the brokers have not changed since then. One of the brokers has constantly shown 0 for its partition leader count. Is that normal?
Also, I still see a lot of the errors below (~69k) in the logs since the restart. Is there any reason other than a rebalance for these errors?

[2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-11,7] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
[2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-2,25] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
[2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-2,21] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
[2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-22,9] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)

> Some other things to check are:
> 1. The actual property name is auto.leader.rebalance.enable, not
> auto.leader.rebalance. You’ve probably known this, just to double confirm.

Yes.

> 2. In zookeeper path, can you verify /admin/preferred_replica_election
> does not exist?

ls /admin
[delete_topics]
ls /admin/preferred_replica_election
Node does not exist: /admin/preferred_replica_election

Thanks
Zakee

> On Mar 7, 2015, at 10:49 PM, Jiangjie Qin <j...@linkedin.com.INVALID> wrote:
>
> Hmm, that sounds like a bug. Can you paste the log of leader rebalance
> here?
> Some other things to check are:
> 1. The actual property name is auto.leader.rebalance.enable, not
> auto.leader.rebalance. You’ve probably known this, just to double confirm.
> 2. In zookeeper path, can you verify /admin/preferred_replica_election
> does not exist?
>
> Jiangjie (Becket) Qin
>
> On 3/7/15, 10:24 PM, "Zakee" <kzak...@netzero.net> wrote:
>
>> I started with a clean cluster and started to push data.
>> It still does the rebalance at random durations even though
>> auto.leader.rebalance.enable is set to false.
>>
>> Thanks
>> Zakee
>>
>>> On Mar 6, 2015, at 3:51 PM, Jiangjie Qin <j...@linkedin.com.INVALID>
>>> wrote:
>>>
>>> Yes, the rebalance should not happen in that case. That is a little bit
>>> strange. Could you try to launch a clean Kafka cluster with
>>> auto.leader.rebalance.enable disabled and try to push data?
>>> When leader migration occurs, a NotLeaderForPartition exception is
>>> expected.
>>>
>>> Jiangjie (Becket) Qin
>>>
>>> On 3/6/15, 3:14 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>
>>>> Yes, Jiangjie, I do see lots of these messages ("Starting preferred
>>>> replica leader election for partitions") in the logs. I also see a lot
>>>> of Produce request failure warnings with the NotLeader exception.
>>>>
>>>> I tried switching auto.leader.rebalance.enable to false. I am still
>>>> noticing the rebalance happening. My understanding was that the
>>>> rebalance would not happen when this is set to false.
>>>>
>>>> Thanks
>>>> Zakee
>>>>
>>>>> On Feb 25, 2015, at 5:17 PM, Jiangjie Qin <j...@linkedin.com.INVALID>
>>>>> wrote:
>>>>>
>>>>> I don't think num.replica.fetchers will help in this case. Increasing
>>>>> the number of fetcher threads only helps in cases where you have a
>>>>> large amount of data coming into a broker and more replica fetcher
>>>>> threads help keep up. We usually only use 1-2 for each broker. But in
>>>>> your case, it looks like leader migration is causing the issue.
>>>>> Do you see anything else in the log? Like preferred leader election?
>>>>>
>>>>> Jiangjie (Becket) Qin
>>>>>
>>>>> On 2/25/15, 5:02 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>
>>>>>> Thanks, Jiangjie.
>>>>>>
>>>>>> Yes, I do see under-replicated partitions, usually shooting up every
>>>>>> hour. Anything I could try to reduce it?
>>>>>>
>>>>>> How does "num.replica.fetchers" affect the replica sync?
>>>>>> Currently I have configured 7 on each of 5 brokers.
>>>>>>
>>>>>> -Zakee
>>>>>>
>>>>>> On Wed, Feb 25, 2015 at 4:17 PM, Jiangjie Qin
>>>>>> <j...@linkedin.com.invalid> wrote:
>>>>>>
>>>>>>> These messages are usually caused by leader migration. I think as
>>>>>>> long as you don't see this lasting forever together with a bunch of
>>>>>>> under-replicated partitions, it should be fine.
>>>>>>>
>>>>>>> Jiangjie (Becket) Qin
>>>>>>>
>>>>>>> On 2/25/15, 4:07 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>>>
>>>>>>>> Need to know whether I should be worried about these or can ignore
>>>>>>>> them.
>>>>>>>>
>>>>>>>> I see tons of these exceptions/warnings in the broker logs, and I'm
>>>>>>>> not sure what causes them or what could be done to fix them.
>>>>>>>>
>>>>>>>> ERROR [ReplicaFetcherThread-3-5], Error for partition [TestTopic]
>>>>>>>> to broker 5:class kafka.common.NotLeaderForPartitionException
>>>>>>>> (kafka.server.ReplicaFetcherThread)
>>>>>>>> [2015-02-25 11:01:41,785] ERROR [ReplicaFetcherThread-3-5], Error
>>>>>>>> for partition [TestTopic] to broker 5:class
>>>>>>>> kafka.common.NotLeaderForPartitionException
>>>>>>>> (kafka.server.ReplicaFetcherThread)
>>>>>>>> [2015-02-25 11:01:41,785] WARN [Replica Manager on Broker 2]: Fetch
>>>>>>>> request with correlation id 950084 from client
>>>>>>>> ReplicaFetcherThread-1-2 on partition [TestTopic,2] failed due to
>>>>>>>> Leader not local for partition [TestTopic,2] on broker 2
>>>>>>>> (kafka.server.ReplicaManager)
>>>>>>>>
>>>>>>>> Any ideas?
>>>>>>>> -Zakee
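The ~69k errors discussed above are easier to reason about once they are bucketed by time and broker: a burst confined to one window points at a one-off leader election, while a steady rate suggests fetchers repeatedly hitting stale leadership. A minimal sketch of such a tally, assuming Python and the exact log-line format quoted in the thread (the function name and hourly bucketing are illustrative choices, not anything from Kafka itself):

```python
import re
from collections import Counter

# Matches fetcher error lines like those quoted above, e.g.:
# [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for
# partition [Topic-11,7] to broker 5:class
# kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
LINE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2},\d+\] ERROR .* "
    r"Error for partition \[([^\]]+)\] to broker (\d+):"
    r"class kafka\.common\.NotLeaderForPartitionException"
)

def tally_not_leader_errors(lines):
    """Count NotLeaderForPartitionException errors per (hour, broker).

    Returns a Counter keyed by ("YYYY-MM-DD HH", broker_id), so the
    distribution over time and target broker is visible at a glance.
    """
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            hour, _partition, broker = m.groups()
            counts[(hour, int(broker))] += 1
    return counts
```

Fed the broker log (e.g. `tally_not_leader_errors(open("server.log"))`, with the filename being whatever your log4j setup writes), this shows whether the exceptions cluster around a single election or keep recurring, which is the distinction drawn in the replies above.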