Thanks a lot, Prabhjot! The issue is mitigated by running the preferred replica leader election tool. Before that, the cluster simply could not perform leader election---when I created a new topic, that topic was unavailable for a long time, until the preferred replica leader election finished.
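For reference, the tool in question ships with the Kafka distribution; a minimal invocation looks roughly like this (the ZooKeeper connect string and the JSON file name are placeholders):

```shell
# Trigger preferred replica leader election for ALL partitions
# (zk1:2181 is an assumed ZooKeeper address).
bin/kafka-preferred-replica-election.sh --zookeeper zk1:2181

# Or limit it to specific partitions via a JSON file, e.g. election.json:
#   {"partitions": [{"topic": "userlogs", "partition": 84}]}
bin/kafka-preferred-replica-election.sh --zookeeper zk1:2181 \
  --path-to-json-file election.json
```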
For the 3 steps above:
1. The replicas are evenly distributed.
2. There is some imbalance in the load among brokers, but it is not significant. Some brokers may have gone down and come back up, though---we have an agent that restarts them automatically.
3. Spark runs on a separate set of machines. The Kafka servers' CPU/memory usage is well below 50%.

On Mon, Nov 23, 2015 at 11:18 PM, Prabhjot Bharaj <prabhbha...@gmail.com> wrote:

> Hi,
>
> With the information provided, these are the steps I can think of (based
> on the experience I had with Kafka):
>
> 1. Do a describe on the topic. See if the partitions and replicas are
> evenly distributed amongst all brokers. If not, you might want to try the
> 'Reassign Partitions Tool':
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-6.ReassignPartitionsTool
> 2. Are some partitions getting more data than others, leading to an
> imbalance of disk space amongst the nodes in the cluster, to the extent
> that the Kafka server process goes down on one or more machines?
> 3. From what I understand, your Kafka and Spark machines are the same?
> How much memory is replica 0 using when your Spark cluster is running at
> full throttle?
>
> Workaround:
> Try running the Preferred Replica Leader Election Tool -
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-1.PreferredReplicaLeaderElectionTool
> to make some replica (the one that you noticed earlier when the cluster
> was all good) the leader for this partition.
>
> Regards,
> Prabhjot
>
> On Tue, Nov 24, 2015 at 7:11 AM, Gwen Shapira <g...@confluent.io> wrote:
>
> > We fixed many, many bugs since August. Since we are about to release
> > 0.9.0 (with SSL!), maybe wait a day and go with a released and tested
> > version.
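The describe-and-reassign steps Prabhjot suggests can be sketched as follows; hosts, topic names, and file names are placeholders, and the broker list should reflect your actual cluster:

```shell
# 1. Describe the topic to check how leaders, replicas, and ISRs
#    are distributed across brokers.
bin/kafka-topics.sh --zookeeper zk1:2181 --describe --topic userlogs

# 2. If replicas are skewed, generate a reassignment plan.
#    topics.json lists the topics to rebalance:
#      {"topics": [{"topic": "userlogs"}], "version": 1}
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --topics-to-move-json-file topics.json \
  --broker-list "0,1,2,3,4,5,6,7,8,9" --generate

# 3. Save the proposed plan to reassign.json, then execute and verify.
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassign.json --execute
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassign.json --verify
```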
> >
> > On Mon, Nov 23, 2015 at 3:01 PM, Qi Xu <shkir...@gmail.com> wrote:
> >
> > > Forgot to mention that the Kafka version we're using is from August's
> > > trunk branch---which has the SSL support.
> > >
> > > Thanks again,
> > > Qi
> > >
> > > On Mon, Nov 23, 2015 at 2:29 PM, Qi Xu <shkir...@gmail.com> wrote:
> > >
> > >> Looping in another guy from our team.
> > >>
> > >> On Mon, Nov 23, 2015 at 2:26 PM, Qi Xu <shkir...@gmail.com> wrote:
> > >>
> > >>> Hi folks,
> > >>> We have a 10-node cluster with several topics. Each topic has about
> > >>> 256 partitions with a replication factor of 3. Now we've run into
> > >>> an issue where, in some topics, a few partitions (< 10) have leader
> > >>> -1, and each of them has only one in-sync replica.
> > >>>
> > >>> From Kafka Manager, here's the snapshot:
> > >>> [image: Inline image 2]
> > >>> [image: Inline image 1]
> > >>>
> > >>> Here's the state log:
> > >>> [2015-11-23 21:57:58,598] ERROR Controller 1 epoch 435499 initiated
> > >>> state change for partition [userlogs,84] from OnlinePartition to
> > >>> OnlinePartition failed (state.change.logger)
> > >>> kafka.common.StateChangeFailedException: encountered error while
> > >>> electing leader for partition [userlogs,84] due to: Preferred
> > >>> replica 0 for partition [userlogs,84] is either not alive or not in
> > >>> the isr. Current leader and ISR:
> > >>> [{"leader":-1,"leader_epoch":203,"isr":[1]}].
> > >>> Caused by: kafka.common.StateChangeFailedException: Preferred
> > >>> replica 0 for partition [userlogs,84] is either not alive or not in
> > >>> the isr. Current leader and ISR:
> > >>> [{"leader":-1,"leader_epoch":203,"isr":[1]}]
> > >>>
> > >>> My questions are:
> > >>> 1) How could this happen, and how can I fix it or work around it?
> > >>> 2) Is 256 partitions too many? We have about 200+ cores for the
> > >>> Spark streaming job.
> > >>>
> > >>> Thanks,
> > >>> Qi
> > >>>
> > >>
> > >
> >
>
> --
> ---------------------------------------------------------
> "There are only 10 types of people in the world: Those who understand
> binary, and those who don't"
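As a footnote to the state-change errors quoted above: the leader/ISR JSON that appears in the log is stored in ZooKeeper, so it can be inspected directly. A sketch, assuming the standard znode layout and a placeholder ZooKeeper host:

```shell
# Read the leader/ISR state for partition 84 of topic "userlogs".
# The returned JSON should match what the controller's state-change
# log printed, e.g. {"leader":-1,"leader_epoch":203,"isr":[1],...}
bin/zookeeper-shell.sh zk1:2181 \
  get /brokers/topics/userlogs/partitions/84/state
```

A leader of -1 with a single-broker ISR, as here, means no preferred-replica election can succeed until the preferred replica (broker 0) rejoins the ISR or unclean election is allowed.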