Perhaps you can upgrade all brokers and then try?

Thanks,

Jun

On Wed, Jan 21, 2015 at 9:53 PM, Raghu Udiyar <ra...@helpshift.com> wrote:

> No errors in the state-change log or the controller. It's as if the
> controller never got the request for that partition.
>
> Regarding the upgrade, we did upgrade one of the nodes and initiated the
> replication. Here, the controller is at 0.8.0 and this node is at 0.8.1.1.
> In this case, when we initiated the reassignment, the following error was
> logged on the destination broker:
>
> [2015-01-21 13:36:18,101] WARN Broker 7 ignoring LeaderAndIsr request with
> correlation id 837 from controller 1 epoch 44 as broker is not in assigned
> replica list 5,6 for partition [test-topic,0] (state.change.logger)
>
> 5,6 are the brokers we were moving the topic from, to 7. 7 is the new
> broker at 0.8.1.1.
>
> I'm guessing the controller is in a weird state.
>
> -- Raghu
>
>
> On Thu, Jan 22, 2015 at 5:57 AM, Jun Rao <j...@confluent.io> wrote:
>
> > Any error in the controller and state-change log? Also, you may want to
> > upgrade to 0.8.1, which fixed some reassignment issues.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Jan 21, 2015 at 12:38 PM, Raghu Udiyar <ra...@helpshift.com>
> > wrote:
> >
> > > Hello,
> > >
> > > I have a 6 node kafka cluster (0.8.0) where partition reassignment
> > > doesn't seem to work on a few partitions. This happens within the
> > > same topic, as well as across other topics. Following is the behavior
> > > observed:
> > >
> > > 1. For a successful reassignment, the kafka-reassign-partitions.sh
> > > returns success, I see the controller initiating the reassignment,
> > > and the destination brokers start replica fetcher threads.
> > > 2. For the unsuccessful reassignment, the tool returns success, but
> > > there is nothing in the controller logs nor the destination brokers.
> > >
> > > Also, for the ones that are successful, some don’t finish replication
> > > correctly. I can see that the destination brokers get stuck after a
> > > few thousand offsets (checked in JMX), and don’t move after that. The
> > > controller keeps on waiting for the fetchers to complete, but never
> > > gets there.
> > >
> > > Anyone seen this issue before? Is there a way to reset the state of
> > > the controller? Or re-elect a new one?
> > >
> > > Thanks,
> > > Raghu
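
For anyone scanning their own state-change logs for the same symptom, the WARN line quoted in this thread can be picked apart with a small script. This is only a rough sketch: the regular expression is written against that single quoted line, the function name `parse_ignored_request` is made up for illustration, and other Kafka versions may word the message differently.

```python
import re

# Pattern derived from the one WARN line quoted in this thread (an assumption:
# other Kafka versions may phrase the message differently).
IGNORED_REQUEST = re.compile(
    r"Broker (?P<broker>\d+) ignoring LeaderAndIsr request .*?"
    r"from controller (?P<controller>\d+) epoch (?P<epoch>\d+) "
    r"as broker is not in assigned replica list (?P<replicas>[\d,]+) "
    r"for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\]"
)


def parse_ignored_request(line):
    """Return the fields of an 'ignoring LeaderAndIsr request' line, or None.

    Hypothetical helper, not part of Kafka. Whitespace is collapsed first so
    that a message wrapped across several lines (as in an email) still matches.
    """
    match = IGNORED_REQUEST.search(" ".join(line.split()))
    if match is None:
        return None
    replicas = [int(b) for b in match.group("replicas").split(",")]
    broker = int(match.group("broker"))
    return {
        "broker": broker,
        "controller": int(match.group("controller")),
        "epoch": int(match.group("epoch")),
        "assigned_replicas": replicas,
        "topic": match.group("topic"),
        "partition": int(match.group("partition")),
        # True when the receiving broker is absent from the replica list the
        # controller sent, which is the condition the broker complains about.
        "broker_missing_from_assignment": broker not in replicas,
    }


if __name__ == "__main__":
    sample = (
        "[2015-01-21 13:36:18,101] WARN Broker 7 ignoring LeaderAndIsr request "
        "with correlation id 837 from controller 1 epoch 44 as broker is not in "
        "assigned replica list 5,6 for partition [test-topic,0] "
        "(state.change.logger)"
    )
    print(parse_ignored_request(sample))
```

Run over a state-change log, this makes it easy to list which partitions a destination broker refused and which replica list the controller believed was assigned at the time.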