Re: kafka.server.ReplicaManager error

2015-02-05 Thread svante karlsson
I believe I've had the same problem on the 0.8.2 rc2. We had an idle test cluster with unknown health status and I applied rc3 without checking if everything was ok before. Since that cluster had been doing nothing for a couple of days and the retention time was 48 hours it's reasonable to assume

kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
I have a 9-node Kafka cluster, and all of the brokers just started spouting the following error: ERROR [Replica Manager on Broker 1]: Error when processing fetch request for partition [mytopic,57] offset 0 from follower with correlation id 58166. Possible cause: Request for offset 0 but we only

Re: kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
Digging in a bit more, it appears that the down broker had likely partially failed. Thus, it was still attempting to fetch offsets that no longer exist. Does this make sense as an explanation of the above-mentioned behavior? On Thu, Feb 5, 2015 at 10:58 AM, Kyle Banker kyleban...@gmail.com

Re: kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
Dug into this a bit more, and it turns out that we lost one of our 9 brokers at the exact moment when this started happening. At the time that we lost the broker, we had no under-replicated partitions. Since the broker disappeared, we've had a fairly constant number of under-replicated partitions.
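
For readers following the thread, here is a rough sketch of what "under-replicated" means in the message above: a partition counts as under-replicated when its in-sync replica (ISR) set is smaller than its assigned replica set. This is a toy model, not Kafka code. The `PartitionState` class, the `under_replicated` helper, the broker ids, and the assumption that broker 3 is the one that was lost are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PartitionState:
    """Hypothetical snapshot of one partition's replica assignment."""
    topic: str
    partition: int
    replicas: Tuple[int, ...]  # broker ids assigned to host this partition
    isr: Tuple[int, ...]       # broker ids currently in sync with the leader


def under_replicated(partitions: List[PartitionState]) -> List[PartitionState]:
    """Return partitions whose ISR is smaller than the assigned replica set."""
    return [p for p in partitions if len(p.isr) < len(p.replicas)]


# Illustrative data only: assume broker 3 has dropped out of the cluster.
states = [
    PartitionState("mytopic", 57, replicas=(1, 2, 3), isr=(1, 2)),
    PartitionState("mytopic", 58, replicas=(4, 5, 6), isr=(4, 5, 6)),
]

for p in under_replicated(states):
    missing = sorted(set(p.replicas) - set(p.isr))
    print(f"[{p.topic},{p.partition}] under-replicated, missing brokers: {missing}")
```

Under this model, every partition that had a replica on the lost broker stays under-replicated until that broker rejoins the ISR or the partition is reassigned, which would be consistent with the fairly constant count described above.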