I believe I've had the same problem on 0.8.2 rc2. We had an idle test
cluster with unknown health status, and I applied rc3 without first checking
that everything was OK. Since that cluster had been doing nothing for a
couple of days and the retention time was 48 hours, it's reasonable to
assume
I have a 9-node Kafka cluster, and all of the brokers just started spouting
the following error:
ERROR [Replica Manager on Broker 1]: Error when processing fetch request
for partition [mytopic,57] offset 0 from follower with correlation id
58166. Possible cause: Request for offset 0 but we only
Digging in a bit more, it appears that the down broker had likely
partially failed. Thus, it was still attempting to fetch offsets that no
longer exist. Does this make sense as an explanation of the
above-mentioned behavior?
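
To make the hypothesis above concrete, here is a minimal sketch of the range check, assuming the leader only serves offsets between the oldest retained one and the log end. This is a toy model, not Kafka's actual code: the class, the method, and the offset values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogModel:
    """Toy model of one partition's log on the leader (hypothetical, not a Kafka class)."""
    log_start_offset: int  # oldest offset still retained after retention cleanup
    log_end_offset: int    # offset the next appended message would receive

    def check_fetch(self, fetch_offset: int) -> str:
        # A fetch can only be served if the requested offset is still in the log.
        if fetch_offset < self.log_start_offset or fetch_offset > self.log_end_offset:
            return "OFFSET_OUT_OF_RANGE"
        return "OK"


# Hypothetical values: retention has already removed everything below 1200.
leader = LogModel(log_start_offset=1200, log_end_offset=1500)
print(leader.check_fetch(0))     # OFFSET_OUT_OF_RANGE
print(leader.check_fetch(1300))  # OK
```

In this model, a follower that restarts from offset 0 after the leader has aged out its old segments would hit the out-of-range branch on every fetch, which would be consistent with the repeated error.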
On Thu, Feb 5, 2015 at 10:58 AM, Kyle Banker kyleban...@gmail.com
Dug into this a bit more, and it turns out that we lost one of our 9
brokers at the exact moment when this started happening. At the time that
we lost the broker, we had no under-replicated partitions. Since the broker
disappeared, we've had a fairly constant number of under-replicated
partitions.
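
For reference, a rough sketch of why losing one broker would leave a constant under-replicated count: a partition is under-replicated when its in-sync replica set (ISR) is smaller than its assigned replica list. The assignments below are invented, and treating the ISR as "every assigned replica except the lost broker" is a simplifying assumption.

```python
def under_replicated(assignments, isr):
    """Return partitions whose ISR is smaller than their assigned replica list."""
    return sorted(
        partition
        for partition, replicas in assignments.items()
        if len(isr.get(partition, [])) < len(replicas)
    )


# Hypothetical replica assignments: (topic, partition) -> broker ids.
assignments = {
    ("mytopic", 0): [1, 2, 3],
    ("mytopic", 1): [2, 3, 9],
    ("mytopic", 2): [9, 1, 4],
}

# Assumption: broker 9 is the one that was lost, and all other replicas are in sync.
lost_broker = 9
isr = {p: [b for b in replicas if b != lost_broker] for p, replicas in assignments.items()}

print(under_replicated(assignments, isr))
# [('mytopic', 1), ('mytopic', 2)]
```

Every partition that had a replica on the lost broker stays in that list until the broker rejoins and catches up, or its replicas are reassigned, so the count would not be expected to change on its own.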