Our 2-node Kafka cluster has become unhealthy. We're running ZooKeeper as
a 3-node ensemble, with very light load.

What seems to be happening is that the controller log shows a ZK session
expiration message, and while the leader for the partitions is being
re-assigned (if I'm understanding this right, please correct me), the broker
goes offline, which interrupts our applications that are publishing
messages.

We don't see this in production, and Kafka has been stable for months,
since September.

I've searched a lot and found some similar complaints but no real
solutions.

I'm running Kafka 0.8.2 on JVM 1.6.x on Ubuntu.

Thanks for any ideas at all.
