Hi,

Our production environment runs a Kafka 0.9.0.1 cluster of 12 m3.large
nodes. Each broker holds ~450 partitions and is the leader for 30-40% of
them. The average load is ~3K messages/s, with ~10 MB/s of bytes in and
~60 MB/s of bytes out.

We observed strange behaviour when taking one instance down by terminating
it on AWS:

After one Kafka instance was taken down, leadership of the partitions it
led was transferred to other nodes. All nodes increased their CPU usage,
and one of them started consuming around 100% CPU. Restarting that node
does not help, because the high CPU usage is then picked up by another
node. This behaviour continues for around 30 minutes.

Over the last two months, we have experienced this issue several times a
day.

Do you know anything about this problem?
-- 

With great enthusiasm,
Andrey
