Tom Coupland created KAFKA-4884:
-----------------------------------
Summary: __consumer_offsets topic processing consuming all
resources
Key: KAFKA-4884
URL: https://issues.apache.org/jira/browse/KAFKA-4884
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.10.1.0
Environment: Mesos cluster, coreos
Reporter: Tom Coupland
Since this morning it appears that the processing for the __consumer_offsets
topic is consuming all the resources in our test cluster. There are no other
messages being dispatched through other topics, yet the brokers are using all
of their CPU and a lot of network bandwidth.
A clear sign that the problem is with the special topic is that when I deleted
some test topics (leaving three topics remaining, including __consumer_offsets)
the network load decreased somewhat: not enough to fix the problem, but enough
to point the finger firmly in this direction.
The rate at which offsets advance on the __consumer_offsets topic seems overly
high. I'm summing the total offset across all 50 partitions and it's on the
order of 22000 every ten seconds, dropping to 17000 when I deleted the spare
test topics.
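For anyone reproducing the measurement, here is a minimal sketch of the summing step. The report does not say how the per-partition offsets were collected, so this assumes they are available as text lines of the form topic:partition:offset (the format printed by Kafka's GetOffsetShell tool). The function name sum_partition_offsets is hypothetical.

```python
def sum_partition_offsets(lines, topic="__consumer_offsets"):
    """Sum the latest offset of every partition of `topic`.

    Assumption: each input line looks like 'topic:partition:offset'.
    Lines for other topics, blank lines, and malformed lines are skipped.
    Returns (total_offset, number_of_partitions_seen).
    """
    total = 0
    partitions = set()
    for line in lines:
        parts = line.strip().rsplit(":", 2)
        if len(parts) != 3:
            continue  # blank or malformed line
        name, partition, offset = parts
        if name != topic or not offset:
            continue  # different topic, or no offset reported
        partitions.add(int(partition))
        total += int(offset)
    return total, len(partitions)
```

Returning the partition count alongside the total makes it easy to confirm that all 50 partitions were included in a given sample.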
These are timestamps and the summed offsets across all partitions, taken
before the test topics were deleted:
Fri 10 Mar 18:57:38 GMT 2017
114700933
Fri 10 Mar 18:57:56 GMT 2017
114727290
Fri 10 Mar 18:58:12 GMT 2017
114750030
Fri 10 Mar 18:58:31 GMT 2017
114776560
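The samples above are not evenly spaced, so it can help to normalise them to a per-second rate. The following is a small sketch that does this using only the four readings listed; the names SAMPLES, parse and rates are made up for illustration, and the timestamp parsing assumes an English locale for the day and month abbreviations.

```python
from datetime import datetime

# The four (timestamp, summed offset) readings from the report.
SAMPLES = [
    ("Fri 10 Mar 18:57:38 GMT 2017", 114700933),
    ("Fri 10 Mar 18:57:56 GMT 2017", 114727290),
    ("Fri 10 Mar 18:58:12 GMT 2017", 114750030),
    ("Fri 10 Mar 18:58:31 GMT 2017", 114776560),
]


def parse(ts):
    # "GMT" is matched as a literal; all samples share the same zone.
    return datetime.strptime(ts, "%a %d %b %H:%M:%S GMT %Y")


def rates(samples):
    """Return (seconds, offset_delta, offsets_per_second) per interval."""
    out = []
    for (t0, o0), (t1, o1) in zip(samples, samples[1:]):
        seconds = (parse(t1) - parse(t0)).total_seconds()
        delta = o1 - o0
        out.append((seconds, delta, delta / seconds))
    return out


for seconds, delta, per_second in rates(SAMPLES):
    print(f"{seconds:.0f}s  +{delta}  {per_second:.1f} offsets/s")
```

The intervals come out at 18, 16 and 19 seconds, and each one works out to roughly 1,400 offsets per second summed across all partitions.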
There is nothing in the broker logs pointing to any errors; in fact, there is
little to go on. Attempting to attach a consumer to the topic just results in
a hanging process.
It feels like the topic is being looped back on itself, creating offset updates
for its own updates or something like that. I'm leaving the cluster up for the
weekend so we can continue to diagnose, but it seems like there must be a bug
at the root of this.