Juha Mynttinen created KAFKA-17751:
--------------------------------------
Summary: Controller high CPU when formatted with
--initial-controllers
Key: KAFKA-17751
URL: https://issues.apache.org/jira/browse/KAFKA-17751
Project: Kafka
Issue Type: Bug
Affects Versions: 3.9.0
Reporter: Juha Mynttinen
Attachments: Screenshot 2024-10-09 at 9.15.06.png, c1.properties,
c2.properties, c3.properties, c4.properties
Hey,
I'm using 3.9.0 RC0.
I noticed that formatting a simple three-node controller cluster with
--initial-controllers and starting the controllers leads to a situation where
the non-leader voters consume a lot of CPU.
Here are the steps to reproduce. The needed configuration files are attached.
Clean up and set up the environment.
rm -rf /tmp/controllers && \
mkdir -p /tmp/controllers/c1 && \
mkdir -p /tmp/controllers/c2 && \
mkdir -p /tmp/controllers/c3 && \
mkdir -p /tmp/controllers/c4
export KAFKA_HOME=<your_kafka_3_9_home>
Format the controllers
$KAFKA_HOME/bin/kafka-storage.sh format \
  --cluster-id 00000000-0000-0000-0000-000000000001 \
  --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA \
  --config c1.properties
$KAFKA_HOME/bin/kafka-storage.sh format \
  --cluster-id 00000000-0000-0000-0000-000000000001 \
  --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA \
  --config c2.properties
$KAFKA_HOME/bin/kafka-storage.sh format \
  --cluster-id 00000000-0000-0000-0000-000000000001 \
  --initial-controllers 1001@localhost:10001:AAAAAAAAAAEAAAAAAAAAAA,1002@localhost:10002:AAAAAAAAAAEAAAAAAAAAAA,1003@localhost:10003:AAAAAAAAAAEAAAAAAAAAAA \
  --config c3.properties
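The three format commands differ only in the config file, and the --initial-controllers value is a comma-separated list of id@host:port:directory-id entries. As a rough sketch, the value could be built once and reused. The variable names CLUSTER_ID, DIR_ID and INITIAL_CONTROLLERS are made up for illustration, and the loop only prints the commands (a dry run) rather than running them.

```shell
#!/bin/sh
# Hypothetical helper variables; the values are copied from the commands above.
CLUSTER_ID="00000000-0000-0000-0000-000000000001"
DIR_ID="AAAAAAAAAAEAAAAAAAAAAA"

# Build "1001@localhost:10001:<dir-id>,1002@...,1003@..." one entry at a time.
INITIAL_CONTROLLERS=""
for i in 1 2 3; do
  entry="100${i}@localhost:1000${i}:${DIR_ID}"
  # Prepend a comma only when the list is already non-empty.
  INITIAL_CONTROLLERS="${INITIAL_CONTROLLERS:+${INITIAL_CONTROLLERS},}${entry}"
done

# Dry run: print the three format commands instead of executing them.
for i in 1 2 3; do
  echo "\$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id $CLUSTER_ID" \
       "--initial-controllers $INITIAL_CONTROLLERS --config c${i}.properties"
done
```

Dropping the echo wrapper would run the commands for real, assuming KAFKA_HOME is exported as in the setup step.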
Start the controllers, in separate terminals
$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c1.properties
$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c2.properties
$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c3.properties
Observe that two of the controllers have CPU usage at 100%. If you check which
PID is which, you can see that the two processes with elevated CPU are the
non-leader voters. The CPU usage of the leader is fine.
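One possible way to map PIDs to controllers, sketched under the assumption that the processes were started exactly as above, so that the properties file is the last argument on the command line. The function name filter_kafka_cpu is hypothetical, and it relies only on the POSIX ps options -A and -o.

```shell
#!/bin/sh
# Hypothetical helper: reads "pid pcpu args..." lines on stdin and prints
# "pid pcpu last-argument" for processes whose command line has kafka.Kafka.
filter_kafka_cpu() {
  awk '/kafka\.Kafka/ { print $1, $2, $NF }'
}

# List every process with PID, CPU percentage and full command line,
# then keep only the Kafka controllers.
ps -A -o pid= -o pcpu= -o args= | filter_kafka_cpu
```

Each output line then shows a PID, its CPU percentage and the config file (c1.properties, c2.properties or c3.properties) it was started with.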
I did some profiling in a slightly different environment. The screenshot is
attached.