rajadayalan perumalsamy created KAFKA-7012:
----------------------------------------------
Summary: Performance issue upgrading to kafka 1.0.1 or 1.1
Key: KAFKA-7012
URL: https://issues.apache.org/jira/browse/KAFKA-7012
Project: Kafka
Issue Type: Bug
Affects Versions: 1.0.1, 1.1.0, 0.11.0.1
Reporter: rajadayalan perumalsamy
Attachments: Commit-47ee8e954-profile.png,
Commit-47ee8e954-profile2.png, Commit-f15cdbc91b-profile.png
We are trying to upgrade a Kafka cluster from Kafka 0.11.0.1 to Kafka 1.0.1.
After upgrading one node of the cluster, we noticed that the network threads
use most of the CPU. It is a 3-node cluster handling 15k messages/sec on each
node. With Kafka 0.11.0.1, typical CPU usage on the servers is around 50 to 60%
(less than 1 vCPU). After the upgrade, CPU usage is high and scales with the
number of network threads configured. If network threads is set to 8, CPU usage
is around 850% (9 vCPUs), and if it is set to 4, CPU usage is around 450%
(5 vCPUs). The same Kafka server.properties is used in both cases.
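As a rough check on the figures above, the sketch below divides the reported broker CPU usage by the network thread count. The numbers come from this report; the function name is illustrative, and the calculation assumes that nearly all of the CPU is consumed by the network threads, as the attached profiles suggest.

```python
def cpu_per_network_thread(total_cpu_percent: float, network_threads: int) -> float:
    """Average CPU share per network thread, in percent of one vCPU."""
    if network_threads <= 0:
        raise ValueError("network_threads must be positive")
    return total_cpu_percent / network_threads


# (network thread count, observed broker CPU %) as reported after the upgrade
observations = [(8, 850.0), (4, 450.0)]

for threads, cpu in observations:
    share = cpu_per_network_thread(cpu, threads)
    print(f"{threads} network threads: about {share:.0f}% of a vCPU per thread")
```

Both settings work out to a little over 100% per thread, which is consistent with each network thread keeping roughly one vCPU busy regardless of load.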
Further analysis with git bisect, over a couple of builds and deploys, traced
the issue to commit 47ee8e954df62b9a79099e944ec4be29afe046f6. CPU usage is fine
at commit f15cdbc91b240e656d9a2aeb6877e94624b21f8d, but it increases with commit
47ee8e954df62b9a79099e944ec4be29afe046f6. We have attached screenshots of
profiling done with both commits. Screenshot Commit-f15cdbc91b-profile shows
lower CPU usage by the network threads, while screenshots
Commit-47ee8e954-profile and Commit-47ee8e954-profile2 show higher CPU usage
(almost the entire CPU usage) by the network threads. We also noticed that the
kafka.network.Processor.poll() method is invoked 10 times more often with
commit 47ee8e954df62b9a79099e944ec4be29afe046f6.
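The bisect described above involved building and deploying each candidate commit by hand. As a minimal sketch of how the good/bad decision could be scripted for `git bisect run`, the helper below maps a measured broker CPU percentage to the exit codes that command understands (0 for good, 1 for bad, 125 for skip). The function name and the 100% threshold are assumptions chosen for illustration; how the CPU figure is measured is left out.

```python
# Exit codes interpreted by `git bisect run`
GOOD, BAD, SKIP = 0, 1, 125


def bisect_exit_code(cpu_percent, threshold_percent=100.0):
    """Classify a commit from its measured broker CPU usage.

    cpu_percent is None when the commit could not be built or measured,
    in which case the commit is skipped rather than judged.
    The threshold is a hypothetical cut-off between the ~50-60% seen
    before the regression and the 450%+ seen after it.
    """
    if cpu_percent is None:
        return SKIP
    return BAD if cpu_percent > threshold_percent else GOOD


# Example readings: pre-regression, post-regression, and a failed build
for reading in (55.0, 850.0, None):
    print(reading, "->", bisect_exit_code(reading))
```

A wrapper script would pass the returned value to `sys.exit()` so that `git bisect run` can mark the commit accordingly.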
We need this issue resolved before we can upgrade the cluster. Please let me
know if you need any additional information.