[ https://issues.apache.org/jira/browse/KAFKA-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439158#comment-13439158 ]
Chris Riccomini commented on KAFKA-479:
---------------------------------------

A dump of top. You can see, my proc starts many consumers, but only one is pegged. This seems consistent with running in an infinite loop in ZooKeeper's ClientCnxn.run method (1134 in ClientCnxn on 3.3.4 - http://grepcode.com/file/repo1.maven.org/maven2/org.apache.zookeeper/zookeeper/3.3.4/org/apache/zookeeper/ClientCnxn.java ).

top - 16:41:30 up 28 days,  6:19, 15 users,  load average: 0.66, 0.53, 0.43
Tasks: 1185 total,   4 running, 1181 sleeping,   0 stopped,   0 zombie
Cpu(s): 17.5%us,  1.3%sy,  0.0%ni, 81.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  57680240k total, 15203084k used, 42477156k free,  1307288k buffers
Swap: 67106808k total,        0k used, 67106808k free,  5285588k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5821 criccomi  20   0 5180m 861m  14m R 92.1  1.5   1:48.91 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 5830 criccomi  20   0 5180m 861m  14m S  5.8  1.5   0:42.55 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 4951 criccomi  20   0 1127m 281m  30m S  3.8  0.5   1624:07 /opt/google/chrome/chrome --type=plugin --plugin-path=/usr/lib64/flash-plugin/libflashplayer.so --lang=en-US --channel=4730.15.234897314
 2791 root      20   0  353m 222m  14m S  2.9  0.4 148:10.84 /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-mcXc9U/database -nolisten tcp vt1
 5823 criccomi  20   0 5180m 861m  14m S  2.9  1.5   0:02.15 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 5824 criccomi  20   0 5180m 861m  14m S  2.9  1.5   0:02.16 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 6165 criccomi  20   0 5180m 861m  14m S  2.9  1.5   0:02.41 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 6168 criccomi  20   0 5180m 861m  14m S  2.9  1.5   0:02.27 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 6293 criccomi  20   0 35940 3348 1764 R  2.9  0.0   0:00.20 top
 5822 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.14 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 5825 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.14 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 5826 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.11 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 5827 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.10 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 5828 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.14 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 5829 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.12 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 6156 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.28 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 6157 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.33 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 6158 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.20 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/
 6159 criccomi  20   0 5180m 861m  14m S  1.9  1.5   0:02.35 java -agentlib:hprof=cpu=samples,interval=20,depth=30 -Xmx512M -cp :bin/../lib/activation-1.1.jar:bin/../lib/aopalliance-1.0.jar:bin/../lib/asm-3.2.jar:bin/../lib/aspectjrt-1.6.5.jar:bin/../lib/avro-1.4.0.jar:bin/../lib/

> ZK EPoll taking 100% CPU usage with Kafka Client
> ------------------------------------------------
>
>                 Key: KAFKA-479
>                 URL: https://issues.apache.org/jira/browse/KAFKA-479
>             Project: Kafka
>          Issue Type: Bug
>         Environment: java 1.6.0_21
> Kafka 0.7.14
>            Reporter: Chris Riccomini
>            Priority: Blocker
>         Attachments: java.hprof.txt
>
>
> Hey Guys,
> I'm seeing a very strange bug after I upgraded to Kafka 0.7.14. On my
> consumer side, the process seems to run with very high CPU usage. I turned on
> HPROF, and it's showing that 75% of the CPU usage is going to Java's NIO
> EPoll/select calls.
> There are supposedly work arounds that Mina and a few others employ. This
> seems like more of a ZK bug- have you guys seen this before?
> Any ideas?
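
To tie the pegged entry in the top dump to a Java stack, one option is to convert its ID to the hexadecimal form that HotSpot thread dumps (for example from jstack) print as `nid=0x...`. The sketch below assumes the top output was captured in thread mode, so that the PID column holds native thread IDs; the identical VIRT/RES values across the java rows suggest this, but the comment does not say so. `NativeThreadId` is a hypothetical helper, not part of Kafka or ZooKeeper.

```java
// Hypothetical helper for illustration only; not part of Kafka or ZooKeeper.
final class NativeThreadId {
    private NativeThreadId() {}

    /**
     * Formats a decimal native thread ID (as shown by top in thread mode)
     * the way HotSpot thread dumps print it: "nid=0x" followed by lowercase hex.
     */
    static String toNid(long tid) {
        if (tid < 0) {
            throw new IllegalArgumentException("thread ID must be non-negative: " + tid);
        }
        return "nid=0x" + Long.toHexString(tid);
    }

    public static void main(String[] args) {
        // 5821 is the entry at 92.1% CPU in the top dump; treating it as a
        // native thread ID is the assumption stated in the lead-in.
        System.out.println(5821L + " -> " + toNid(5821L)); // prints "5821 -> nid=0x16bd"
    }
}
```

Searching a thread dump of the process for the printed `nid` value would show which thread is spinning, and whether its stack is in fact inside ClientCnxn.run.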