[ https://issues.apache.org/jira/browse/KAFKA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210092#comment-15210092 ]
Cosmin Marginean edited comment on KAFKA-493 at 3/24/16 11:19 AM: ------------------------------------------------------------------ I can confirm that we're been running 0.9.0.1 for over a week now and none of the CPU issues are present anymore (See 0.9.0.1-upgrade.png). I will make sure to update this if it regresses. was (Author: cosmin.marginean): I can confirm that we're been running 0.9.0.1 for over a week now and none of the CPU issues are present anymore (See 0.9.0.1-upgrade.png. I will make sure to update this if it regresses. > High CPU usage on inactive server > --------------------------------- > > Key: KAFKA-493 > URL: https://issues.apache.org/jira/browse/KAFKA-493 > Project: Kafka > Issue Type: Bug > Components: core > Affects Versions: 0.8.0 > Reporter: Jay Kreps > Attachments: Kafka-2014-11-10.snapshot.zip, Kafka-sampling1.zip, > Kafka-sampling2.zip, Kafka-sampling3.zip, Kafka-trace1.zip, Kafka-trace2.zip, > Kafka-trace3.zip, backtraces.txt, stacktrace.txt > > > > I've been playing with the 0.8 branch of Kafka and noticed that idle CPU > > usage is fairly high (13% of a > > core). Is that to be expected? I did look at the stack, but didn't see > > anything obvious. A background > > task? > > I wanted to mention how I am getting into this state. I've set up two > > machines with the latest 0.8 > > code base and am using a replication factor of 2. On starting the brokers > > there is no idle CPU activity. > > Then I run a test that essential does 10k publish operations followed by > > immediate consume operations > > (I was measuring latency). Once this has run the kafka nodes seem to > > consistently be consuming CPU > > essentially forever. > hprof results: > THREAD START (obj=500003ae, id = 200004, name="RMI TCP Accept-0", > group="system") > THREAD START (obj=500003ae, id = 200005, name="RMI TCP Accept-9999", > group="system") > THREAD START (obj=500003ae, id = 200006, name="RMI TCP Accept-0", > group="system") > THREAD START (obj=500003ae, id = 200001, name="main", group="main") > THREAD START (obj=500003ae, id = 200007, name="Thread-2", group="main") > THREAD START (obj=500003ae, id = 200008, name="Thread-3", group="main") > THREAD START (obj=500003ae, id = 200009, name="kafka-processor-9092-0", > group="main") > THREAD START (obj=500003ae, id = 200010, name="kafka-processor-9092-1", > group="main") > THREAD START (obj=500003ae, id = 200011, name="kafka-acceptor", group="main") > THREAD START (obj=5000074b, id = 200012, > name="ZkClient-EventThread-20-localhost:2181", group="main") > THREAD START (obj=5000076e, id = 200014, name="main-SendThread()", > group="main") > THREAD START (obj=5000076d, id = 200013, name="main-EventThread", > group="main") > THREAD START (obj=500003ae, id = 200015, name="metrics-meter-tick-thread-1", > group="main") > THREAD START (obj=500003ae, id = 200016, name="metrics-meter-tick-thread-2", > group="main") > THREAD START (obj=500003ae, id = 200017, name="request-expiration-task", > group="main") > THREAD START (obj=500003ae, id = 200018, name="request-expiration-task", > group="main") > THREAD START (obj=500003ae, id = 200019, name="kafka-request-handler-0", > group="main") > THREAD START (obj=500003ae, id = 200020, name="kafka-request-handler-1", > group="main") > THREAD START (obj=500003ae, id = 200021, name="Thread-6", group="main") > THREAD START (obj=500003ae, id = 200022, name="Thread-7", group="main") > THREAD START (obj=50000899, id = 200023, name="ReplicaFetcherThread-0-2 on > broker 1, ", group="main") > THREAD START (obj=50000899, id = 200024, name="ReplicaFetcherThread-0-3 on > broker 1, ", group="main") > THREAD START (obj=50000899, id = 200025, name="ReplicaFetcherThread-0-0 on > broker 1, ", group="main") > THREAD START (obj=50000899, id = 200026, name="ReplicaFetcherThread-0-1 on > broker 1, ", group="main") > THREAD START (obj=500003ae, id = 200028, name="SIGINT handler", > group="system") > THREAD START (obj=500003ae, id = 200029, name="Thread-5", group="main") > THREAD START (obj=5000074b, id = 200030, name="Thread-1", group="main") > THREAD START (obj=5000074b, id = 200031, name="Thread-0", group="main") > THREAD END (id = 200031) > THREAD END (id = 200029) > THREAD END (id = 200020) > THREAD END (id = 200019) > THREAD END (id = 200008) > THREAD END (id = 200021) > THREAD END (id = 200007) > THREAD END (id = 200022) > THREAD END (id = 200018) > THREAD END (id = 200017) > THREAD END (id = 200012) > THREAD END (id = 200013) > THREAD END (id = 200014) > THREAD END (id = 200025) > THREAD END (id = 200023) > THREAD END (id = 200026) > THREAD END (id = 200024) > THREAD END (id = 200011) > THREAD END (id = 200009) > THREAD END (id = 200010) > THREAD END (id = 200030) > THREAD END (id = 200028) > TRACE 301281: > sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown > line) > sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228) > sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81) > sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87) > sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98) > > sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:218) > sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > > java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) > kafka.utils.Utils$.read(Utils.scala:453) > > kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) > kafka.network.Receive$class.readCompletely(Transmission.scala:56) > > kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29) > kafka.network.BlockingChannel.receive(BlockingChannel.scala:92) > kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:67) > kafka.consumer.SimpleConsumer.sendRequest(SimpleConsumer.scala:65) > kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:95) > > kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:71) > kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50) > TRACE 300427: > java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown > line) > > java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398) > java.net.ServerSocket.implAccept(ServerSocket.java:522) > java.net.ServerSocket.accept(ServerSocket.java:490) > > sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387) > > sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359) > java.lang.Thread.run(Thread.java:722) > TRACE 300963: > sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown > line) > sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228) > sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81) > sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87) > sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98) > kafka.network.Processor.run(SocketServer.scala:202) > java.lang.Thread.run(Thread.java:722) > TRACE 300558: > java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown > line) > > java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398) > java.net.ServerSocket.implAccept(ServerSocket.java:522) > java.net.ServerSocket.accept(ServerSocket.java:490) > > sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52) > > sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387) > > sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359) > java.lang.Thread.run(Thread.java:722) > TRACE 300978: > sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown > line) > sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228) > sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81) > sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87) > sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98) > kafka.network.Acceptor.run(SocketServer.scala:142) > java.lang.Thread.run(Thread.java:722) > TRACE 301024: > sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown > line) > sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228) > sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81) > sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87) > sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98) > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1134) > TRACE 301830: > sun.misc.Unsafe.park(Unsafe.java:Unknown line) > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082) > java.util.concurrent.DelayQueue.take(DelayQueue.java:220) > > kafka.server.RequestPurgatory$ExpiredRequestReaper.pollExpired(RequestPurgatory.scala:282) > > kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:235) > java.lang.Thread.run(Thread.java:722) > CPU SAMPLES BEGIN (total = 1878003) Thu Aug 30 22:26:05 2012 > rank self accum count trace method > 1 36.22% 36.22% 680139 301281 sun.nio.ch.EPollArrayWrapper.epollWait > 2 18.15% 54.36% 340770 300427 java.net.PlainSocketImpl.socketAccept > 3 18.13% 72.49% 340458 300963 sun.nio.ch.EPollArrayWrapper.epollWait > 4 9.07% 81.56% 170382 300558 java.net.PlainSocketImpl.socketAccept > 5 9.07% 90.63% 170340 300978 sun.nio.ch.EPollArrayWrapper.epollWait > 6 9.06% 99.69% 170139 301024 sun.nio.ch.EPollArrayWrapper.epollWait > 7 0.18% 99.88% 3429 301830 sun.misc.Unsafe.park > CPU SAMPLES END -- This message was sent by Atlassian JIRA (v6.3.4#6332)