[ 
https://issues.apache.org/jira/browse/KAFKA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163367#comment-15163367
 ] 

Cosmin Marginean commented on KAFKA-493:
----------------------------------------

Does anyone have any updates on this or would you require any additional 
information to diagnose this?
I can confirm that a cluster of 2 Kafka nodes running for the past few weeks 
has been constantly at over 70% CPU utilisation. with a very small amount of 
messages (a few thousand a day).
What's worse, the CPU still stays very high when there's no activity on any of 
the topics.

Is there any other JIRA ticket that has followed up on this separately?

FYI, using 0.9.0.0

> High CPU usage on inactive server
> ---------------------------------
>
>                 Key: KAFKA-493
>                 URL: https://issues.apache.org/jira/browse/KAFKA-493
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.0
>            Reporter: Jay Kreps
>             Fix For: 0.10.1.0
>
>         Attachments: Kafka-2014-11-10.snapshot.zip, Kafka-sampling1.zip, 
> Kafka-sampling2.zip, Kafka-sampling3.zip, Kafka-trace1.zip, Kafka-trace2.zip, 
> Kafka-trace3.zip, backtraces.txt, stacktrace.txt
>
>
> > I've been playing with the 0.8 branch of Kafka and noticed that idle CPU 
> > usage is fairly high (13% of a 
> > core). Is that to be expected? I did look at the stack, but didn't see 
> > anything obvious. A background 
> > task?
> > I wanted to mention how I am getting into this state. I've set up two 
> > machines with the latest 0.8 
> > code base and am using a replication factor of 2. On starting the brokers 
> > there is no idle CPU activity. 
> > Then I run a test that essential does 10k publish operations followed by 
> > immediate consume operations 
> > (I was measuring latency). Once this has run the kafka nodes seem to 
> > consistently be consuming CPU 
> > essentially forever.
> hprof results:
> THREAD START (obj=500003ae, id = 200004, name="RMI TCP Accept-0", 
> group="system")
> THREAD START (obj=500003ae, id = 200005, name="RMI TCP Accept-9999", 
> group="system")
> THREAD START (obj=500003ae, id = 200006, name="RMI TCP Accept-0", 
> group="system")
> THREAD START (obj=500003ae, id = 200001, name="main", group="main")
> THREAD START (obj=500003ae, id = 200007, name="Thread-2", group="main")
> THREAD START (obj=500003ae, id = 200008, name="Thread-3", group="main")
> THREAD START (obj=500003ae, id = 200009, name="kafka-processor-9092-0", 
> group="main")
> THREAD START (obj=500003ae, id = 200010, name="kafka-processor-9092-1", 
> group="main")
> THREAD START (obj=500003ae, id = 200011, name="kafka-acceptor", group="main")
> THREAD START (obj=5000074b, id = 200012, 
> name="ZkClient-EventThread-20-localhost:2181", group="main")
> THREAD START (obj=5000076e, id = 200014, name="main-SendThread()", 
> group="main")
> THREAD START (obj=5000076d, id = 200013, name="main-EventThread", 
> group="main")
> THREAD START (obj=500003ae, id = 200015, name="metrics-meter-tick-thread-1", 
> group="main")
> THREAD START (obj=500003ae, id = 200016, name="metrics-meter-tick-thread-2", 
> group="main")
> THREAD START (obj=500003ae, id = 200017, name="request-expiration-task", 
> group="main")
> THREAD START (obj=500003ae, id = 200018, name="request-expiration-task", 
> group="main")
> THREAD START (obj=500003ae, id = 200019, name="kafka-request-handler-0", 
> group="main")
> THREAD START (obj=500003ae, id = 200020, name="kafka-request-handler-1", 
> group="main")
> THREAD START (obj=500003ae, id = 200021, name="Thread-6", group="main")
> THREAD START (obj=500003ae, id = 200022, name="Thread-7", group="main")
> THREAD START (obj=50000899, id = 200023, name="ReplicaFetcherThread-0-2 on 
> broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200024, name="ReplicaFetcherThread-0-3 on 
> broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200025, name="ReplicaFetcherThread-0-0 on 
> broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200026, name="ReplicaFetcherThread-0-1 on 
> broker 1, ", group="main")
> THREAD START (obj=500003ae, id = 200028, name="SIGINT handler", 
> group="system")
> THREAD START (obj=500003ae, id = 200029, name="Thread-5", group="main")
> THREAD START (obj=5000074b, id = 200030, name="Thread-1", group="main")
> THREAD START (obj=5000074b, id = 200031, name="Thread-0", group="main")
> THREAD END (id = 200031)
> THREAD END (id = 200029)
> THREAD END (id = 200020)
> THREAD END (id = 200019)
> THREAD END (id = 200008)
> THREAD END (id = 200021)
> THREAD END (id = 200007)
> THREAD END (id = 200022)
> THREAD END (id = 200018)
> THREAD END (id = 200017)
> THREAD END (id = 200012)
> THREAD END (id = 200013)
> THREAD END (id = 200014)
> THREAD END (id = 200025)
> THREAD END (id = 200023)
> THREAD END (id = 200026)
> THREAD END (id = 200024)
> THREAD END (id = 200011)
> THREAD END (id = 200009)
> THREAD END (id = 200010)
> THREAD END (id = 200030)
> THREAD END (id = 200028)
> TRACE 301281:
>         sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown 
> line)
>         sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
>         sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
>         sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:218)
>         sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
>         
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
>         kafka.utils.Utils$.read(Utils.scala:453)
>         
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>         kafka.network.Receive$class.readCompletely(Transmission.scala:56)
>         
> kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
>         kafka.network.BlockingChannel.receive(BlockingChannel.scala:92)
>         kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:67)
>         kafka.consumer.SimpleConsumer.sendRequest(SimpleConsumer.scala:65)
>         kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:95)
>         
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:71)
>         kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
> TRACE 300427:
>         java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown 
> line)
>         
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>         java.net.ServerSocket.implAccept(ServerSocket.java:522)
>         java.net.ServerSocket.accept(ServerSocket.java:490)
>         
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
>         
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
>         java.lang.Thread.run(Thread.java:722)
> TRACE 300963:
>         sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown 
> line)
>         sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
>         sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
>         sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         kafka.network.Processor.run(SocketServer.scala:202)
>         java.lang.Thread.run(Thread.java:722)
> TRACE 300558:
>         java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown 
> line)
>         
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>         java.net.ServerSocket.implAccept(ServerSocket.java:522)
>         java.net.ServerSocket.accept(ServerSocket.java:490)
>         
> sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
>         
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
>         
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
>         java.lang.Thread.run(Thread.java:722)
> TRACE 300978:
>         sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown 
> line)
>         sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
>         sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
>         sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         kafka.network.Acceptor.run(SocketServer.scala:142)
>         java.lang.Thread.run(Thread.java:722)
> TRACE 301024:
>         sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown 
> line)
>         sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
>         sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
>         sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1134)
> TRACE 301830:
>         sun.misc.Unsafe.park(Unsafe.java:Unknown line)
>         java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>         
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>         java.util.concurrent.DelayQueue.take(DelayQueue.java:220)
>         
> kafka.server.RequestPurgatory$ExpiredRequestReaper.pollExpired(RequestPurgatory.scala:282)
>         
> kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:235)
>         java.lang.Thread.run(Thread.java:722)
> CPU SAMPLES BEGIN (total = 1878003) Thu Aug 30 22:26:05 2012
> rank   self  accum   count trace method
>    1 36.22% 36.22%  680139 301281 sun.nio.ch.EPollArrayWrapper.epollWait
>    2 18.15% 54.36%  340770 300427 java.net.PlainSocketImpl.socketAccept
>    3 18.13% 72.49%  340458 300963 sun.nio.ch.EPollArrayWrapper.epollWait
>    4  9.07% 81.56%  170382 300558 java.net.PlainSocketImpl.socketAccept
>    5  9.07% 90.63%  170340 300978 sun.nio.ch.EPollArrayWrapper.epollWait
>    6  9.06% 99.69%  170139 301024 sun.nio.ch.EPollArrayWrapper.epollWait
>    7  0.18% 99.88%    3429 301830 sun.misc.Unsafe.park
> CPU SAMPLES END



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to