Hi All,
We have a cluster with 6 nodes. We hit a direct buffer memory OOM in our
prod environment. The cluster uses Kafka's default JVM configuration.
The error looks like this:
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
at sun.nio.ch.IOUtil.read(IOUtil.java:195)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at org.apache.kafka.common.network.PlaintextTransportLayer.read(PlaintextTransportLayer.java:108)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:97)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:153)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:134)
at org.apache.kafka.common.network.Selector.poll(Selector.java:286)
at kafka.network.Processor.run(SocketServer.scala:413)
at java.lang.Thread.run(Thread.java:745)
There are two suspicious things.
1. About 100 million messages were sent to one topic within one minute at that
time, so I wonder whether this may be the cause.
2. We have over 4000 sockets in CLOSE_WAIT state on our broker.
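In case it helps with the analysis, here is a minimal sketch of how direct buffer usage can be read inside a JVM through the standard BufferPoolMXBean. The class name DirectBufferUsage and its helper method are only for illustration and are not part of Kafka. The program reports on its own JVM only. For a running broker, I assume the same numbers can be read over JMX from the java.nio:type=BufferPool,name=direct MBean, if JMX is enabled.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Illustrative helper, not part of Kafka.
class DirectBufferUsage {

    // Returns the bytes currently used by the JVM's "direct" buffer pool,
    // or -1 if no such pool is reported.
    static long directMemoryUsed() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();

        // Allocate 1 MiB off-heap, the same kind of memory the
        // stack trace above failed to reserve.
        ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024);

        long after = directMemoryUsed();
        System.out.println("direct pool used before: " + before + " bytes");
        System.out.println("direct pool used after:  " + after + " bytes");
        System.out.println("buffer capacity:         " + buf.capacity() + " bytes");
    }
}
```

Watching this value on the broker over time should show whether direct memory grows with the traffic burst or with the number of open connections.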
Can someone help analyse this? Thanks in advance.