[
https://issues.apache.org/jira/browse/KAFKA-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399370#comment-15399370
]
Brice Dutheil commented on KAFKA-3990:
--------------------------------------
Hi all, sorry for the delayed response, I have been busy with other stuff.
Yes, the broker is 0.9.0.1 as well. It runs in a docker container too.
I attached the broker logs. We restarted the single-instance cluster (~ 13:20),
and a few minutes later (~ 13:34) we ran the application; the app faced the
same problem with this big message.
This got me curious: I had only looked at server.log, but controller.log
shows an OOME as well, right at broker startup:
{code}
[2016-07-29 13:20:34,366] WARN [Controller-1-to-broker-1-send-thread], Controller 1 epoch 1 fails to send request {controller_id=1,controller_epoch=1,partition_states=[],live_brokers=[{id=1,end_points=[{port=9091,host=dockerhost,security_protocol_type=0}]}]} to broker Node(1, dockerhost, 9091). Reconnecting to broker. (kafka.controller.RequestSendThread)
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:93)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:153)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:134)
at org.apache.kafka.common.network.Selector.poll(Selector.java:286)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:256)
at kafka.utils.NetworkClientBlockingOps$.recurse$1(NetworkClientBlockingOps.scala:128)
at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntilFound$extension(NetworkClientBlockingOps.scala:139)
at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:180)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:171)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
{code}
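For what it's worth, the allocation at NetworkReceive.java:93 trusts a 4-byte size prefix read off the wire. Below is a minimal sketch of that pattern (illustrative only, not the actual Kafka code; class and method names are made up), including the kind of upper-bound check that would turn a bogus prefix into an error instead of an OOME:

```java
import java.nio.ByteBuffer;

// Illustrative sketch of size-prefixed framing; not the actual Kafka code.
public class SizePrefixedReceive {

    // Reads the 4-byte size prefix from the header and allocates the payload
    // buffer. A negative maxSize disables the bound check, mirroring the
    // maxSize=-1 observed in this report.
    static ByteBuffer allocateFor(ByteBuffer sizeHeader, int maxSize) {
        int receiveSize = sizeHeader.getInt();
        if (receiveSize < 0)
            throw new IllegalStateException("Negative receive size: " + receiveSize);
        if (maxSize >= 0 && receiveSize > maxSize)
            throw new IllegalStateException(
                "Receive size " + receiveSize + " exceeds cap " + maxSize);
        return ByteBuffer.allocate(receiveSize);
    }

    public static void main(String[] args) {
        // The 1.2 GB size seen in this issue, encoded as a 4-byte prefix.
        ByteBuffer header = ByteBuffer.allocate(4).putInt(1213486160);
        header.flip();
        try {
            allocateFor(header, 10 * 1024 * 1024); // 10 MB cap: rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With maxSize=-1 the same call would attempt ByteBuffer.allocate(1213486160) and, on a small heap, fail with exactly the OutOfMemoryError above.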
> Kafka New Producer may raise an OutOfMemoryError
> ------------------------------------------------
>
> Key: KAFKA-3990
> URL: https://issues.apache.org/jira/browse/KAFKA-3990
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 0.9.0.1
> Environment: Docker, Base image : CentOS
> Java 8u77
> Reporter: Brice Dutheil
> Attachments: app-producer-config.log, kafka-broker-logs.zip
>
>
> We are regularly seeing OOME errors on a Kafka producer; we first saw:
> {code}
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) ~[na:1.8.0_77]
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_77]
> at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:93) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:153) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:134) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.common.network.Selector.poll(Selector.java:286) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:256) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) ~[kafka-clients-0.9.0.1.jar:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_77]
> {code}
> This line refers to a buffer allocation, {{ByteBuffer.allocate(receiveSize)}}
> (see
> https://github.com/apache/kafka/blob/0.9.0.1/clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java#L93).
> Usually the app runs fine within a 200/400 MB heap and a 64 MB Metaspace, and
> we are producing small messages, 500 B at most.
> Also, the error doesn't appear in the development environment. In order to
> identify the issue we tweaked the code to give us the actual allocation size,
> and we got this stack:
> {code}
> 09:55:49.484 [auth] [kafka-producer-network-thread | producer-1] WARN o.a.k.c.n.NetworkReceive HEAP-ISSUE: constructor : Integer='-1', String='-1'
> 09:55:49.485 [auth] [kafka-producer-network-thread | producer-1] WARN o.a.k.c.n.NetworkReceive HEAP-ISSUE: method : NetworkReceive.readFromReadableChannel.receiveSize=1213486160
> java.lang.OutOfMemoryError: Java heap space
> Dumping heap to /tmp/tomcat.hprof ...
> Heap dump file created [69583827 bytes in 0.365 secs]
> 09:55:50.324 [auth] [kafka-producer-network-thread | producer-1] ERROR o.a.k.c.utils.KafkaThread Uncaught exception in kafka-producer-network-thread | producer-1:
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) ~[na:1.8.0_77]
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_77]
> at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:93) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:153) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:134) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.common.network.Selector.poll(Selector.java:286) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:256) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) ~[kafka-clients-0.9.0.1.jar:na]
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) ~[kafka-clients-0.9.0.1.jar:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_77]
> {code}
> Notice the size to allocate, {{1213486160}}, i.e. ~1.2 GB. I'm not yet sure
> how this size is initialised.
> Notice as well that every time this OOME appears, the {{NetworkReceive}}
> constructor at
> https://github.com/apache/kafka/blob/0.9.0.1/clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java#L49
> receives the parameters {{maxSize=-1}}, {{source="-1"}}.
> We may have missed some configuration in our setup, but the Kafka clients
> shouldn't raise an OOME. For reference, the producer is initialised with:
> {code}
> Properties props = new Properties();
> props.put(BOOTSTRAP_SERVERS_CONFIG, properties.bootstrapServers);
> props.put(ACKS_CONFIG, "ONE");
> props.put(RETRIES_CONFIG, 0);
> props.put(BATCH_SIZE_CONFIG, 16384);
> props.put(LINGER_MS_CONFIG, 0);
> props.put(BUFFER_MEMORY_CONFIG, 33554432);
> props.put(REQUEST_TIMEOUT_MS_CONFIG, 1000);
> props.put(MAX_BLOCK_MS_CONFIG, 1000);
> props.put(KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
> props.put(VALUE_SERIALIZER_CLASS_CONFIG, JSONSerializer.class.getName());
> {code}
> For reference, while googling for the issue we found a similar stack trace
> involving the same class, with the new consumer API, in the ATLAS project:
> https://issues.apache.org/jira/browse/ATLAS-665
> If anything is missing please reach out.
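One observation on the {{receiveSize=1213486160}} reported in the description above: interpreted as four big-endian bytes it is 0x48 0x54 0x54 0x50, i.e. the ASCII string "HTTP". This is only a guess at the cause, but it would be consistent with the client reading the first bytes of a non-Kafka (e.g. HTTP) response as a size prefix:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DecodeReceiveSize {
    public static void main(String[] args) {
        int receiveSize = 1213486160; // the value from the OOME log above
        // Split the int into its four big-endian bytes and render them as ASCII.
        byte[] bytes = ByteBuffer.allocate(4).putInt(receiveSize).array();
        System.out.println(new String(bytes, StandardCharsets.US_ASCII));
        // prints "HTTP"
    }
}
```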
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)