[ https://issues.apache.org/jira/browse/FLINK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022285#comment-16022285 ]

Tzu-Li (Gordon) Tai commented on FLINK-6613:
--------------------------------------------

+1 to what [~rmetzger] mentioned.

The Kafka consumer (0.9+) will hold at most 2 {{ConsumerRecords}} at any time: one in the {{Handover}} waiting to be processed, and another waiting to be added to the {{Handover}}.

Regarding what you expect:

"2) Read next batch of messages only when previous batch processed"
--> This is already what happens, since the {{Handover}} is only a size-1 buffer. Also, I don't think it would solve the root cause of what's causing your OOM.

"1) KafkaConsumerThread read messages with total size ~1G."
--> As Robert mentioned, you should be able to configure the Kafka client directly to limit that, which will likely solve your problem.

> OOM during reading big messages from Kafka
> ------------------------------------------
>
>                 Key: FLINK-6613
>                 URL: https://issues.apache.org/jira/browse/FLINK-6613
>             Project: Flink
>          Issue Type: Bug
>          Components: Kafka Connector
>    Affects Versions: 1.2.0
>            Reporter: Andrey
>
> Steps to reproduce:
> 1) Setup Task manager with 2G heap size
> 2) Setup job that reads messages from Kafka 10 (i.e. FlinkKafkaConsumer010)
> 3) Send 3300 messages each 635Kb. So total size is ~2G
> 4) OOM in task manager.
> According to heap dump:
> 1) KafkaConsumerThread read messages with total size ~1G.
> 2) Pass them to the next operator using org.apache.flink.streaming.connectors.kafka.internal.Handover
> 3) Then began to read another batch of messages.
> 4) Task manager was able to read next batch of ~500Mb messages until OOM.
> Expected:
> 1) Either have constraint like "number of messages in-flight" OR
> 2) Read next batch of messages only when previous batch processed OR
> 3) Any other option which will solve OOM.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
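
For illustration, the client-side limit suggested in the comment might look roughly like the sketch below. The comment does not name specific settings, so the choice of {{max.partition.fetch.bytes}} and {{max.poll.records}} (both standard Kafka consumer properties, the latter available from the 0.10 client) is an assumption, as are the class name, method name, broker address, group id, and the concrete values. Which settings actually bound memory depends on the number of partitions per consumer and on the client version.

```java
import java.util.Properties;

// Hypothetical helper, not part of Flink or Kafka: builds consumer properties
// that cap how much data the Kafka client hands back per fetch/poll.
public class KafkaFetchLimits {

    public static Properties boundedFetchProperties(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Upper bound on bytes fetched per partition per request.
        // Assumed value: 1 MiB, which is still larger than the 635 KB messages in the report.
        props.setProperty("max.partition.fetch.bytes", String.valueOf(1024 * 1024));
        // Upper bound on records returned by a single poll() (Kafka client 0.10+).
        // Assumed value; tune against message size and available heap.
        props.setProperty("max.poll.records", "100");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker address and group id.
        Properties props = boundedFetchProperties("localhost:9092", "flink-6613-demo");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a Flink job, a {{Properties}} object like this would be the one passed to the {{FlinkKafkaConsumer010}} constructor, since the connector forwards these settings to the underlying Kafka client.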