[ https://issues.apache.org/jira/browse/FLINK-20945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576101#comment-17576101 ]
Robert Metzger commented on FLINK-20945:
----------------------------------------

I assume this error happens when you are writing to a columnar format such as Parquet, where data needs to be buffered per partition. If the number of active partitions is high, memory consumption goes up accordingly. I do believe this is still a problem.

> flink hive insert heap out of memory
> ------------------------------------
>
>                 Key: FLINK-20945
>                 URL: https://issues.apache.org/jira/browse/FLINK-20945
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Ecosystem
>         Environment: flink 1.12.0
>                      hive-exec 2.3.5
>            Reporter: Bruce GAO
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> When using Flink SQL to insert into Hive from Kafka, a heap out-of-memory error occurs randomly.
> The Hive table is partitioned by year/month/day/hour, and the maximum heap space needed appears to correspond to the number of active partitions (which grows when Kafka messages arrive out of order or delayed). This means that as the number of active partitions increases, the required heap space also increases, which may cause the heap out of memory.
> When writing a record, is it possible to take overall heap usage into account in checkBlockSizeReached, or to use some other method to avoid the OOM?
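As a rough illustration of the buffering behaviour described in the comment above, the sketch below (hypothetical class and method names, not the actual Flink or Parquet writer code) shows why peak heap usage scales with the number of active partitions: each partition keeps its own in-memory block, and the flush check only looks at that one buffer's size, never at total heap usage, so many late/out-of-order partitions mean many resident buffers at once.

{code:java}
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal sketch (hypothetical names, not Flink's writer code) of per-partition
 * buffering in a columnar sink: each partition holds an in-memory block that is
 * flushed only when it reaches the configured block size, so peak heap is
 * roughly activePartitions * blockSize.
 */
public class PartitionedColumnarWriterSketch {

    /** Per-partition buffer that mimics a Parquet row group held in memory. */
    static final class PartitionBuffer {
        final StringBuilder rows = new StringBuilder();

        long sizeInBytes() {
            return rows.length();
        }

        void add(String row) {
            rows.append(row).append('\n');
        }

        void flushAndReset() {
            // A real writer would encode and write the row group to the file here.
            rows.setLength(0);
        }
    }

    private final Map<String, PartitionBuffer> buffers = new HashMap<>();
    private final long blockSize; // analogous to the Parquet row-group/block size

    PartitionedColumnarWriterSketch(long blockSize) {
        this.blockSize = blockSize;
    }

    void write(String partition, String row) {
        PartitionBuffer buffer = buffers.computeIfAbsent(partition, p -> new PartitionBuffer());
        buffer.add(row);
        // The flush decision considers only this partition's buffer, not total heap
        // usage; this mirrors the per-writer checkBlockSizeReached behaviour the
        // issue asks about.
        if (buffer.sizeInBytes() >= blockSize) {
            buffer.flushAndReset();
        }
    }

    /** Rough upper bound on buffered bytes: one block per active partition. */
    long worstCaseBufferedBytes() {
        return (long) buffers.size() * blockSize;
    }

    public static void main(String[] args) {
        PartitionedColumnarWriterSketch writer =
                new PartitionedColumnarWriterSketch(128L * 1024 * 1024);
        // Out-of-order/late events touch several hourly partitions at once,
        // so several buffers stay resident simultaneously.
        writer.write("dt=2021-01-10/hr=01", "row-a");
        writer.write("dt=2021-01-10/hr=02", "row-b");
        writer.write("dt=2021-01-09/hr=23", "row-c");
        System.out.println("Active partitions: " + writer.buffers.size()
                + ", worst-case buffered bytes: " + writer.worstCaseBufferedBytes());
    }
}
{code}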