[ https://issues.apache.org/jira/browse/FLINK-20945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576101#comment-17576101 ]

Robert Metzger commented on FLINK-20945:
----------------------------------------

I assume that this error happens when you are writing in a columnar data format 
like Parquet, where data needs to be buffered for each partition.
If the number of partitions is high, the memory consumption goes up.
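A rough back-of-the-envelope model of that buffering behavior (this is an illustrative sketch, not Flink or Parquet code; the 128 MiB figure is Parquet's default row-group size, and the assumption is one in-flight row group per open partition writer):

```python
# Columnar sinks such as Parquet buffer a full row group per open
# partition before flushing, so peak heap usage grows roughly
# linearly with the number of simultaneously active partitions.
ROW_GROUP_BYTES = 128 * 1024 * 1024  # Parquet's default row-group size


def estimated_buffer_bytes(active_partitions: int) -> int:
    # Assumption: each open partition writer holds one in-flight row group.
    return active_partitions * ROW_GROUP_BYTES


# With year/month/day/hour partitioning, late or out-of-order Kafka
# events can keep many hourly partitions open at the same time:
for parts in (1, 24, 96):
    mib = estimated_buffer_bytes(parts) // (1024 * 1024)
    print(f"{parts} active partitions -> ~{mib} MiB buffered")
```

Under these assumptions, 96 concurrently active hourly partitions already account for roughly 12 GiB of buffer space, which matches the reported pattern of OOMs appearing as the active-partition count grows.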

I do believe this is still a problem.

> flink hive insert heap out of memory
> ------------------------------------
>
>                 Key: FLINK-20945
>                 URL: https://issues.apache.org/jira/browse/FLINK-20945
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Ecosystem
>         Environment: flink 1.12.0 
> hive-exec 2.3.5
>            Reporter: Bruce GAO
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> When using Flink SQL to insert into Hive from Kafka, a heap out-of-memory 
> error occurs randomly.
> The Hive table uses year/month/day/hour as partitions, and the maximum heap 
> space needed appears to correspond to the number of active partitions (which 
> grows when Kafka messages arrive out of order or delayed). This means that 
> as the partition count increases, the heap space needed also increases, 
> which can cause the heap out of memory.
> When writing a record, would it be possible to take the whole heap space 
> usage into account in checkBlockSizeReached, or to use some other method to 
> avoid the OOM?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)