[ https://issues.apache.org/jira/browse/PARQUET-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vitalii Diravka updated PARQUET-1006:
-------------------------------------
    Affects Version/s: 1.12.0

> ColumnChunkPageWriter uses only heap memory.
> --------------------------------------------
>
>                 Key: PARQUET-1006
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1006
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.8.0, 1.12.0
>            Reporter: Vitalii Diravka
>            Assignee: Vitalii Diravka
>            Priority: Major
>
> After PARQUET-160 was resolved, ColumnChunkPageWriter started using
> ConcatenatingByteArrayCollector, which collects all data in a List of
> byte[] before the page is written. There is no way to use direct memory
> for allocating these buffers. A ByteBufferAllocator is present in the
> [ColumnChunkPageWriter|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageWriteStore.java#L73]
> class, but it is never used.
> Using Java heap space can, in some cases, cause OOM exceptions or GC
> overhead.
> ByteBufferAllocator should be used in the ConcatenatingByteArrayCollector or
> OutputStream classes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
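
The proposal in the description (route page buffering through an allocator instead of a List of byte[]) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the parquet-mr implementation or the eventual fix: BufferAllocator is a local stand-in modelled on ByteBufferAllocator, and DirectBufferAllocator, ConcatenatingByteBufferCollector and CollectorDemo are hypothetical names invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in modelled on parquet-mr's ByteBufferAllocator.
interface BufferAllocator {
    ByteBuffer allocate(int size);
    void release(ByteBuffer buffer);
    boolean isDirect();
}

// Allocates off-heap buffers. release() is a no-op in this sketch;
// a pooled allocator would recycle the buffer here.
final class DirectBufferAllocator implements BufferAllocator {
    public ByteBuffer allocate(int size) { return ByteBuffer.allocateDirect(size); }
    public void release(ByteBuffer buffer) { }
    public boolean isDirect() { return true; }
}

// Hypothetical collector: keeps page bytes in allocator-provided buffers
// rather than in a List of byte[] on the Java heap.
final class ConcatenatingByteBufferCollector implements AutoCloseable {
    private final BufferAllocator allocator;
    private final List<ByteBuffer> slabs = new ArrayList<>();
    private long size;

    ConcatenatingByteBufferCollector(BufferAllocator allocator) {
        this.allocator = allocator;
    }

    // Copies the bytes into a freshly allocated buffer and remembers it.
    void collect(byte[] bytes) {
        ByteBuffer slab = allocator.allocate(bytes.length);
        slab.put(bytes);
        slab.flip(); // switch the buffer from writing to reading
        slabs.add(slab);
        size += bytes.length;
    }

    long size() { return size; }

    // Writes every collected buffer, in insertion order, to the stream.
    void writeAllTo(OutputStream out) throws IOException {
        WritableByteChannel channel = Channels.newChannel(out);
        for (ByteBuffer slab : slabs) {
            ByteBuffer view = slab.duplicate(); // leave the slab's position untouched
            while (view.hasRemaining()) {
                channel.write(view);
            }
        }
    }

    // Hands every buffer back to the allocator and resets the collector.
    @Override
    public void close() {
        for (ByteBuffer slab : slabs) {
            allocator.release(slab);
        }
        slabs.clear();
        size = 0;
    }
}

class CollectorDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ConcatenatingByteBufferCollector collector =
                 new ConcatenatingByteBufferCollector(new DirectBufferAllocator())) {
            collector.collect(new byte[] {1, 2, 3});
            collector.collect(new byte[] {4, 5});
            collector.writeAllTo(out);
        }
        System.out.println(out.size() + " bytes written");
    }
}
```

Because the collector only sees the allocator interface, a caller can choose heap or direct memory without changing the writer. Whether the change belongs in the collector or in the OutputStream classes is left open, as in the description.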