[ https://issues.apache.org/jira/browse/DRILL-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643926#comment-17643926 ]
ASF GitHub Bot commented on DRILL-8366:
---------------------------------------

jnturton opened a new pull request, #2716:
URL: https://github.com/apache/drill/pull/2716

   # [DRILL-8366](https://issues.apache.org/jira/browse/DRILL-8366): Late release of compressor memory in the Parquet writer

   ## Description

   The Parquet writer waits until the end of the entire write before releasing its compression codec factory. The factory in turn releases compressors, which release the direct memory buffers used during compression. This deferred release leads to a build-up of direct memory use and can cause large write jobs to fail. The Parquet writer can instead release the above-mentioned resources each time a file or row group is flushed.

   ## Documentation
   N/A

   ## Testing
   Manually confirm the release of allocated compression buffers after each flush in the debug log output. Manually monitor memory usage during a big Parquet write job.

> Late release of compressor memory in the Parquet writer
> -------------------------------------------------------
>
>                 Key: DRILL-8366
>                 URL: https://issues.apache.org/jira/browse/DRILL-8366
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.20.2
>            Reporter: James Turton
>            Assignee: James Turton
>            Priority: Minor
>             Fix For: 1.20.3
>
>
> The Parquet writer waits until the end of the entire write before releasing
> its compression codec factory. The factory in turn releases compressors, which
> release the direct memory buffers used during compression. This deferred release
> leads to a build-up of direct memory use and can cause large write jobs to fail.
> The Parquet writer can instead release the above-mentioned resources each time
> a file or row group is flushed.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
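
The effect described in the issue can be illustrated with a minimal, self-contained Java sketch. It is not the code from the pull request and uses no Drill or Parquet classes: `CodecFactory`, `Writer`, `flushRowGroup` and `peakBytes` are hypothetical names chosen for illustration, and the assumption that each flush allocates one fixed-size direct buffer is a simplification. The sketch only compares the peak amount of buffer memory held when release is deferred to the end of the write against release after every flush.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class PerFlushReleaseSketch {

  /** Hypothetical stand-in for a codec factory whose compressors hold direct buffers. */
  static final class CodecFactory {
    private final List<ByteBuffer> liveBuffers = new ArrayList<>();

    /** Allocates a direct buffer and keeps a reference to it until release() is called. */
    ByteBuffer newCompressorBuffer(int size) {
      ByteBuffer buffer = ByteBuffer.allocateDirect(size);
      liveBuffers.add(buffer);
      return buffer;
    }

    /** Total capacity of the buffers this factory still references. */
    long liveBytes() {
      long total = 0;
      for (ByteBuffer buffer : liveBuffers) {
        total += buffer.capacity();
      }
      return total;
    }

    /**
     * Drops all references. In this sketch that only makes the buffers eligible
     * for garbage collection; a real allocator would free them explicitly.
     */
    void release() {
      liveBuffers.clear();
    }
  }

  /** Hypothetical writer that can release compressor memory per flush or only at close. */
  static final class Writer {
    private final CodecFactory factory = new CodecFactory();
    private final boolean releasePerFlush;
    private long peakLiveBytes;

    Writer(boolean releasePerFlush) {
      this.releasePerFlush = releasePerFlush;
    }

    /** Simulates compressing and flushing one row group (assumed: one buffer per flush). */
    void flushRowGroup(int bufferSize) {
      factory.newCompressorBuffer(bufferSize);
      peakLiveBytes = Math.max(peakLiveBytes, factory.liveBytes());
      if (releasePerFlush) {
        factory.release();
      }
    }

    /** End of the entire write: whatever is still held is released here. */
    void close() {
      factory.release();
    }

    long peakLiveBytes() {
      return peakLiveBytes;
    }
  }

  /** Runs a simulated write and returns the peak number of buffer bytes held at once. */
  static long peakBytes(boolean releasePerFlush, int flushes, int bufferSize) {
    Writer writer = new Writer(releasePerFlush);
    for (int i = 0; i < flushes; i++) {
      writer.flushRowGroup(bufferSize);
    }
    writer.close();
    return writer.peakLiveBytes();
  }

  public static void main(String[] args) {
    System.out.println("deferred release peak:  " + peakBytes(false, 8, 1024));
    System.out.println("per-flush release peak: " + peakBytes(true, 8, 1024));
  }
}
```

With 8 simulated flushes of 1024 bytes each, the deferred variant holds 8192 bytes at its peak, while the per-flush variant never holds more than 1024. In the sketch, the deferred peak grows linearly with the number of flushes, which is consistent with the issue's report that large write jobs are the ones that fail.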