[jira] [Resolved] (IMPALA-6383) Memory from previous row groups can accumulate in Parquet scanner

Tim Armstrong (JIRA) Mon, 15 Jan 2018 21:02:16 -0800

     [ https://issues.apache.org/jira/browse/IMPALA-6383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Tim Armstrong resolved IMPALA-6383.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.12.0

IMPALA-6383: free memory after skipping parquet row groups

Before this patch, resources were only flushed after breaking out of
NextRowGroup(). This is a problem because resources can be allocated
for skipped row groups (e.g. for reading dictionaries).
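
The control flow described above can be sketched as follows. This is a
simplified illustration, not the actual Impala code: RowGroup, SketchScanner,
held_bytes() and peak_bytes() are hypothetical names, and only the names
NextRowGroup() and InitColumns() come from this issue. The assert stands in
for the kind of DCHECK mentioned under Testing.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for row group metadata; not Impala's real type.
struct RowGroup {
  std::size_t dict_bytes;  // memory allocated while initializing column readers
  bool filtered_out;       // true if the row group ends up being skipped
};

// Hypothetical, simplified scanner. Only the control flow mirrors the issue.
class SketchScanner {
 public:
  explicit SketchScanner(std::vector<RowGroup> groups)
      : groups_(std::move(groups)) {}

  // Advances to the next row group that is not skipped.
  // Returns false once all row groups have been consumed.
  bool NextRowGroup() {
    while (next_ < groups_.size()) {
      // Nothing from a previous row group may still be held here.
      assert(held_bytes_ == 0);
      const RowGroup& rg = groups_[next_++];
      InitColumns(rg);  // allocates memory even if the group is later skipped
      if (rg.filtered_out) {
        // The fix: release resources before moving on to the next row group,
        // instead of only after breaking out of the loop.
        FlushResources();
        continue;
      }
      return true;
    }
    return false;
  }

  // Releases everything held for the current row group.
  void FlushResources() { held_bytes_ = 0; }

  std::size_t held_bytes() const { return held_bytes_; }
  std::size_t peak_bytes() const { return peak_bytes_; }

 private:
  // Models the allocation done when column readers are initialized.
  void InitColumns(const RowGroup& rg) {
    held_bytes_ += rg.dict_bytes;
    if (held_bytes_ > peak_bytes_) peak_bytes_ = held_bytes_;
  }

  std::vector<RowGroup> groups_;
  std::size_t next_ = 0;
  std::size_t held_bytes_ = 0;
  std::size_t peak_bytes_ = 0;
};
```

In this sketch, removing the FlushResources() call inside the loop lets memory
from consecutive skipped row groups add up, which is the accumulation this
issue describes. The caller is assumed to call FlushResources() itself after
it finishes a row group that was not skipped.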

Testing:
Tested in conjunction with a prototype buffer pool patch that was
DCHECKing before the change.

Added DCHECKs to the current version to ensure the streams are cleared
up as expected.

Change-Id: Ibc2f8f27c9b238be60261539f8d4be2facb57a2b
Reviewed-on: http://gerrit.cloudera.org:8080/9002
Reviewed-by: Tim Armstrong <tarmstr...@cloudera.com>
Tested-by: Impala Public Jenkins

> Memory from previous row groups can accumulate in Parquet scanner
> -----------------------------------------------------------------
>
>                 Key: IMPALA-6383
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6383
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 2.12.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: parquet, resource-management
>             Fix For: Impala 2.12.0
>
>
> I ran across this bug while porting scanners to the new buffer 
> pool. Before that port, the only symptom was excessive memory 
> consumption; with reservations, the same problem becomes an 
> easy-to-detect hard failure.
> The problem is in HdfsParquetScanner::NextRowGroup(), which calls 
> InitColumns() on column readers, which starts scans, which allocate memory. 
> If the row group is then skipped, because of dictionary predicates or 
> some error, the scans aren't cancelled and the I/O buffers aren't 
> released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
