Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5052

Change subject: IMPALA-4444: Transfer row group resources to row batch on scan 
failure
......................................................................

IMPALA-4444: Transfer row group resources to row batch on scan failure

Previously, if any column reader failed in HdfsParquetScanner::AssembleRows(),
the memory pools associated with the ScratchTupleBatch were freed. This
is problematic because the ScratchTupleBatch may contain memory pools that are
still referenced by row batches shipped upstream. This is possible because
memory pools used by parquet column readers (e.g. decompressor_pool_) aren't
transferred to a ScratchTupleBatch until the data page is exhausted. So,
the memory pools of the previous data page are always attached to the
ScratchTupleBatch of the current data page. On a scan failure, it is therefore
not necessarily safe to free the memory pools attached to the current
ScratchTupleBatch.

This patch fixes the problem above by transferring the memory pools and other
resources associated with a row group to the current row batch in the parquet
scanner on scan failure, so they can eventually be freed by upstream operators
as the row batch is consumed.
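
The ownership hand-off described above can be illustrated with a minimal,
self-contained sketch. It is not the Impala code: Pool, RowBatch,
ScratchBatch, OnScanFailure() and SimulateFailedScan() are hypothetical
stand-ins invented for this example, and the real scanner and memory pool
classes differ. The sketch only shows why moving the pool contents into the
current row batch, rather than freeing them, keeps memory referenced by an
already-shipped batch valid.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory pool: owns the buffers that rows point into.
struct Pool {
  std::vector<std::unique_ptr<std::string>> chunks;

  // Allocates a chunk owned by this pool and returns a pointer to it.
  const std::string* Add(std::string s) {
    chunks.push_back(std::make_unique<std::string>(std::move(s)));
    return chunks.back().get();
  }

  // Takes ownership of every chunk in 'src'; 'src' is left empty.
  void AcquireData(Pool* src) {
    for (auto& c : src->chunks) chunks.push_back(std::move(c));
    src->chunks.clear();
  }
};

// Hypothetical stand-ins for the output row batch and the scratch batch.
struct RowBatch {
  Pool pool;
  std::vector<const std::string*> rows;
};
struct ScratchBatch {
  Pool pool;
};

// On failure, hand the scratch batch's resources to the current output batch
// instead of freeing them, so they live until that batch is consumed.
void OnScanFailure(ScratchBatch* scratch, RowBatch* current) {
  current->pool.AcquireData(&scratch->pool);
}

// Simulates a failure while an earlier batch still references scratch memory.
std::string SimulateFailedScan() {
  ScratchBatch scratch;
  RowBatch shipped;  // already handed to upstream operators
  RowBatch current;  // batch being filled when the failure occurs

  // A row in the shipped batch points into memory still owned by 'scratch'.
  shipped.rows.push_back(scratch.pool.Add("page-1 value"));

  OnScanFailure(&scratch, &current);
  scratch = ScratchBatch();  // tearing down the scratch batch is now harmless

  // The chunk is owned by 'current', so the shipped row is still readable.
  return *shipped.rows[0];
}
```

Had the failure path destroyed the scratch pool directly, the pointer held by
the shipped batch would dangle. In this sketch the chunk is released only when
the current batch goes away.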

Change-Id: Id70df470e98dd96284fd176bfbb946e9637ad126
---
M be/src/exec/exec-node.cc
M be/src/exec/hdfs-parquet-scanner.cc
2 files changed, 3 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/5052/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id70df470e98dd96284fd176bfbb946e9637ad126
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <k...@cloudera.com>
