Zinway has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/20834 )

Change subject: IMPALA-12665: Adjust complete_micro_batch_ length to new 
scratch_batch_->capacity after ScratchTupleBatch::Reset
......................................................................

IMPALA-12665: Adjust complete_micro_batch_ length to new 
scratch_batch_->capacity after ScratchTupleBatch::Reset

**IMPALA-12665 Description:**
The issue occurs when scanning Parquet tables whose row size exceeds
4096 bytes with a row batch size greater than 1024. AddressSanitizer
detected a heap-buffer-overflow caused by a write beyond the end of
the allocated buffer.

**Root Cause Analysis:**
The AddressSanitizer error log reports a heap-buffer-overflow, that
is, an access to memory beyond the allocated region. The overflow
occurs in the `HdfsParquetScanner` and `ScratchTupleBatch` classes
when they handle rows larger than 4096 bytes.

**Fault Reproduction:**
The issue can be reproduced by creating a Parquet table with many
columns, inserting data using Hive, then querying with Impala.
Bash and Hive client scripts in IMPALA-12665 create a table and
populate it, triggering the bug.

**Technical Analysis:**
`ScratchTupleBatch::Reset` recalculates `capacity` from the tuple
size and a fixed memory limit. When the row size exceeds 4096 bytes,
`capacity` drops below 1024. `HdfsParquetScanner` still assumes a
`complete_micro_batch_` length of 1024, so it touches rows past the
end of the scratch buffer, which causes the overflow.
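
The arithmetic behind the thresholds above can be sketched as follows.
This is an illustration only, not the code in this change:
`ComputeCapacity` and `kFixedLenBufferLimit` are hypothetical names,
and the 4 MiB limit is an assumption chosen because it matches the
quoted numbers (4 MiB / 4096 bytes = 1024 rows).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed fixed memory limit for the scratch buffer (4 MiB). This value is an
// assumption picked to be consistent with the 4096-byte / 1024-row thresholds.
constexpr int64_t kFixedLenBufferLimit = 4LL * 1024 * 1024;

// Hypothetical stand-in for the capacity computation described above: the
// number of rows is capped both by the requested batch size and by how many
// tuples of the given size fit into the fixed memory limit.
inline int ComputeCapacity(int batch_size, int tuple_byte_size) {
  assert(batch_size > 0 && tuple_byte_size > 0);
  const int64_t rows_that_fit = kFixedLenBufferLimit / tuple_byte_size;
  const int64_t capped = std::min<int64_t>(batch_size, rows_that_fit);
  // Always leave room for at least one row, even for very wide tuples.
  return static_cast<int>(std::max<int64_t>(1, capped));
}
```

Under these assumptions, a batch size of 2048 with 4096-byte rows
still yields 1024 rows, while 4097-byte rows yield only 1023, which is
smaller than a micro batch length of 1024.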

**Proposed Solution:**
Ensure the `complete_micro_batch_` length is updated to the new
`capacity` after `ScratchTupleBatch::Reset`. This prevents accesses
outside the allocated buffer and avoids the heap-buffer-overflow.
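
A minimal sketch of the idea, not the actual patch: `MicroBatch`,
`AdjustToCapacity` and `FillRows` are hypothetical names used only for
illustration, and the bounds check in `FillRows` stands in for the
write that would otherwise overflow.

```cpp
#include <cassert>
#include <vector>

// Hypothetical descriptor of a contiguous row range in the scratch batch.
struct MicroBatch {
  int start;   // first row index
  int end;     // last row index, inclusive
  int length;  // number of rows, end - start + 1
};

// Hypothetical helper: make the micro batch cover exactly [0, capacity),
// to be called after the scratch batch has recomputed its capacity.
inline void AdjustToCapacity(MicroBatch* mb, int capacity) {
  assert(capacity > 0);
  if (mb->length == capacity) return;
  mb->start = 0;
  mb->end = capacity - 1;
  mb->length = capacity;
}

// Writes to every row in the micro batch. Returns false instead of writing
// when the range does not fit the buffer, i.e. where an overflow would occur.
inline bool FillRows(const MicroBatch& mb, std::vector<int>* rows) {
  if (mb.start < 0 || mb.end >= static_cast<int>(rows->size())) return false;
  for (int i = mb.start; i <= mb.end; ++i) (*rows)[i] = 1;
  return true;
}
```

With a buffer of 1023 rows, a stale micro batch of length 1024 is
rejected by `FillRows`; after `AdjustToCapacity(&mb, 1023)` the same
call stays within bounds.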

Change-Id: I966ff10ba734ed8b1b61325486de0dfcc7b58e4d
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/scratch-tuple-batch.h
M tests/query_test/test_queries.py
3 files changed, 52 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/20834/4
--
To view, visit http://gerrit.cloudera.org:8080/20834
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I966ff10ba734ed8b1b61325486de0dfcc7b58e4d
Gerrit-Change-Number: 20834
Gerrit-PatchSet: 4
Gerrit-Owner: Zinway <zinway....@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Yifan Zhang <chinazhangyi...@163.com>
Gerrit-Reviewer: Zinway <zinway....@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
