Zinway has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/20834 )
Change subject: IMPALA-12665: Adjust complete_micro_batch_ length to new scratch_batch_->capacity after ScratchTupleBatch::Reset
......................................................................

IMPALA-12665: Adjust complete_micro_batch_ length to new scratch_batch_->capacity after ScratchTupleBatch::Reset

**IMPALA-12665 Description:**
The issue occurs when scanning Parquet tables with a row size > 4096 bytes and a row batch size > 1024. AddressSanitizer detected a heap-buffer-overflow, indicating a write beyond the allocated buffer space.

**Root Cause Analysis:**
The AddressSanitizer error log points to a heap-buffer-overflow, where memory is accessed beyond the allocated region. This occurs in the `HdfsParquetScanner` and `ScratchTupleBatch` classes when handling rows larger than 4096 bytes.

**Fault Reproduction:**
The issue can be reproduced by creating a Parquet table with many columns, inserting data using Hive, and then querying the table with Impala. The Bash and Hive client scripts attached to IMPALA-12665 create and populate such a table, triggering the bug.

**Technical Analysis:**
`ScratchTupleBatch::Reset` recalculates `capacity` based on the tuple size and fixed memory limits. When the row size is > 4096 bytes, `capacity` is set to a value < 1024. `HdfsParquetScanner` incorrectly assumes a `complete_micro_batch_` length of 1024, which leads to the overflow.

**Proposed Solution:**
Ensure the `complete_micro_batch_` length is updated after `ScratchTupleBatch::Reset`. This prevents access to memory outside the allocated buffer and avoids the heap-buffer-overflow.
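The capacity arithmetic described in the Technical Analysis can be illustrated with a minimal, standalone C++ sketch. This is not the Impala code: `kFixedBufferLimit`, `ComputeCapacity`, `MicroBatch`, `CompleteMicroBatch`, and `FillSlots` are hypothetical names, and the 4 MiB limit is an assumption chosen only because it is consistent with the thresholds quoted above (4096-byte rows fit exactly 1024 per batch). The sketch assumes `tuple_byte_size > 0`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumption: a fixed 4 MiB cap on the scratch tuple buffer (hypothetical value).
constexpr int64_t kFixedBufferLimit = 4 * 1024 * 1024;

// Number of rows of the given size that fit under the cap (at least one).
constexpr int64_t MaxRowsThatFit(int64_t tuple_byte_size) {
  return kFixedBufferLimit / tuple_byte_size < 1
      ? 1
      : kFixedBufferLimit / tuple_byte_size;
}

// Hypothetical stand-in for the capacity recalculation done on reset:
// the requested batch size, reduced if the rows would exceed the cap.
constexpr int ComputeCapacity(int requested_batch_size, int64_t tuple_byte_size) {
  return MaxRowsThatFit(tuple_byte_size) < requested_batch_size
      ? static_cast<int>(MaxRowsThatFit(tuple_byte_size))
      : requested_batch_size;
}

// Hypothetical micro batch covering row slots [start, end].
struct MicroBatch {
  int start;
  int end;
  int length;
};

// Sizes the "complete" micro batch from the current capacity rather than
// from a fixed 1024, which is the idea behind the proposed fix.
constexpr MicroBatch CompleteMicroBatch(int capacity) {
  return MicroBatch{0, capacity - 1, capacity};
}

// Marks every slot the micro batch covers. The assert models the bound
// that a stale length of 1024 would violate when capacity has shrunk.
inline void FillSlots(std::vector<uint8_t>& slots, const MicroBatch& mb) {
  assert(mb.end < static_cast<int>(slots.size()));
  for (int i = mb.start; i <= mb.end; ++i) slots[i] = 1;
}

// 4096-byte rows still fit 1024 per batch; 8192-byte rows only fit 512.
static_assert(ComputeCapacity(1024, 4096) == 1024, "no reduction at 4096 bytes");
static_assert(ComputeCapacity(1024, 8192) == 512, "capacity halves at 8192 bytes");
static_assert(CompleteMicroBatch(512).end == 511, "micro batch tracks capacity");
```

Under these assumptions, a micro batch built with a stale length of 1024 against a 512-slot buffer would trip the assert in `FillSlots`, which mirrors the out-of-bounds write reported by AddressSanitizer.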
Change-Id: I966ff10ba734ed8b1b61325486de0dfcc7b58e4d
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/scratch-tuple-batch.h
M tests/query_test/test_queries.py
3 files changed, 52 insertions(+), 0 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/20834/4

--
To view, visit http://gerrit.cloudera.org:8080/20834
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I966ff10ba734ed8b1b61325486de0dfcc7b58e4d
Gerrit-Change-Number: 20834
Gerrit-PatchSet: 4
Gerrit-Owner: Zinway <zinway....@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Yifan Zhang <chinazhangyi...@163.com>
Gerrit-Reviewer: Zinway <zinway....@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>