Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20834 )
Change subject: IMPALA-12665: Adjust complete_micro_batch_ length to new scratch_batch_->capacity after ScratchTupleBatch::Reset
......................................................................


Patch Set 2:

(3 comments)

Thanks for investigating this issue! I have a few comments I'd like to see addressed before merging this.

http://gerrit.cloudera.org:8080/#/c/20834/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20834/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-12665: Adjust complete_micro_batch_ length to new scratch_batch_->capacity after ScratchTupleBatch::Reset
Impala commit messages try to stick to line limits of 72 characters. Not a hard requirement, just lets us format them for easier reading in gerrit. Tends to be more readable in a terminal as well.


http://gerrit.cloudera.org:8080/#/c/20834/2//COMMIT_MSG@16
PS2, Line 16: The issue can be reproduced by creating a Parquet table with many columns and inserting data into it using Hive, then querying the table with Impala. The provided bash and Hive client scripts in IMPALA-12665 create a table and populate it with data to set up the conditions that trigger the bug.
It would be nice to see a query test covering this case.


http://gerrit.cloudera.org:8080/#/c/20834/2/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/20834/2/be/src/exec/parquet/hdfs-parquet-scanner.cc@2367
PS2, Line 2367: complete_micro_batch_.AdjustLength(scratch_batch_->capacity);
The implementation of AdjustLength can only shrink the micro batch. I'm not sure there's any guarantee that scratch_batch_->capacity in later iterations of this loop won't grow again. Have you looked into that?

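To make the last concern concrete, here is a minimal sketch of what a shrink-only adjustment implies. It is not Impala's code: the struct `MicroBatchSketch`, its fields, and the `ResetToCapacity` helper are hypothetical names invented for illustration, and the only behavior taken from the review is the statement that AdjustLength can only shrink the batch.

```cpp
#include <cassert>

// Hypothetical stand-in for the scanner's micro batch (assumption, not
// the real Impala type). Covers rows [start, end], with end inclusive.
struct MicroBatchSketch {
  int start = 0;
  int end = -1;
  int length = 0;

  // Shrink-only adjustment, as described in the review comment:
  // a new_length larger than the current length leaves the batch as-is.
  void AdjustLength(int new_length) {
    assert(new_length >= 0);
    if (new_length < length) {
      length = new_length;
      end = start + length - 1;
    }
  }

  // Hypothetical alternative: always re-cover [0, capacity), so the
  // batch follows the capacity whether it shrank or grew.
  void ResetToCapacity(int capacity) {
    assert(capacity >= 0);
    start = 0;
    length = capacity;
    end = capacity - 1;
  }
};
```

Under these assumptions, a batch that was shrunk from 1024 to 512 rows stays at 512 if AdjustLength(1024) is called in a later iteration, whereas a reset-style helper would follow the capacity in both directions. Whether the capacity can actually grow between iterations is the open question raised in the comment.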
-- 
To view, visit http://gerrit.cloudera.org:8080/20834
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I966ff10ba734ed8b1b61325486de0dfcc7b58e4d
Gerrit-Change-Number: 20834
Gerrit-PatchSet: 2
Gerrit-Owner: Zinway <zinway....@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Yifan Zhang <chinazhangyi...@163.com>
Gerrit-Reviewer: Zinway <zinway....@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 03 Jan 2024 23:24:26 +0000
Gerrit-HasComments: Yes