Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12065 )
Change subject: WIP: IMPALA-5843: Use page index in Parquet files to skip pages
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12065/1/be/src/exec/parquet/parquet-bool-decoder.cc
File be/src/exec/parquet/parquet-bool-decoder.cc:

http://gerrit.cloudera.org:8080/#/c/12065/1/be/src/exec/parquet/parquet-bool-decoder.cc@86
PS1, Line 86:   } else {
> I'm not sure I follow.

I am OK with the current solution, but I think it is sub-optimal in the following case:

The whole page is one long literal run, and the number of values to skip is not a multiple of 8 (num_values % 8 != 0). ParquetBoolDecoder::DecodeValue refills its buffer every 128 values, so rle_decoder_ will never become 8-bit aligned again, and rle_decoder_'s buffer will also be used every time unpacked_values_ is filled.

Note that RLE handling in ParquetBoolDecoder was implemented (by me) in a sub-optimal way from the start to keep it simple (it avoids adding the encoding as a template parameter). So I am OK with keeping it as it is and maybe optimizing both the skipping and the decoding logic in the future.


--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:50:52 +0000
Gerrit-HasComments: Yes
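
The alignment argument in the comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not Impala's decoder: ToyBitCursor and CountUnalignedBatches are hypothetical names, one boolean is assumed to occupy one bit, and the batch size of 128 is taken from the comment. Because 128 is a multiple of 8, a skip count that is not a multiple of 8 leaves every later batch starting at a non-byte-aligned bit offset.

```cpp
#include <cstddef>

// Hypothetical toy model of a bit cursor over a bit-packed literal run.
// It only tracks the bit offset; it does not decode real Parquet data.
struct ToyBitCursor {
  std::size_t bit_pos = 0;            // Current offset in bits (1 bit per bool).
  std::size_t aligned_batches = 0;    // Batches that started on a byte boundary.
  std::size_t unaligned_batches = 0;  // Batches that started mid-byte.

  // Skipping values simply advances the bit offset.
  void Skip(std::size_t num_values) { bit_pos += num_values; }

  // Records whether this batch starts byte-aligned, then consumes it.
  void ReadBatch(std::size_t batch_size) {
    if (bit_pos % 8 == 0) {
      ++aligned_batches;
    } else {
      ++unaligned_batches;
    }
    bit_pos += batch_size;
  }
};

// Returns how many of `batches` reads of 128 values start unaligned
// after first skipping `skip` values.
inline std::size_t CountUnalignedBatches(std::size_t skip, std::size_t batches) {
  ToyBitCursor cursor;
  cursor.Skip(skip);
  for (std::size_t i = 0; i < batches; ++i) cursor.ReadBatch(128);
  return cursor.unaligned_batches;
}
```

In this model, skipping 3 values makes all subsequent batches unaligned, while skipping 8 values keeps them all aligned, which is the case distinction the comment describes.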