Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12065 )

Change subject: WIP: IMPALA-5843: Use page index in Parquet files to skip pages
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12065/1/be/src/exec/parquet/parquet-bool-decoder.cc
File be/src/exec/parquet/parquet-bool-decoder.cc:

http://gerrit.cloudera.org:8080/#/c/12065/1/be/src/exec/parquet/parquet-bool-decoder.cc@86
PS1, Line 86:   } else {
> I'm not sure I follow.
I am OK with the current solution, but I think it is suboptimal in the
following case:
The whole page is one long literal run, and a few values have to be skipped
(num_values % 8 != 0). ParquetBoolDecoder::DecodeValue will refill its buffer at
every 128th value, and rle_decoder_ will never become 8-bit aligned again, so
rle_decoder_'s buffer will also be used every time unpacked_values_ is filled.
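
To make the alignment argument concrete, here is a minimal sketch. It is not
Impala code: CountByteAlignedRefills is a hypothetical helper, and it assumes
one bit per boolean and a refill at every batch_size-th value. It counts how
many refills start on a byte boundary after some values were skipped.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of Impala. Counts how many of the first
// `num_batches` refills start on a byte boundary when `skipped` 1-bit values
// were skipped at the start of a bit-packed literal run.
int CountByteAlignedRefills(int64_t skipped, int batch_size, int num_batches) {
  assert(skipped >= 0 && batch_size > 0 && num_batches >= 0);
  int aligned = 0;
  for (int i = 0; i < num_batches; ++i) {
    // Bit offset of the first value of the i-th refill (one bit per boolean).
    int64_t bit_offset = skipped + static_cast<int64_t>(i) * batch_size;
    if (bit_offset % 8 == 0) ++aligned;
  }
  return aligned;
}
```

With batch_size = 128 (a multiple of 8), every refill offset has the same
remainder mod 8 as the skipped count, so skipping e.g. 3 values leaves every
later refill unaligned, while skipping 8 keeps all of them aligned.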

Note that RLE handling in ParquetBoolDecoder was implemented (by me) in a
suboptimal way from the start to keep it simple (avoiding adding the encoding
as a template parameter). So I am OK with keeping it as it is and maybe
optimizing both the skipping and the decoding logic in the future.



--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:50:52 +0000
Gerrit-HasComments: Yes
