Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12065 )

Change subject: IMPALA-5843: Use page index in Parquet files to skip pages
......................................................................


Patch Set 15:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12065/15/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/12065/15/be/src/exec/parquet/hdfs-parquet-scanner.cc@1170
PS15, Line 1170:     // Scalar readers that read collections can be ahead of 
other readers by 1.
> :(
Invented BaseScalarColumnReader::LastProcessedRow(), with that I could simplify 
this check back to the original complexity.


http://gerrit.cloudera.org:8080/#/c/12065/15/be/src/exec/parquet/parquet-column-readers.h
File be/src/exec/parquet/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/12065/15/be/src/exec/parquet/parquet-column-readers.h@533
PS15, Line 533: isn't anymore
> pedantry: aren't any more
Done


http://gerrit.cloudera.org:8080/#/c/12065/7/be/src/exec/parquet/parquet-column-readers.cc
File be/src/exec/parquet/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/12065/7/be/src/exec/parquet/parquet-column-readers.cc@512
PS7, Line 512:       if (IsLastCandidateRange()) {
> Yeah exactly, the bit-packed runs in Parquet's RLE encoding can have spurio
Since then I moved these conditions to ConsumedCurrentCandidateRange() and 
removed 'num_buffered_values != 0' because it prevented jumping to the next 
page. It wouldn't produce wrong result sets, but it caused problems in 
HdfsParquetScanner::CheckPageFiltering() because the 'current rows' could get 
out-of-sync for a moment.

I guard against invalid peeking by moving the num_buffered_values check just 
before the PeekLevel() == 0 condition.



--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 15
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Pooja Nilangekar <pooja.nilange...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Tue, 23 Apr 2019 13:16:01 +0000
Gerrit-HasComments: Yes

Reply via email to