Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16720 )
Change subject: IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate
......................................................................


Patch Set 50:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16720/50//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16720/50//COMMIT_MSG@68
PS50, Line 68: partition columns
It will be especially useful for partitioned Iceberg tables, because runtime bloom filters don't work with them. But they'll need some special code, because partitioned Iceberg tables are treated as non-partitioned tables most of the time.


http://gerrit.cloudera.org:8080/#/c/16720/50/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/16720/50/be/src/exec/parquet/hdfs-parquet-scanner.cc@659
PS50, Line 659: raio
nit: ratio


http://gerrit.cloudera.org:8080/#/c/16720/50/be/src/runtime/runtime-filter-ir.cc
File be/src/runtime/runtime-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16720/50/be/src/runtime/runtime-filter-ir.cc@33
PS50, Line 33: min_max_filter_.Load() == nullptr) return true;
             : return min_max_filter_.Load()
We should probably invoke Load() only once and store the pointer in a local variable.


http://gerrit.cloudera.org:8080/#/c/16720/50/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16720/50/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@733
PS50, Line 733: referes
nit: refers


http://gerrit.cloudera.org:8080/#/c/16720/50/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@734
PS50, Line 734: as it will not be
              : // as effective as the conjunct.
How do we know the runtime efficiency here? Maybe we should still add it and let the min/max threshold decide? However, if the min/max conjunct comes from an EQ predicate, then yes, we should probably not add an overlap predicate in this case.

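To illustrate the runtime-filter-ir.cc comment above (load the pointer once, then reuse it), here is a minimal sketch. It is not the code from the patch: MinMaxFilterSketch, RuntimeFilterSketch, Publish(), EvalOverlap() and Overlaps() are hypothetical names made up for this example, and std::atomic stands in for whatever atomic pointer wrapper min_max_filter_ really uses.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a min/max runtime filter (not Impala's class).
struct MinMaxFilterSketch {
  int64_t min;
  int64_t max;
  // True if the closed range [lo, hi] intersects [min, max].
  bool Overlaps(int64_t lo, int64_t hi) const { return lo <= max && hi >= min; }
};

// Hypothetical holder of an atomically published filter pointer.
class RuntimeFilterSketch {
 public:
  // Called by the producer once the filter has arrived.
  void Publish(const MinMaxFilterSketch* filter) {
    assert(filter != nullptr);
    min_max_filter_.store(filter, std::memory_order_release);
  }

  // Returns true when rows in [lo, hi] may still match, i.e. cannot be skipped.
  bool EvalOverlap(int64_t lo, int64_t hi) const {
    // Single atomic load: the null check and the dereference both use the
    // same local pointer instead of reading the atomic twice.
    const MinMaxFilterSketch* filter =
        min_max_filter_.load(std::memory_order_acquire);
    if (filter == nullptr) return true;  // No filter yet: nothing can be skipped.
    return filter->Overlaps(lo, hi);
  }

 private:
  std::atomic<const MinMaxFilterSketch*> min_max_filter_{nullptr};
};
```

Besides saving one atomic read, the local copy guarantees that the pointer that was null-checked is the same one that gets dereferenced, whatever the publishing protocol is.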
-- 
To view, visit http://gerrit.cloudera.org:8080/16720
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Gerrit-Change-Number: 16720
Gerrit-PatchSet: 50
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Jan 2021 15:55:15 +0000
Gerrit-HasComments: Yes