Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18017 )
Change subject: IMPALA-10910, IMPALA-5509: Runtime filter: dictionary filter support ...................................................................... Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/18017/5/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/18017/5/be/src/exec/parquet/hdfs-parquet-scanner.cc@1916 PS5, Line 1916: if (runtime_filters->at(rf_idx)->Eval(&row)) { : column_has_match = true; : break; : } > We should break on the first filter that returns false. Isn't it the opposite? When a runtime filter is evaluated to true, we should keep the row group. My understanding is: filter1 = {1,2} filter2 = {3,4} conj = "id < 10" dict = {5, 3, 6} In this case, the conjunct is evaluated to true, and '3' is in the filter2 bloom filter, so we should keep this row group for further processing. http://gerrit.cloudera.org:8080/#/c/18017/5/be/src/exec/parquet/hdfs-parquet-scanner.cc@1931 PS5, Line 1931: runtime_filters->at(rf_idx)->stats->IncrCounters( : FilterStats::ROW_GROUPS_KEY, 1, 0, 0); > We should only increment the counters for the processed filters. Agree, will be fixed in the next patch. -- To view, visit http://gerrit.cloudera.org:8080/18017 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ida0ada8799774be34312eaa4be47336149f637c7 Gerrit-Change-Number: 18017 Gerrit-PatchSet: 5 Gerrit-Owner: Tamas Mate <tm...@cloudera.com> Gerrit-Reviewer: Amogh Margoor <amarg...@gmail.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tamas Mate <tm...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Thu, 06 Jan 2022 08:47:20 +0000 Gerrit-HasComments: Yes