Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18017 )
Change subject: IMPALA-10910, IMPALA-5509: Runtime filter: dictionary filter support ...................................................................... Patch Set 7: (8 comments) The code looks good to me, only had some comments about comments and testing. http://gerrit.cloudera.org:8080/#/c/18017/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18017/7//COMMIT_MSG@12 PS7, Line 12: , unneeded comma http://gerrit.cloudera.org:8080/#/c/18017/7//COMMIT_MSG@18 PS7, Line 18: small dictionaries You could mention the constant that defines the limit and its current value (10). http://gerrit.cloudera.org:8080/#/c/18017/7/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/18017/7/be/src/exec/parquet/hdfs-parquet-scanner.cc@1900 PS7, Line 1900: the conjuncts the conjuncts and runtime filters http://gerrit.cloudera.org:8080/#/c/18017/7/be/src/exec/parquet/hdfs-parquet-scanner.cc@1901 PS7, Line 1901: // Eliminate this row group if : // 1) no value from the dictionary filter matches. : // 2) a runtime filter does not match. How about: // Eliminate this row group if no value from the dictionary // 1) matches the dict conjuncts // AND // 2) matches all runtime filters Or maybe we can remove it, as it's a bit redundant to the above sentence. http://gerrit.cloudera.org:8080/#/c/18017/7/be/src/exec/parquet/hdfs-parquet-scanner.cc@1911 PS7, Line 1911: evaulation evaluation http://gerrit.cloudera.org:8080/#/c/18017/7/be/src/exec/parquet/hdfs-parquet-scanner.cc@1937 PS7, Line 1937: 0 Shouldn't we increase the 'processed' counter? http://gerrit.cloudera.org:8080/#/c/18017/7/testdata/workloads/functional-query/queries/QueryTest/iceberg-dictionary-runtime-filter.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-dictionary-runtime-filter.test: http://gerrit.cloudera.org:8080/#/c/18017/7/testdata/workloads/functional-query/queries/QueryTest/iceberg-dictionary-runtime-filter.test@3 PS7, Line 3: CREATE TABLE iceberg_dict_runtime_filter ( We could have a partitioned table as well. http://gerrit.cloudera.org:8080/#/c/18017/7/testdata/workloads/functional-query/queries/QueryTest/iceberg-dictionary-runtime-filter.test@34 PS7, Line 34: a.col_2 = b.col_2 please add tests for more complicated JOIN conditions, e.g. a.col_2 + 1 = b.col_2 a.col_1 + a.col_2 = b.col_2 # the runtime filter should be ignored during dict filtering since it involves more columns. These will generate runtime filters with more complicated expressions. -- To view, visit http://gerrit.cloudera.org:8080/18017 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ida0ada8799774be34312eaa4be47336149f637c7 Gerrit-Change-Number: 18017 Gerrit-PatchSet: 7 Gerrit-Owner: Tamas Mate <tm...@cloudera.com> Gerrit-Reviewer: Amogh Margoor <amarg...@gmail.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tamas Mate <tm...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Thu, 06 Jan 2022 16:02:01 +0000 Gerrit-HasComments: Yes