Tamas Mate has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18017 )
Change subject: IMPALA-10910, IMPALA-5509: Runtime filter: dictionary filter support
......................................................................

IMPALA-10910, IMPALA-5509: Runtime filter: dictionary filter support

This commit is based on Csaba Ringhofer's earlier work on IMPALA-5509.

If a runtime filter uses only a single column, then it can be used to
filter Parquet dictionaries, and if all dictionary values are filtered
out, the whole row group can be skipped. This is especially useful for
Iceberg tables, as the partition column is in the data file, so this
can help eliminate unnecessary reads.

The chance of false positives grows exponentially with the size of the
dictionary, so this optimisation is only useful for small dictionaries.

Testing:
 - Added e2e test that creates an Iceberg/Parquet table and queries it

Change-Id: Ida0ada8799774be34312eaa4be47336149f637c7
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
A testdata/workloads/functional-query/queries/QueryTest/iceberg-dictionary-runtime-filter.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-dictionary-runtime-filter.test
M tests/query_test/test_runtime_filters.py
5 files changed, 158 insertions(+), 14 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/18017/7
--
To view, visit http://gerrit.cloudera.org:8080/18017
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ida0ada8799774be34312eaa4be47336149f637c7
Gerrit-Change-Number: 18017
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate <tm...@cloudera.com>
Gerrit-Reviewer: Amogh Margoor <amarg...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tm...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
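
For readers unfamiliar with the idea in the commit message, here is a minimal sketch in Python of the row-group check it describes. It is not the code in hdfs-parquet-scanner.cc. The names `can_skip_row_group`, `filter_may_contain` and `skip_probability` are hypothetical. The sketch assumes the column chunk is fully dictionary-encoded and that the runtime filter is exposed as a single-value membership test that may return false positives but never false negatives. The second function illustrates, under an independence assumption, why large dictionaries rarely benefit: the chance that every non-matching entry is rejected shrinks geometrically with the number of entries.

```python
from typing import Callable, Iterable


def can_skip_row_group(dictionary: Iterable[object],
                       filter_may_contain: Callable[[object], bool]) -> bool:
    """Return True when no dictionary entry passes the runtime filter.

    Hypothetical helper: `dictionary` holds the distinct values of one
    column chunk, `filter_may_contain` stands in for a single-column
    runtime filter (for example a Bloom filter lookup).
    """
    # If even one entry may match, some row could match, so the row
    # group has to be read.
    return not any(filter_may_contain(value) for value in dictionary)


def skip_probability(per_lookup_fpr: float, dict_size: int) -> float:
    """Chance that all `dict_size` non-matching entries are rejected.

    Assumes each lookup yields a false positive independently with
    probability `per_lookup_fpr`.
    """
    return (1.0 - per_lookup_fpr) ** dict_size
```

As a usage illustration, `can_skip_row_group([1, 2, 3], {7, 8}.__contains__)` treats a plain set as an exact filter and reports that the row group can be skipped, while a dictionary containing 7 would force a read.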