Hello Michael Ho, Lars Volker, Matthew Jacobs, Mostafa Mokhtar, I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/7793 to look at the new patch set (#5). Change subject: IMPALA-4252: Min-max runtime filters for Kudu ...................................................................... IMPALA-4252: Min-max runtime filters for Kudu This patch implements min-max filters for runtime filters. Each runtime filter generates a bloom filter and/or a min-max filter, depending on if it has HDFS and/or Kudu targets, respectively. Min-max filters are generated by the PartitionedHashJoinBuilder. For now, min-max filters are only applied at the KuduScanner, which passes them into the Kudu client. Because the Kudu client doesn't provide a way to specify generic filter exprs, min-max filters are only generated when the target expr is a bare Kudu column ref. Future work will address applying min-max filters at HDFS scan nodes and applying bloom filters at Kudu scan nodes. Codegen is used to eliminate branching on the type of the min-max filter. Testing: - Updated planner tests. - Ran existing runtime filter tests. - Ran preliminary perf tests to demonstrate that it works. Will update with more specific results. - Still needs more e2e tests. Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/hdfs-parquet-scanner-ir.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/kudu-scan-node-base.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scanner.cc M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/runtime/runtime-filter.cc M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-filter.inline.h M be/src/service/impala-internal-service.cc M be/src/util/CMakeLists.txt A be/src/util/min-max-filter-ir.cc A be/src/util/min-max-filter.cc A be/src/util/min-max-filter.h M common/thrift/ImpalaInternalService.thrift M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test 37 files changed, 1,464 insertions(+), 146 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/7793/5 -- To view, visit http://gerrit.cloudera.org:8080/7793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c Gerrit-Change-Number: 7793 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Tauber-Marshall <tmarsh...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Matthew Jacobs <mjac...@apache.org> Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>