[email protected] has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19494 )
Change subject: IMPALA-11802: Optimize count(*) queries for Iceberg V2 position
delete tables
......................................................................
IMPALA-11802: Optimize count(*) queries for Iceberg V2 position delete tables
The scan plan of a count(*) query on an Iceberg V2 table with position
deletes is as follows:
              AGGREGATE
              COUNT(*)
                  |
              UNION ALL
             /         \
            /           \
           /             \
      SCAN all        ANTI JOIN
      datafiles        /      \
      without         /        \
      deletes       SCAN      SCAN
                  datafiles  deletes
Since Iceberg records the number of rows in each data file
(record_count), we can use it to optimize a simple count(*) query on
Iceberg V2 tables with position deletes. First, the total number of
records in all DataFiles without corresponding DeleteFiles is computed
from the Iceberg metadata files. The query is then rewritten as follows:
            ArithmeticExpr(ADD)
             /             \
            /               \
           /                 \
      record_count        AGGREGATE
      of all              COUNT(*)
      datafiles               |
      without             ANTI JOIN
      deletes              /      \
                          /        \
                        SCAN      SCAN
                      datafiles  deletes
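
The arithmetic behind the rewrite can be illustrated with a small Python
model. This is only a sketch and not Impala code: the DataFile class, its
deleted_positions field, and both functions are hypothetical names made up
for illustration. It assumes that a data file "has deletes" exactly when at
least one position delete refers to it, and it simulates scanning by
iterating over row positions.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    # Hypothetical stand-in for an Iceberg data file entry.
    path: str
    record_count: int  # row count as stored in Iceberg metadata
    # Row positions named by position delete files (empty = no deletes).
    deleted_positions: set = field(default_factory=set)


def count_rows_after_anti_join(files):
    # Simulates SCAN datafiles ANTI JOIN SCAN deletes, followed by COUNT(*):
    # every row position is visited and kept only if it is not deleted.
    return sum(
        1
        for f in files
        for pos in range(f.record_count)
        if pos not in f.deleted_positions
    )


def count_star_original(files):
    # Original plan: both UNION ALL branches read rows, then COUNT(*).
    # Files without deletes pass through the same row-by-row path here.
    return count_rows_after_anti_join(files)


def count_star_rewritten(files):
    # Rewritten plan: constant from metadata + COUNT(*) over the anti join.
    without_deletes = [f for f in files if not f.deleted_positions]
    with_deletes = [f for f in files if f.deleted_positions]
    metadata_count = sum(f.record_count for f in without_deletes)
    return metadata_count + count_rows_after_anti_join(with_deletes)


files = [
    DataFile("a.parquet", 100),
    DataFile("b.parquet", 50, {0, 7, 9}),
    DataFile("c.parquet", 25),
]

original = count_star_original(files)
rewritten = count_star_rewritten(files)
```

In this model only b.parquet has to be read row by row; the counts for
a.parquet and c.parquet come straight from record_count, which is where the
rewritten plan saves work while returning the same result.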
Testing:
* Existing tests
* Added e2e tests
Change-Id: I8172c805121bf91d23fe063f806493afe2f03d41
---
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/rewrite/CountStarToConstRule.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-plain-count-star-optimization.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
10 files changed, 240 insertions(+), 44 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/19494/2
--
To view, visit http://gerrit.cloudera.org:8080/19494
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8172c805121bf91d23fe063f806493afe2f03d41
Gerrit-Change-Number: 19494
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <[email protected]>