[email protected] has uploaded this change for review. (
http://gerrit.cloudera.org:8080/19494 )


Change subject: IMPALA-11802: Optimize count(*) queries for Iceberg V2 position 
delete tables
......................................................................

IMPALA-11802: Optimize count(*) queries for Iceberg V2 position delete tables

The SCAN plan of a count star query for Iceberg V2 position delete tables
is as follows:

    AGGREGATE
    COUNT(*)
        |
    UNION ALL
   /         \
  /           \
 /             \
SCAN all    ANTI JOIN
datafiles  /         \
without   /           \
deletes  SCAN         SCAN
         datafiles    deletes

Since Iceberg provides the number of records in each file (record_count),
we can use it to optimize a simple count star query for Iceberg V2
position delete tables. First, the number of records in all DataFiles
without corresponding DeleteFiles is calculated from the Iceberg metadata
files. Then the query is rewritten as follows:

      ArithmeticExpr(ADD)
      /             \
     /               \
    /                 \
record_count       AGGREGATE
of all             COUNT(*)
datafiles              |
without            ANTI JOIN
deletes           /         \
                 /           \
                SCAN        SCAN
                datafiles   deletes

Testing:
 * Existing tests
 * Added e2e tests

Change-Id: I8172c805121bf91d23fe063f806493afe2f03d41
---
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/rewrite/CountStarToConstRule.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-plain-count-star-optimization.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
10 files changed, 240 insertions(+), 44 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/19494/2
--
To view, visit http://gerrit.cloudera.org:8080/19494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8172c805121bf91d23fe063f806493afe2f03d41
Gerrit-Change-Number: 19494
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <[email protected]>