Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20460

to look at the new patch set (#2).

Change subject: IMPALA-12371: Add better cardinality estimation for Iceberg V2 
tables with deletes
......................................................................

IMPALA-12371: Add better cardinality estimation for Iceberg V2 tables with 
deletes

Currently IcebergDeleteNode's cardinality is the same as the LHS's
cardinality, i.e. we don't take the RHS into account. The RHS contains
the position delete records, so it is a fair assumption that every
record on the RHS removes a record from the LHS (duplicated delete
records should be extremely rare).

If there are conjuncts on the Iceberg table we can assume that they have
the same selectivity on the data records and on the delete records.

With the above assumptions this change updates the cardinality of the
IcebergDeleteNode with basically the following formula:

 Card(IcebergDeleteNode) = Card(LHS) - Selectivity(LHS) * Card(RHS);

To deal with edge cases when there are lots of duplicated delete
records (shouldn't happen in normal usage), we actually use a slightly
more complex formula:

 Card(IcebergDeleteNode) =
   Max(
     Min(1, Card(LHS)),
     Card(LHS) - Selectivity(LHS) * Card(RHS)
   );
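
The clamped formula above can be sketched in isolation as follows. This is
only an illustration, not the code in IcebergDeleteNode.java: the class name,
the method name, the parameter types and the rounding of the selectivity
term are all assumptions made for the example.

```java
// Hypothetical stand-alone sketch of the clamped estimate; none of these
// names come from the Impala code base.
final class IcebergDeleteCardinalitySketch {
    private IcebergDeleteCardinalitySketch() {}

    /**
     * @param lhsCard        estimated number of data records (LHS)
     * @param lhsSelectivity selectivity of the conjuncts on the LHS, in [0, 1]
     * @param rhsCard        estimated number of position delete records (RHS)
     */
    static long estimate(long lhsCard, double lhsSelectivity, long rhsCard) {
        // Assume the conjuncts filter delete records with the same
        // selectivity as data records. Rounding is a choice made here.
        long effectiveDeletes = Math.round(lhsSelectivity * rhsCard);
        // Lower bound: at least 1 row, unless the LHS itself is empty.
        long lowerBound = Math.min(1L, lhsCard);
        return Math.max(lowerBound, lhsCard - effectiveDeletes);
    }
}
```

For example, with Card(LHS) = 1000, Selectivity(LHS) = 0.5 and
Card(RHS) = 200 the sketch yields 1000 - 100 = 900. With many duplicated
delete records, e.g. Card(LHS) = 10 and Card(RHS) = 50 at selectivity 1.0,
the subtraction would go negative and the lower bound of 1 applies instead.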

Testing:
 * updated the planner tests

Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5
---
M fe/src/main/java/org/apache/impala/planner/IcebergDeleteNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
3 files changed, 94 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/20460/2
--
To view, visit http://gerrit.cloudera.org:8080/20460
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I988dc8d7e1074932c460b3702d3381341e5b23c5
Gerrit-Change-Number: 20460
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
