[ 
https://issues.apache.org/jira/browse/IMPALA-12371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy reassigned IMPALA-12371:
------------------------------------------

    Assignee: Zoltán Borók-Nagy

> Add better cardinality estimation for Iceberg V2 tables with deletes
> --------------------------------------------------------------------
>
>                 Key: IMPALA-12371
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12371
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> IMPALA-11797 covers the generic case, i.e. better cardinality estimates for
> all ANTI JOIN operators.
> For Iceberg V2 tables we can safely produce a better estimate, because we can
> assume that every row on the RHS has a match on the LHS when there is no
> filtering. The RHS might contain duplicate rows, though; see:
> https://github.com/apache/iceberg/blob/462a203e67dd42d111a7fd2d3a0090b5aeb80833/api/src/main/java/org/apache/iceberg/RowDelta.java#L132-L133
> So we can come up with something like this:
> Cardinality of DELETE operator = Cardinality(LHS) - (Cardinality(RHS) * 
> selectivity of LHS)
> plus a safety check in case the result becomes negative (due to duplicates
> in the RHS).
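
The proposed estimate can be sketched in a few lines. This is only an illustration of the formula in the description, not Impala's planner code: the function name, the parameter names, and the choice to clamp at zero are assumptions made for the example.

```python
def estimate_delete_cardinality(lhs_cardinality: int,
                                rhs_cardinality: int,
                                lhs_selectivity: float) -> int:
    """Hypothetical helper: Cardinality(LHS) - Cardinality(RHS) * selectivity of LHS.

    lhs_cardinality: estimated LHS rows (after any filtering)
    rhs_cardinality: estimated RHS rows (may include duplicates)
    lhs_selectivity: fraction of LHS rows that survive filtering, in [0, 1]
    """
    if lhs_cardinality < 0 or rhs_cardinality < 0:
        raise ValueError("cardinalities must be non-negative")
    if not 0.0 <= lhs_selectivity <= 1.0:
        raise ValueError("selectivity must be in [0, 1]")

    # Without filtering every RHS row is assumed to match an LHS row; with
    # filtering, only about `lhs_selectivity` of the RHS rows still match.
    matched_rhs_rows = round(rhs_cardinality * lhs_selectivity)

    # Safety check: duplicates in the RHS can push the difference below zero,
    # so clamp the result (clamping at 0 is an assumption of this sketch).
    return max(0, lhs_cardinality - matched_rhs_rows)
```

For example, with 1000 LHS rows, 200 RHS rows and an LHS selectivity of 0.5, the sketch subtracts 100 matched rows from the LHS estimate. When the RHS is larger than the LHS because of duplicates, the clamp keeps the estimate from going negative.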



