[ https://issues.apache.org/jira/browse/IMPALA-12371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zoltán Borók-Nagy reassigned IMPALA-12371:
------------------------------------------

    Assignee: Zoltán Borók-Nagy

> Add better cardinality estimation for Iceberg V2 tables with deletes
> --------------------------------------------------------------------
>
>                 Key: IMPALA-12371
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12371
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> IMPALA-11797 is about the generic case, i.e. better cardinality for all ANTI
> JOIN operators.
> For Iceberg V2 we can safely come up with a better cardinality estimate, because
> we can assume that every row on the RHS has a match on the LHS when there is no
> filtering. The RHS might contain duplicate rows, though; see:
> https://github.com/apache/iceberg/blob/462a203e67dd42d111a7fd2d3a0090b5aeb80833/api/src/main/java/org/apache/iceberg/RowDelta.java#L132-L133
> So we can come up with something like this:
> Cardinality of DELETE operator = Cardinality(LHS) - (Cardinality(RHS) *
> selectivity of LHS)
> With some safety checks in case it becomes negative (due to duplicates in RHS).
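
The formula in the description can be sketched as follows. This is only an illustration under stated assumptions, not the Impala planner code: the function name, parameter names, and the choice to clamp the result at zero are all hypothetical, and the issue does not say which safety check the real implementation should use.

```python
def estimate_delete_cardinality(lhs_cardinality: int,
                                rhs_cardinality: int,
                                lhs_selectivity: float) -> int:
    """Estimate the output rows of the Iceberg V2 delete (anti-join) operator.

    lhs_cardinality: estimated rows from the data files (LHS).
    rhs_cardinality: estimated rows from the delete files (RHS).
    lhs_selectivity: fraction of LHS rows that survive filtering, in [0, 1].
    All three names are assumptions made for this sketch.
    """
    if lhs_cardinality < 0 or rhs_cardinality < 0:
        raise ValueError("cardinalities must be non-negative")
    if not 0.0 <= lhs_selectivity <= 1.0:
        raise ValueError("selectivity must be in [0, 1]")

    # Only the delete records that target surviving LHS rows remove anything,
    # so scale the RHS cardinality by the LHS selectivity.
    matched_deletes = round(rhs_cardinality * lhs_selectivity)

    # Duplicate delete records on the RHS can push the difference below zero;
    # clamping at zero is one possible safety check (an assumption here).
    return max(lhs_cardinality - matched_deletes, 0)


if __name__ == "__main__":
    print(estimate_delete_cardinality(1000, 100, 0.5))
    print(estimate_delete_cardinality(10, 50, 1.0))
```

With 1000 LHS rows, 100 delete records, and a selectivity of 0.5, the sketch subtracts 50 matched deletes. In the second call the RHS is larger than the LHS, which can only happen through duplicates, so the clamp applies.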