[ https://issues.apache.org/jira/browse/CALCITE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188707#comment-17188707 ]
Rui Wang commented on CALCITE-4208:
-----------------------------------

I am not familiar with the context of the existing row count estimation model, but based only on the formulas here, I think:

innerJoinRowCount = leftRowCount * rightRowCount * mq.getSelectivity(join, condition)

leftJoinRowCount = leftRowCount + innerJoinRowCount
                 = leftRowCount * (1 + rightRowCount * mq.getSelectivity(join, condition))

and similarly for a right join.

So if rightRowCount * mq.getSelectivity(join, condition) is much larger than 1, the 1 can be ignored. If the 1 is the dominant term, the row count estimate won't be a big number anyway. I think that is why at least INNER/LEFT/RIGHT share the same model. A similar argument could be made for a FULL join.

> Improve metadata row count for Join
> -----------------------------------
>
>                 Key: CALCITE-4208
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4208
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Ruben Q L
>            Priority: Major
>
> Currently, the default metadata row count for join
> {{RelMdRowCount#getRowCount(Join rel, RelMetadataQuery mq)}} relies on
> {{RelMdUtil.getJoinRowCount}}. This method has several issues:
> - In case of ANTI join, it returns the same estimation as a SEMI join
> - In other cases (INNER, LEFT, RIGHT, FULL), it returns always the same
> formula:
> {{leftRowCount * rightRowCount * mq.getSelectivity(join, condition)}}
> which seems valid for an INNER join, but not for LEFT / RIGHT / FULL.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
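To make the argument in the comment above concrete, here is a minimal sketch of the INNER, LEFT and RIGHT estimates using plain doubles. It is not Calcite code: the class {{JoinRowCountSketch}} and its methods are hypothetical names, and the {{selectivity}} parameter simply stands in for the value that {{mq.getSelectivity(join, condition)}} would return. The LEFT/RIGHT formulas follow the model proposed in the comment (all rows of the preserved side plus the inner join rows), which may not be what ends up being implemented.

```java
/** Hypothetical helper, not part of Calcite: the comment's formulas on plain doubles. */
final class JoinRowCountSketch {
  private JoinRowCountSketch() {}

  /** INNER join estimate: leftRowCount * rightRowCount * selectivity. */
  static double innerJoinRowCount(double leftRowCount, double rightRowCount,
      double selectivity) {
    return leftRowCount * rightRowCount * selectivity;
  }

  /** LEFT join estimate as modeled in the comment: left rows plus inner join rows. */
  static double leftJoinRowCount(double leftRowCount, double rightRowCount,
      double selectivity) {
    return leftRowCount
        + innerJoinRowCount(leftRowCount, rightRowCount, selectivity);
  }

  /** RIGHT join estimate, symmetric to the LEFT case. */
  static double rightJoinRowCount(double leftRowCount, double rightRowCount,
      double selectivity) {
    return rightRowCount
        + innerJoinRowCount(leftRowCount, rightRowCount, selectivity);
  }

  public static void main(String[] args) {
    // Case 1: rightRowCount * selectivity = 50, much larger than 1,
    // so the LEFT estimate is only slightly above the INNER estimate.
    System.out.println("inner: " + innerJoinRowCount(1000, 200, 0.25));
    System.out.println("left:  " + leftJoinRowCount(1000, 200, 0.25));

    // Case 2: rightRowCount * selectivity is far below 1, so the "1" term
    // dominates and the LEFT estimate stays close to leftRowCount.
    System.out.println("inner: " + innerJoinRowCount(1000, 2, 0.00390625));
    System.out.println("left:  " + leftJoinRowCount(1000, 2, 0.00390625));
  }
}
```

In the first case the two estimates differ by about 2%, which matches the point that the extra term can be ignored; in the second case the LEFT estimate is bounded by roughly leftRowCount, so it is small in absolute terms even though it is far from the INNER estimate.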