[ 
https://issues.apache.org/jira/browse/CALCITE-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728993#comment-17728993
 ] 

Julian Hyde commented on CALCITE-5740:
--------------------------------------

Your example makes things clearer. I see how the Aggregate allows you to deduce 
that no columns are used from the right side of the join. 

Your example is only valid if b.key is unique. Otherwise COUNT will return 
different results before and after the transformation. So, no columns being 
used is a necessary but not sufficient condition to apply the rule. You should 
describe what those conditions are (including which join types are allowed).
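
To make the uniqueness point concrete, here is a minimal sketch using Python's 
sqlite3 module rather than Calcite. The tables a and b and their contents are 
made up for illustration (they are not the tables from the example under 
discussion), and the semi-join is written as an EXISTS subquery.

```python
import sqlite3


def counts(b_keys):
    """Return (COUNT over inner join, COUNT over semi-join) for a hypothetical
    table a with keys 1, 2, 3 and a hypothetical table b holding b_keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (key INTEGER)")
    conn.execute("CREATE TABLE b (key INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO b VALUES (?)", [(k,) for k in b_keys])

    # Aggregate over an inner join: each matching row of b contributes a row.
    join_count = conn.execute(
        "SELECT COUNT(*) FROM a JOIN b ON a.key = b.key"
    ).fetchone()[0]

    # Aggregate over a semi-join: each row of a is counted at most once.
    semi_count = conn.execute(
        "SELECT COUNT(*) FROM a "
        "WHERE EXISTS (SELECT 1 FROM b WHERE b.key = a.key)"
    ).fetchone()[0]

    conn.close()
    return join_count, semi_count


print(counts([1, 2]))     # b.key unique: (2, 2)
print(counts([1, 1, 2]))  # b.key duplicated: (3, 2)
```

With a unique b.key the two counts agree. With a duplicated key the join 
counts the matching row of a twice, so rewriting to a semi-join would change 
the result of COUNT.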

If you require uniqueness on the right hand side then I’m not sure there’s a 
point converting the join to a semijoin. Or rather, I think there’s an existing 
rule that removes an unnecessary Aggregate if its input is already unique.

Can you come up with an example (perhaps using EMP and DEPT, because we know 
their PK and FK constraints) where this rule can achieve something that other 
rule(s) cannot?

> Support for AggToSemiJoinRule
> -----------------------------
>
>                 Key: CALCITE-5740
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5740
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Rong Rong
>            Priority: Major
>
> **Description**
> Currently we only have the JoinToSemiJoin and ProjectToSemiJoin rules, which 
> perform a check within the rule itself to see whether the project accesses 
> columns from the RHS result.
> This can be extended to Aggregate as well, experimental code: 
> https://github.com/walterddr/calcite/pull/1/files
> **Alternative**
> An alternative is to add a project/calc between the join and the aggregate to 
> activate the project-to-semi-join rule. Please share any other alternatives 
> I haven't considered. 
> thanks



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
