[ 
https://issues.apache.org/jira/browse/CALCITE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807684#comment-16807684
 ] 

Danny Chan commented on CALCITE-2967:
-------------------------------------

What i'm curious about is after decorrelation, an in clause will be re-written 
to a inner-join but not a semi-join,

so how was your SemiJoin node created ?

> New rule to remove SemiJoin
> ---------------------------
>
>                 Key: CALCITE-2967
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2967
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Ruben Quesada Lopez
>            Assignee: Ruben Quesada Lopez
>            Priority: Major
>
> (As far as I know, there is no rule to achieve what I am about to describe, 
> if there exists already a way to do it, please let me know).
> In some specific situations, a SemiJoin can be completely removed and 
> replaced by its left child with an appropriate transformed filter.
> Let us say we want to retrieve all employees whose department satisfy a 
> certain condition, i.e.:
> {code}
> SELECT * from Employee e
> WHERE e.deptno IN
>   (SELECT d.deptno FROM Department d
>     WHERE <condition>)
> {code}
> Which would translate to something like:
> {code}
> SemiJoin (e.deptno=d.deptno)
>     Scan (table=Employee as e)
>     Filter(<condition>)
>         Scan (table=Department as d)
> {code}
> In a "normal" scenario, e.g. "all employees from Sales department", the plan 
> could no be simplified:
> {code}
> SemiJoin (e.deptno=d.deptno)
>     Scan (table=Employee as e)
>     Filter(d.name="Sales")
>         Scan (table=Department as d)
> {code}
> But with a specific condition, based on deptno, e.g. "all employees whose 
> deptno is greater than 10":
> {code}
> SemiJoin (e.deptno=d.deptno)
>     Scan (table=Employee as e)
>     Filter(d.deptno>10)
>         Scan (table=Department as d)
> {code}
> The plan could be simplified: the SemiJoin is not actually needed, we can 
> perform the query with a single scan and a converted filter:
> {code}
> Filter(e.deptno>10)
>     Scan (table=Employee as e)
> {code}
> The goal would be to provide a new rule to achieve that (since there is 
> already a SemiJoinRemoveRule, we could name this new rule as e.g. 
> SemiJoinSimplifyRule?).
> I know that, ideally, this rule should not be needed because the plan could 
> be directly written without the SemiJoin, but let's say that we are in a 
> situation were the plan is systematically generated with the same pattern, 
> and there is no way to know in advanced the filter condition that will be 
> used within.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to