[ https://issues.apache.org/jira/browse/CALCITE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807728#comment-16807728 ]
Ruben Quesada Lopez commented on CALCITE-2967: ---------------------------------------------- In my case, the SemiJoin node can be created from a Join via SemiJoinRule, or directly by using RelBuilder#semiJoin method. > New rule to remove SemiJoin > --------------------------- > > Key: CALCITE-2967 > URL: https://issues.apache.org/jira/browse/CALCITE-2967 > Project: Calcite > Issue Type: New Feature > Reporter: Ruben Quesada Lopez > Assignee: Ruben Quesada Lopez > Priority: Major > > (As far as I know, there is no rule to achieve what I am about to describe, > if there exists already a way to do it, please let me know). > In some specific situations, a SemiJoin can be completely removed and > replaced by its left child with an appropriate transformed filter. > Let us say we want to retrieve all employees whose department satisfy a > certain condition, i.e.: > {code} > SELECT * from Employee e > WHERE e.deptno IN > (SELECT d.deptno FROM Department d > WHERE <condition>) > {code} > Which would translate to something like: > {code} > SemiJoin (e.deptno=d.deptno) > Scan (table=Employee as e) > Filter(<condition>) > Scan (table=Department as d) > {code} > In a "normal" scenario, e.g. "all employees from Sales department", the plan > could no be simplified: > {code} > SemiJoin (e.deptno=d.deptno) > Scan (table=Employee as e) > Filter(d.name="Sales") > Scan (table=Department as d) > {code} > But with a specific condition, based on deptno, e.g. "all employees whose > deptno is greater than 10": > {code} > SemiJoin (e.deptno=d.deptno) > Scan (table=Employee as e) > Filter(d.deptno>10) > Scan (table=Department as d) > {code} > The plan could be simplified: the SemiJoin is not actually needed, we can > perform the query with a single scan and a converted filter: > {code} > Filter(e.deptno>10) > Scan (table=Employee as e) > {code} > The goal would be to provide a new rule to achieve that (since there is > already a SemiJoinRemoveRule, we could name this new rule as e.g. > SemiJoinSimplifyRule?). > I know that, ideally, this rule should not be needed because the plan could > be directly written without the SemiJoin, but let's say that we are in a > situation were the plan is systematically generated with the same pattern, > and there is no way to know in advanced the filter condition that will be > used within. -- This message was sent by Atlassian JIRA (v7.6.3#76005)