Agreed, it should be in reverse order. Translate to semi-join (or
anti-join for NOT EXISTS), then optionally use a rule to rewrite semi-
or anti-join to Join+Aggregate.

Note that if the EXISTS is in a disjunction (e.g. "delete from orders
where exists (select null from order_items where ...) or shipping >
20") we cannot use a semi-join. We have to use a left join, using an
indicator column on the right-hand side that will be null iff there is
no match. Which is what we do currently.

On Wed, Feb 19, 2020 at 10:03 AM Haisheng Yuan <h.y...@alibaba-inc.com> wrote:
>
> Hi Christian,
>
> For the query in your example, Calcite first generates inner join plan with 
> aggregate child, then through SemJoinRule transform the inner join to semi or 
> antisemi join. The reason to have inner join is that it allows join 
> commutativity, which is good for generating a potential better plan with 
> nestedloop join or hash join.
>
> Admittedly, this process in Calcite is counter intuitive. It should be in 
> reverse order, first generate a semi or anti-semi join, then generate an 
> inner/outer join.
>
> - Haisheng
>
> ------------------------------------------------------------------
> 发件人:Christian Beikov<christian.bei...@gmail.com>
> 日 期:2020年02月19日 21:12:13
> 收件人:<dev@calcite.apache.org>
> 主 题:Translation of SQL EXISTS
>
> Hello,
>
> I'm a bit confused about how the SQL EXISTS predicate is translated. I'd
> assume that an EXISTS is translated in relational algebra to a SEMI- and
> NOT EXISTS to an ANTI-join, but it's not.
>
> PlannerImpl p = new PlannerImpl(config);
> SqlNode sqlNode = p.parse("delete from _order o where exists (select 1
> from order_position p where o.id = p.order_id)");
> p.validate(sqlNode);
> RelRoot rel = p.rel(sqlNode);
> RelToSqlConverter sqlConverter = new RelToSqlConverter(dialect);
> SqlImplementor.Result result = sqlConverter.visitChild(0, rel.rel);
> sqlWriter.format(result.asStatement());
>
> Worse, when printing this, I only get DELETE FROM "public"."_order" i.e.
> the EXISTS part is not rendered. This is the plan I get.
>
> LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
> flattened=[true])
>    LogicalProject(inputs=[0])
>      LogicalProject(inputs=[0], exprs=[[CAST($1):BIGINT, CAST($2):BOOLEAN]])
>        LogicalJoin(condition=[=($0, $1)], joinType=[inner])
>          JdbcTableScan(table=[[adhoc, _order]])
>          LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
>            LogicalProject(exprs=[[$1, true]])
>              JdbcTableScan(table=[[adhoc, order_position]])
>
> I'd expect something along the lines of
>
> LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
> flattened=[true])
>    LogicalProject(inputs=[0])
>        LogicalJoin(condition=[=($0, $1)], joinType=[semi])
>          JdbcTableScan(table=[[adhoc, _order]])
>          JdbcTableScan(table=[[adhoc, order_position]])
>
> and for NOT EXISTS
>
> LogicalTableModify(table=[[adhoc, _order]], operation=[DELETE],
> flattened=[true])
>    LogicalProject(inputs=[0])
>        LogicalJoin(condition=[=($0, $1)], joinType=[anti])
>          JdbcTableScan(table=[[adhoc, _order]])
>          JdbcTableScan(table=[[adhoc, order_position]])
>
> Am I missing something and the current aggregate function translation
> makes sense?
>
> I constructed relational algebra structures for some other statements
> with SEMI- and ANTI-joins and already noticed that these join types
> weren't handled in
> org.apache.calcite.rel.rel2sql.RelToSqlConverter#visit(org.apache.calcite.rel.core.Join),
> which I fixed locally. Is the lack of a translation intentional?
>
> Is such a translation of SEMI- and ANTI-join to EXISTS and NOT EXISTS an
> over-simplification or would you say it's correct? As far as I
> understood from https://en.wikipedia.org/wiki/Relational_algebra this is
> correct.
>
> I'd be happy to contribute that back. I didn't look into the Sql-to-Rel
> translation for EXISTS and NOT EXISTS to SEMI- and ANTI-join yet, but I
> assume that's not that hard and I could add that.
>
> Regards,
>
> Christian
>
>

Reply via email to