[ 
https://issues.apache.org/jira/browse/CALCITE-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Bertolacci closed CALCITE-4852.
-----------------------------------
    Resolution: Duplicate

> RelToSqlConverter creates unparseable SQL string from right associative 
> multi-way join.
> ---------------------------------------------------------------------------------------
>
>                 Key: CALCITE-4852
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4852
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Ian Bertolacci
>            Priority: Minor
>
> As discussed inĀ CALCITE-35, Calcite cannot parse parenthesized join 
> expressions (such as `select ... from A join (B join C)`).
> But as suggested in CALCITE-2152, those expressions can be converted into 
> parenthesized select on the join expression (`select ... from A join (select 
> ... from B join C)`).
> However, RelToSqlConverter will convert the RelNode representation of 
> right-associative joins into a parenthesized join expression, which is 
> unparseable.
> For example, this RelNode tree
> {code}
> LogicalProject(_T5_ID=[$2], C0_51=[$3], _T1_ID=[$1], C0_53=[$4], _T3_ID=[$0])
>   LogicalJoin(condition=[=($0, $4)], joinType=[inner])
>     LogicalProject(_T3_ID=[$0])
>       LogicalTableScan(table=[[QUERY, T3]], fields=[[0, 1, 2]])
>       // RHS child of right-associate join
>       LogicalJoin(condition=[=($0, $2)], joinType=[inner])
>         LogicalProject(_T1_ID=[$0])
>           LogicalTableScan(table=[[QUERY, T1]], fields=[[0, 1, 2]])
>         LogicalProject(_T5_ID=[$0], C0_51=[$1], C0_53=[$3])
>           LogicalTableScan(table=[[QUERY, T5]], fields=[[0, 1, 2, 3]])
> {code}
> creates the following unparseable SQL string
> {code}
> SELECT `t1`.`_t5_id`,
>        `t1`.`c0_51`,
>        `t0`.`_t1_id`,
>        `t1`.`c0_53`,
>        `t`.`_t3_id`
> FROM   (SELECT `id` AS `_T3_ID`
>         FROM   `query`.`t3`) AS `t`
>        INNER JOIN ((SELECT `id` AS `_T1_ID`
>                    FROM   `query`.`t1`) AS `t0`
>                    INNER JOIN (SELECT `id` AS `_T5_ID`,
>                                       `c0_51`,
>                                       `c0_53`
>                                FROM   `query`.`t5`) AS `t1`
>                            ON `t0`.`_t1_id` = `t1`.`c0_51`)
>                ON `t`.`_t3_id` = `t1`.`c0_53`
> {code}
> This is an issue, because it is very easy to make such trees, and ideally, 
> all SQL strings generated by Calcite would also be parseable by Calcite.
> To get around this, we found that the insertion of a projection node between 
> a join and it's RHS child (the problematic join) forces RelToSqlConverter to 
> create a parenthesized select statement, which *is* parseable, and (at least 
> for us) semantically identical.
> For example, this virtually identical tree:
> {code}
> LogicalProject(_T5_ID=[$2], C0_51=[$3], _T1_ID=[$1], C0_53=[$4], _T3_ID=[$0])
>   LogicalJoin(condition=[=($0, $4)], joinType=[inner])
>     LogicalProject(_T3_ID=[$0])
>       LogicalTableScan(table=[[QUERY, T3]], fields=[[0, 1, 2]])
>     // Inserted projection on top of RHS child
>     LogicalProject(_T1_ID=[$0], _T5_ID=[$1], C0_51=[$2], C0_53=[$3])
>       // RHS child of right-associate join
>       LogicalJoin(condition=[=($0, $2)], joinType=[inner])
>         LogicalProject(_T1_ID=[$0])
>           LogicalTableScan(table=[[QUERY, T1]], fields=[[0, 1, 2]])
>         LogicalProject(_T5_ID=[$0], C0_51=[$1], C0_53=[$3])
>           LogicalTableScan(table=[[QUERY, T5]], fields=[[0, 1, 2, 3]])
> {code}
> creates the following parseable SQL string
> {code}
> SELECT `t2`.`_t5_id`,
>        `t2`.`c0_51`,
>        `t2`.`_t1_id`,
>        `t2`.`c0_53`,
>        `t`.`_t3_id`
> FROM   (SELECT `id` AS `_T3_ID`
>         FROM   `query`.`t3`) AS `t`
>        INNER JOIN (SELECT *
>                    FROM   (SELECT `id` AS `_T1_ID`
>                            FROM   `query`.`t1`) AS `t0`
>                           INNER JOIN (SELECT `id` AS `_T5_ID`,
>                                              `c0_51`,
>                                              `c0_53`
>                                       FROM   `query`.`t5`) AS `t1`
>                                   ON `t0`.`_t1_id` = `t1`.`c0_51`) AS `t2`
>                ON `t`.`_t3_id` = `t2`.`c0_53`
> {code}
> We solved this our RelToSqlConverter extending class by cloning the parent 
> join and inserting a projection between it and an RHS join:
> {code}
> override def visit(join: Join): Result = {
>     if (join.getRight.isInstanceOf[Join]) {
>       super.visit(
>         join.copy(
>           join.getTraitSet,
>           join.getCondition,
>           join.getLeft,
>           new LogicalProject(
>             join.getCluster,
>             join.getTraitSet,
>             join.getHints,
>             join.getRight,
>             // Construct projection of all expressions from the rhs join
>             join.getRight.getRowType.getFieldList.asScala.map(field => new 
> RexInputRef(field.getIndex, field.getType)).asJava,
>             join.getRight.getRowType
>           ),
>           join.getJoinType,
>           join.isSemiJoinDone
>         )
>       )
>     } else {
>       super.visit(join)
>     }
>   }
> {code}
> (we tried to do it "correctly" by constructing a proper Result object, but 
> found it too difficult. Hopefully someone else can figure out how do it the 
> right way).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to