[jira] [Commented] (CALCITE-2130) Converting subquery to join is not always giving equivalent behavior

2018-05-22 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484316#comment-16484316
 ] 

Julian Hyde commented on CALCITE-2130:
--

[~atris] That would be great.

> Converting subquery to join is not always giving equivalent behavior
> 
>
> Key: CALCITE-2130
> URL: https://issues.apache.org/jira/browse/CALCITE-2130
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Paul Jackson
>Assignee: Julian Hyde
>Priority: Major
>
> Starting with this DDL:
> {code:SQL}
> create table friends (
>   id int not null,
>   friend_id int,
>   primary key (id),
>   foreign key (friend_id) references friends (id)
> );
> insert into friends values
> (210, null),
> (209, 210),
> (202, 209),
> (208, 202),
> (207, 209),
> (203, 207),
> (201, null),
> (204, null),
> (205, null),
> (206, 209);
> {code}
> This query looks up a row by id and performs another lookup for id by the 
> friend id:
> {code:SQL}
> SELECT id,
>  (SELECT id
>   FROM friends as f2
>   WHERE f2.friend_id = f1.friend_id) AS friend
> FROM friends as f1
> WHERE id = '203'
> idfriend_id   friend
> 203   207 203
> {code}
> This query only returns a result for rows with unique values for friend_id. 
> Replacing 203 with 202 leads to this error (MySql): {{Error Code: 1242. 
> Subquery returns more than 1 row}}
> I converted this query to a RelNode and then converted it back to SQL using 
> the MySQL dialect:
> {code:Java}
> Planner aPlanner = Frameworks.getPlanner(aConfig);
> SqlNode aQuery = aPlanner.parse(theSql);
> aQuery = aPlanner.validate(aQuery);
> RelNode aRelNode = aPlanner.rel(aQuery).project();
> RelToSqlConverter aSqlConverter = new RelToSqlConverter(aSqlDialect);
> SqlNode aSqlNode = aSqlConverter.visitChild(0, aRelNode).asStatement();
> {code}
> This gives the following plan and SQL:
> {code}
> LogicalProject(id=[$0], friend_id=[$1], friend=[$2])
>   LogicalProject(id=[$0], friend_id=[$1], $f0=[$3])
> LogicalJoin(condition=[=($1, $2)], joinType=[left])
>   LogicalFilter(condition=[=($0, '203')])
> JdbcTableScan(table=[[stardog, friends]])
>   LogicalAggregate(group=[{0}], agg#0=[SINGLE_VALUE($1)])
> LogicalProject(friend_id=[$1], id=[$0])
>   LogicalProject(id=[$0], friend_id=[$1])
> LogicalFilter(condition=[IS NOT NULL($1)])
>   JdbcTableScan(table=[[stardog, friends]])
> SELECT `t`.`id`, `t`.`friend_id`, `t2`.`$f1` AS `$f0`
> FROM (SELECT *
> FROM `stardog`.`friends`
> WHERE `id` = '203') AS `t`
> LEFT JOIN (SELECT `friend_id`, CASE COUNT(`id`) WHEN 0 THEN NULL WHEN 1 THEN 
> `id` ELSE (SELECT NULL
> UNION ALL
> SELECT NULL) END AS `$f1`
> FROM `stardog`.`friends`
> WHERE `friend_id` IS NOT NULL
> GROUP BY `friend_id`) AS `t2` ON `t`.`friend_id` = `t2`.`friend_id`
> {code}
> The MySQL implementation for SINGLE_VALUE is the CASE clause that causes an 
> 1242 error when the id count is greater than 1 by invoking a UNION ALL on two 
> NULL rows. In theory, this should return the ID when it is a unique value and 
> throw an error when there are multiple. Instead, MySQL will return the 1242 
> error for all values of id, including 203.
> Note, the JOIN subquery works if you add a WHERE clause expression to 
> constrain the value of freind_id:
> {code:SQL}
> SELECT `friend_id`, CASE COUNT(`id`) WHEN 0 THEN NULL WHEN 1 THEN `id` ELSE 
> (SELECT NULL
> UNION ALL
> SELECT NULL) END AS `$f1`
> FROM `stardog`.`friends`
> WHERE `friend_id` IS NOT NULL AND `friend_id` = '207'
> GROUP BY `friend_id`
> {code}
> Substituting friend_id for 207 leads to the 1242 error, as intended.
> This JOIN works on some dialects, but I think it is because different 
> dialects can use different join implementations. If the join performs the 
> JOIN ON SELECT clause without adding a where clause expression to constrain 
> friend_id (collecting all rows with a non-null friend_id), it will encounter 
> this error. Implementations that use a join algorithm that does constrain the 
> friend_id to the desired value will experience the error only when there are 
> multiple rows with the same friend_id, which behaves like the original query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2130) Converting subquery to join is not always giving equivalent behavior

2018-05-21 Thread Atri Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482226#comment-16482226
 ] 

Atri Sharma commented on CALCITE-2130:
--

[~julianhyde] I will like to take a crack at this one, if that is fine

> Converting subquery to join is not always giving equivalent behavior
> 
>
> Key: CALCITE-2130
> URL: https://issues.apache.org/jira/browse/CALCITE-2130
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Paul Jackson
>Assignee: Julian Hyde
>Priority: Major
>
> Starting with this DDL:
> {code:SQL}
> create table friends (
>   id int not null,
>   friend_id int,
>   primary key (id),
>   foreign key (friend_id) references friends (id)
> );
> insert into friends values
> (210, null),
> (209, 210),
> (202, 209),
> (208, 202),
> (207, 209),
> (203, 207),
> (201, null),
> (204, null),
> (205, null),
> (206, 209);
> {code}
> This query looks up a row by id and performs another lookup for id by the 
> friend id:
> {code:SQL}
> SELECT id,
>  (SELECT id
>   FROM friends as f2
>   WHERE f2.friend_id = f1.friend_id) AS friend
> FROM friends as f1
> WHERE id = '203'
> idfriend_id   friend
> 203   207 203
> {code}
> This query only returns a result for rows with unique values for friend_id. 
> Replacing 203 with 202 leads to this error (MySql): {{Error Code: 1242. 
> Subquery returns more than 1 row}}
> I converted this query to a RelNode and then converted it back to SQL using 
> the MySQL dialect:
> {code:Java}
> Planner aPlanner = Frameworks.getPlanner(aConfig);
> SqlNode aQuery = aPlanner.parse(theSql);
> aQuery = aPlanner.validate(aQuery);
> RelNode aRelNode = aPlanner.rel(aQuery).project();
> RelToSqlConverter aSqlConverter = new RelToSqlConverter(aSqlDialect);
> SqlNode aSqlNode = aSqlConverter.visitChild(0, aRelNode).asStatement();
> {code}
> This gives the following plan and SQL:
> {code}
> LogicalProject(id=[$0], friend_id=[$1], friend=[$2])
>   LogicalProject(id=[$0], friend_id=[$1], $f0=[$3])
> LogicalJoin(condition=[=($1, $2)], joinType=[left])
>   LogicalFilter(condition=[=($0, '203')])
> JdbcTableScan(table=[[stardog, friends]])
>   LogicalAggregate(group=[{0}], agg#0=[SINGLE_VALUE($1)])
> LogicalProject(friend_id=[$1], id=[$0])
>   LogicalProject(id=[$0], friend_id=[$1])
> LogicalFilter(condition=[IS NOT NULL($1)])
>   JdbcTableScan(table=[[stardog, friends]])
> SELECT `t`.`id`, `t`.`friend_id`, `t2`.`$f1` AS `$f0`
> FROM (SELECT *
> FROM `stardog`.`friends`
> WHERE `id` = '203') AS `t`
> LEFT JOIN (SELECT `friend_id`, CASE COUNT(`id`) WHEN 0 THEN NULL WHEN 1 THEN 
> `id` ELSE (SELECT NULL
> UNION ALL
> SELECT NULL) END AS `$f1`
> FROM `stardog`.`friends`
> WHERE `friend_id` IS NOT NULL
> GROUP BY `friend_id`) AS `t2` ON `t`.`friend_id` = `t2`.`friend_id`
> {code}
> The MySQL implementation for SINGLE_VALUE is the CASE clause that causes an 
> 1242 error when the id count is greater than 1 by invoking a UNION ALL on two 
> NULL rows. In theory, this should return the ID when it is a unique value and 
> throw an error when there are multiple. Instead, MySQL will return the 1242 
> error for all values of id, including 203.
> Note, the JOIN subquery works if you add a WHERE clause expression to 
> constrain the value of freind_id:
> {code:SQL}
> SELECT `friend_id`, CASE COUNT(`id`) WHEN 0 THEN NULL WHEN 1 THEN `id` ELSE 
> (SELECT NULL
> UNION ALL
> SELECT NULL) END AS `$f1`
> FROM `stardog`.`friends`
> WHERE `friend_id` IS NOT NULL AND `friend_id` = '207'
> GROUP BY `friend_id`
> {code}
> Substituting friend_id for 207 leads to the 1242 error, as intended.
> This JOIN works on some dialects, but I think it is because different 
> dialects can use different join implementations. If the join performs the 
> JOIN ON SELECT clause without adding a where clause expression to constrain 
> friend_id (collecting all rows with a non-null friend_id), it will encounter 
> this error. Implementations that use a join algorithm that does constrain the 
> friend_id to the desired value will experience the error only when there are 
> multiple rows with the same friend_id, which behaves like the original query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2130) Converting subquery to join is not always giving equivalent behavior

2018-01-10 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320958#comment-16320958
 ] 

Julian Hyde commented on CALCITE-2130:
--

I see the problem. The decorrelated query is evaluating the sub-query for all 
values of {{id}}, not just '203', and one of those values gives an error.

> Converting subquery to join is not always giving equivalent behavior
> 
>
> Key: CALCITE-2130
> URL: https://issues.apache.org/jira/browse/CALCITE-2130
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0
>Reporter: Paul Jackson
>Assignee: Julian Hyde
>
> Starting with this DDL:
> {code:SQL}
> create table friends (
>   id int not null,
>   friend_id int,
>   primary key (id),
>   foreign key (friend_id) references friends (id)
> );
> insert into friends values
> (210, null),
> (209, 210),
> (202, 209),
> (208, 202),
> (207, 209),
> (203, 207),
> (201, null),
> (204, null),
> (205, null),
> (206, 209);
> {code}
> This query looks up a row by id and performs another lookup for id by the 
> friend id:
> {code:SQL}
> SELECT id,
>  (SELECT id
>   FROM friends as f2
>   WHERE f2.friend_id = f1.friend_id) AS friend
> FROM friends as f1
> WHERE id = '203'
> idfriend_id   friend
> 203   207 203
> {code}
> This query only returns a result for rows with unique values for friend_id. 
> Replacing 203 with 202 leads to this error (MySql): {{Error Code: 1242. 
> Subquery returns more than 1 row}}
> I converted this query to a RelNode and then converted it back to SQL using 
> the MySQL dialect:
> {code:Java}
> Planner aPlanner = Frameworks.getPlanner(aConfig);
> SqlNode aQuery = aPlanner.parse(theSql);
> aQuery = aPlanner.validate(aQuery);
> RelNode aRelNode = aPlanner.rel(aQuery).project();
> RelToSqlConverter aSqlConverter = new RelToSqlConverter(aSqlDialect);
> SqlNode aSqlNode = aSqlConverter.visitChild(0, aRelNode).asStatement();
> {code}
> This gives the following plan and SQL:
> {code}
> LogicalProject(id=[$0], friend_id=[$1], friend=[$2])
>   LogicalProject(id=[$0], friend_id=[$1], $f0=[$3])
> LogicalJoin(condition=[=($1, $2)], joinType=[left])
>   LogicalFilter(condition=[=($0, '203')])
> JdbcTableScan(table=[[stardog, friends]])
>   LogicalAggregate(group=[{0}], agg#0=[SINGLE_VALUE($1)])
> LogicalProject(friend_id=[$1], id=[$0])
>   LogicalProject(id=[$0], friend_id=[$1])
> LogicalFilter(condition=[IS NOT NULL($1)])
>   JdbcTableScan(table=[[stardog, friends]])
> SELECT `t`.`id`, `t`.`friend_id`, `t2`.`$f1` AS `$f0`
> FROM (SELECT *
> FROM `stardog`.`friends`
> WHERE `id` = '203') AS `t`
> LEFT JOIN (SELECT `friend_id`, CASE COUNT(`id`) WHEN 0 THEN NULL WHEN 1 THEN 
> `id` ELSE (SELECT NULL
> UNION ALL
> SELECT NULL) END AS `$f1`
> FROM `stardog`.`friends`
> WHERE `friend_id` IS NOT NULL
> GROUP BY `friend_id`) AS `t2` ON `t`.`friend_id` = `t2`.`friend_id`
> {code}
> The MySQL implementation for SINGLE_VALUE is the CASE clause that causes an 
> 1242 error when the id count is greater than 1 by invoking a UNION ALL on two 
> NULL rows. In theory, this should return the ID when it is a unique value and 
> throw an error when there are multiple. Instead, MySQL will return the 1242 
> error for all values of id, including 203.
> Note, the JOIN subquery works if you add a WHERE clause expression to 
> constrain the value of freind_id:
> {code:SQL}
> SELECT `friend_id`, CASE COUNT(`id`) WHEN 0 THEN NULL WHEN 1 THEN `id` ELSE 
> (SELECT NULL
> UNION ALL
> SELECT NULL) END AS `$f1`
> FROM `stardog`.`friends`
> WHERE `friend_id` IS NOT NULL AND `friend_id` = '207'
> GROUP BY `friend_id`
> {code}
> Substituting friend_id for 207 leads to the 1242 error, as intended.
> This JOIN works on some dialects, but I think it is because different 
> dialects can use different join implementations. If the join performs the 
> JOIN ON SELECT clause without adding a where clause expression to constrain 
> friend_id (collecting all rows with a non-null friend_id), it will encounter 
> this error. Implementations that use a join algorithm that does constrain the 
> friend_id to the desired value will experience the error only when there are 
> multiple rows with the same friend_id, which behaves like the original query.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)