[
https://issues.apache.org/jira/browse/CALCITE-714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian Hyde updated CALCITE-714:
--------------------------------
Fix Version/s: next
> Decorrelation does not push join condition into subquery
> --------------------------------------------------------
>
> Key: CALCITE-714
> URL: https://issues.apache.org/jira/browse/CALCITE-714
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.1.0-incubating
> Reporter: Aman Sinha
> Assignee: Aman Sinha
> Fix For: next
>
>
> When decorrelating a scalar aggregate subquery, we push copies of the tables
> from the outer query block into the subquery but don't push the join
> condition, thereby creating a cartesian join. This seems to be a regression.
> Query:
> {code}
> select count(*)
> from dfs.`/Users/asinha/data/tpch-sf1/nation` n,
> dfs.`/Users/asinha/data/tpch-sf1/orders` o
> where n.n_nationkey = o.o_orderkey
> and n.n_nationkey > (select avg(ps.ps_suppkey) from
> dfs.`/Users/asinha/data/tpch-sf1/partsupp` ps
> where n.n_regionkey = ps.ps_partkey);
> {code}
> Here's the plan on an earlier version of Calcite (I am not sure of the
> version number but it was before the decorrelation refactoring), Note the
> join between nation and orders at the leaf level has a equi-join condition.
> {code}
> AggregateRel(group=[{}], EXPR$0=[COUNT()])
> ProjectRel($f0=[$0])
> ProjectRel($f0=[0])
> FilterRel(condition=[>($1, $5)])
> ProjectRel(*=[$0], n_nationkey=[$1], n_regionkey=[$2], *0=[$3],
> o_orderkey=[$4], EXPR$0=[$6])
> JoinRel(condition=[=($2, $5)], joinType=[left])
> JoinRel(condition=[=($1, $4)], joinType=[inner])
> EnumerableTableAccessRel(table=[[dfs,
> /Users/asinha/data/tpch-sf1/nation]])
> EnumerableTableAccessRel(table=[[dfs,
> /Users/asinha/data/tpch-sf1/orders]])
> AggregateRel(group=[{0}], EXPR$0=[AVG($1)])
> ProjectRel($f0=[$1], ps_suppkey=[$0])
> ProjectRel(ps_suppkey=[$2], $f0=[$3])
> FilterRel(condition=[=($3, $1)])
> JoinRel(condition=[true], joinType=[inner])
> EnumerableTableAccessRel(table=[[dfs,
> /Users/asinha/data/tpch-sf1/partsupp]])
> AggregateRel(group=[{0}])
> ProjectRel($f0=[$2])
> JoinRel(condition=[=($1, $4)], joinType=[inner])
> EnumerableTableAccessRel(table=[[dfs,
> /Users/asinha/data/tpch-sf1/nation]])
> EnumerableTableAccessRel(table=[[dfs,
> /Users/asinha/data/tpch-sf1/orders]])
> {code}
> Here's the new plan (I am on version 1.1.0 but I think the plan has not
> changed in the latest version). Note the join between nation and orders at
> the leaf level does not have any join condition.
> {code}
> LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
> LogicalProject($f0=[$0])
> LogicalProject($f0=[0])
> LogicalFilter(condition=[AND(=($1, $4), >($1, $5))])
> LogicalProject(*=[$0], n_nationkey=[$1], n_regionkey=[$2], *0=[$3],
> o_orderkey=[$4], EXPR$0=[$6])
> LogicalJoin(condition=[=($2, $5)], joinType=[left])
> LogicalJoin(condition=[true], joinType=[inner])
> EnumerableTableScan(table=[[dfs,
> /Users/asinha/data/tpch-sf1/nation]])
> EnumerableTableScan(table=[[dfs,
> /Users/asinha/data/tpch-sf1/orders]])
> LogicalAggregate(group=[{0}], EXPR$0=[AVG($1)])
> LogicalProject(n_regionkey=[$1], ps_suppkey=[$0])
> LogicalProject(ps_suppkey=[$2], n_regionkey=[$3])
> LogicalFilter(condition=[=($3, $1)])
> LogicalJoin(condition=[true], joinType=[inner])
> EnumerableTableScan(table=[[dfs,
> /Users/asinha/data/tpch-sf1/partsupp]])
> LogicalAggregate(group=[{0}])
> LogicalProject(n_regionkey=[$2])
> LogicalJoin(condition=[true], joinType=[inner])
> EnumerableTableScan(table=[[dfs,
> /Users/asinha/data/tpch-sf1/nation]])
> EnumerableTableScan(table=[[dfs,
> /Users/asinha/data/tpch-sf1/orders]])
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)