[
https://issues.apache.org/jira/browse/CALCITE-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian Hyde resolved CALCITE-938.
---------------------------------
Resolution: Fixed
Fixed in
http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/52b06213.
Thanks for the patch, [~maryannxue]!
> More accurate rowCount for Aggregate applied to already unique keys
> -------------------------------------------------------------------
>
> Key: CALCITE-938
> URL: https://issues.apache.org/jira/browse/CALCITE-938
> Project: Calcite
> Issue Type: Improvement
> Reporter: Maryann Xue
> Assignee: Maryann Xue
> Priority: Minor
> Fix For: 1.5.0
>
> Attachments: CALCITE-938.patch
>
>
> If the columns in a "select distinct" are already distinct, there can be two
> equivalent rels, one before and one after AggregateRemoveRule is applied.
> {code}
>   agg
>    |          input
>  input
>  10.0         100.0
> {code}
> Based on the default implementation of rel metadata, the rowCount of the
> "before" rel is only 1/10 of that of the "after" rel, but meanwhile the
> "after" rel is definitely cheaper. So the Volcano planner would most likely
> either fail to pick the cheapest one or have an inconsistent state due to
> CALCITE-830.
> An example (based on the EnumerableRel cost model):
> The plan for
> {code}
> select empno, d.deptno
> from "scott".emp
> join (select distinct deptno from "scott".dept) d
> using (deptno);
> {code}
> would be
> {code}
> EnumerableCalc(expr#0..2=[{inputs}], EMPNO=[$t1], DEPTNO=[$t0])
>   EnumerableJoin(condition=[=($0, $2)], joinType=[inner])
>     EnumerableAggregate(group=[$0])
>       EnumerableTableScan(table=[[scott, DEPT]])
>     EnumerableCalc(expr#0..7=[{inputs}], EMPNO=[$t0], DEPTNO=[$t7])
>       EnumerableTableScan(table=[[scott, EMP]])
> {code}
> while it should be
> {code}
> EnumerableCalc(expr#0..2=[{inputs}], EMPNO=[$t1], DEPTNO=[$t0])
>   EnumerableJoin(condition=[=($0, $2)], joinType=[inner])
>     EnumerableCalc(expr#0..2=[{inputs}], DEPTNO=[$t0])
>       EnumerableTableScan(table=[[scott, DEPT]])
>     EnumerableCalc(expr#0..7=[{inputs}], EMPNO=[$t0], DEPTNO=[$t7])
>       EnumerableTableScan(table=[[scott, EMP]])
> {code}
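
The rowCount mismatch described in the quoted issue can be illustrated with a small standalone sketch. This is not the code from the patch and it does not use Calcite's API; the class and method names (RowCountSketch, defaultAggregateRowCount, uniqueAwareAggregateRowCount) are hypothetical, and the 1/10 factor is simply the ratio the description attributes to the default metadata implementation.

```java
public class RowCountSketch {

    // Assumption taken from the issue text: the default estimate for an
    // Aggregate is 1/10 of its input's row count.
    static double defaultAggregateRowCount(double inputRowCount) {
        return inputRowCount / 10.0;
    }

    // Hypothetical refinement: if the group keys are already unique in the
    // input, grouping cannot merge any rows, so the Aggregate emits exactly
    // as many rows as it receives.
    static double uniqueAwareAggregateRowCount(double inputRowCount,
                                               boolean groupKeysUnique) {
        return groupKeysUnique
            ? inputRowCount
            : defaultAggregateRowCount(inputRowCount);
    }

    public static void main(String[] args) {
        double input = 100.0;

        // "before" rel (Aggregate over input) vs. "after" rel (input alone)
        double before = defaultAggregateRowCount(input);
        double after = input;
        System.out.println("default:      before=" + before + " after=" + after);

        // With the uniqueness check, both equivalent rels report the same count.
        double refined = uniqueAwareAggregateRowCount(input, true);
        System.out.println("unique-aware: before=" + refined + " after=" + after);
    }
}
```

With the default estimate the two equivalent rels disagree on rowCount; once key uniqueness is taken into account they agree, and the cost comparison between them depends only on the work each one does.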