[ 
https://issues.apache.org/jira/browse/CALCITE-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-938.
---------------------------------
    Resolution: Fixed

Fixed in 
http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/52b06213. 
Thanks for the patch, [~maryannxue]!

> More accurate rowCount for Aggregate applied to already unique keys
> -------------------------------------------------------------------
>
>                 Key: CALCITE-938
>                 URL: https://issues.apache.org/jira/browse/CALCITE-938
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Maryann Xue
>            Assignee: Maryann Xue
>            Priority: Minor
>             Fix For: 1.5.0
>
>         Attachments: CALCITE-938.patch
>
>
> If the columns in a "select distinct" are already unique, there can be two 
> equivalent rels: one before and one after AggregateRemoveRule is applied.
> {code}
> agg
>  |                  input
> input
> 10.0                100.0
> {code}
> Based on the default implementation of rel metadata, the rowCount of the 
> "before" rel is only 1/10 of that of the "after" rel, but meanwhile the 
> "after" rel is definitely cheaper. So the Volcano planner would most likely 
> either fail to pick the cheapest one or have an inconsistent state due to 
> CALCITE-830.
> An example (based on the EnumerableRel cost model):
> The plan for
> {code}
> select empno, d.deptno
> from "scott".emp
> join (select distinct deptno from "scott".dept) d
> using (deptno);
> {code}
> would be
> {code}
> EnumerableCalc(expr#0..2=[{inputs}], EMPNO=[$t1], DEPTNO=[$t0])
>   EnumerableJoin(condition=[=($0, $2)], joinType=[inner])
>     EnumerableAggregate(group=[$0])
>       EnumerableTableScan(table=[[scott, DEPT]])
>     EnumerableCalc(expr#0..7=[{inputs}], EMPNO=[$t0], DEPTNO=[$t7])
>       EnumerableTableScan(table=[[scott, EMP]])
> {code}
> while it should be
> {code}
> EnumerableCalc(expr#0..2=[{inputs}], EMPNO=[$t1], DEPTNO=[$t0])
>   EnumerableJoin(condition=[=($0, $2)], joinType=[inner])
>     EnumerableCalc(expr#0..2=[{inputs}], DEPTNO=[$t0])
>       EnumerableTableScan(table=[[scott, DEPT]])
>     EnumerableCalc(expr#0..7=[{inputs}], EMPNO=[$t0], DEPTNO=[$t7])
>       EnumerableTableScan(table=[[scott, EMP]])
> {code}
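
The row-count mismatch described above can be sketched in a few lines. This is a minimal, self-contained illustration and not Calcite's actual metadata code: the class name RowCountSketch, the method estimateAggregateRowCount, and the groupKeysUnique flag are hypothetical, and the 1/10 factor is taken from the figures in the issue description.

```java
public class RowCountSketch {
    /**
     * Hypothetical row-count estimate for an aggregate.
     * If the group keys are already unique in the input, the aggregate
     * cannot merge any rows, so the estimate equals the input row count.
     * Otherwise fall back to the assumed default of 1/10 of the input.
     */
    static double estimateAggregateRowCount(double inputRowCount,
                                            boolean groupKeysUnique) {
        if (groupKeysUnique) {
            return inputRowCount;
        }
        return inputRowCount / 10.0;
    }

    public static void main(String[] args) {
        double input = 100.0;
        // Default estimate: the "before" rel looks 10x smaller than the "after" rel.
        System.out.println(estimateAggregateRowCount(input, false));
        // Uniqueness-aware estimate: both equivalent rels report the same row count.
        System.out.println(estimateAggregateRowCount(input, true));
    }
}
```

With the uniqueness-aware estimate, the rel with the aggregate and the rel without it report the same row count, so the comparison between them comes down to the extra cost of the aggregate itself.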



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
