[jira] [Commented] (CALCITE-3830) The ‘approximate’ field should be considered when computing the digest of AggregateCall

Shuo Cheng (Jira) Wed, 26 Feb 2020 21:48:17 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046184#comment-17046184
 ]


Shuo Cheng commented on CALCITE-3830:
-------------------------------------

[~danny0405], I've created a pr, please have a look.

> The ‘approximate’ field should be considered when computing the digest of 
> AggregateCall
> ---------------------------------------------------------------------------------------
>
>                 Key: CALCITE-3830
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3830
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.21.0
>            Reporter: Shuo Cheng
>            Assignee: Shuo Cheng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.22.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In planner optimization, the digest  of Aggregate node contains digest of its 
> AggregateCall, i.e. Aggregate.toString, but currently 'approximate' filed of 
> AggregateCall is not considered in toString() method, which may lead to the 
> situation two different relNodes are considered as identical in planner 
> optimizing phase. 
> Here is an example:
> {code:java}
> // SQL
> select * from (
>   select a, count(distinct b) from T group by a
>   union all
>   select a, approx_count_distinct(b) from T group by a
> )
> // after applying a rule, the plan is
> LogicalSink(name=[_DataStreamTable_1], fields=[a, EXPR$1], __id__=[96])
> +- LogicalProject(a=[$0], EXPR$1=[$1], __id__=[94])
>    +- LogicalUnion(all=[true], __id__=[92])
>       :- LogicalAggregate(group=[{0}], EXPR$1=[COUNT(DISTINCT $1)], 
> __id__=[89])
>       :  +- LogicalTableScan(table=[[default, _DataStreamTable_2]], 
> __id__=[100])
>       +- LogicalAggregate(group=[{0}], EXPR$1=[COUNT(DISTINCT $1)], 
> __id__=[89])
>          +- LogicalTableScan(table=[[default, _DataStreamTable_2]], 
> __id__=[100])
> {code}
> As showing in the example, after optimizing, these two Aggregates are 
> considered as identical (both with 89 as relNode ID).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (CALCITE-3830) The ‘approximate’ field should be considered when computing the digest of AggregateCall

Reply via email to