[ https://issues.apache.org/jira/browse/CALCITE-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758383#comment-17758383 ]
Zoltan Haindrich commented on CALCITE-5953: ------------------------------------------- I would also find it more natural if SUM would follow the mathematical definition (so that an empty sum is 0) I believe when `SUM` was placed next to the other aggregates like `MIN` they have made some generalizations. I was thinking about possible alternatives to fixing this: * compute it from multiple aggs {code:sql} consider: select SUM(CASE WHEN x=1 THEN 2 ELSE 0 END from t; select CASE WHEN s>0 THEN 2*s ELSE NULLIF(cnt,0) END from (select count(1) filter (where x=1) as s,count(1) as cnt from t) tt; or with ANY_VALUE: select CASE WHEN s>0 THEN 2*s ELSE a END from (select count(1) filter (where x=1) as s,any_value(0) as a from t) tt; {code} ** so we are exchanging 1 agg with a complex CASE into 2 simpler aggs + a CASE on top of it ** pro of this approach: *** it could be extended very easily to also cover the case when ELSE is not 0 *** possibly handle more branches for the same variable - at the cost of 1 more aggregation *** it could possibly extended to rewrite multi branch cases - but every branch will cost 1 more aggregation * consider only `SUM0` to optimize ** ProjectAggregateMergeRule could prepare SUM0 for this ** this fixes all above issues *** but also disables it from the more natural usages I'm still thinking about alternatives - let me know what you think about the above ; or if you have any other ideas! > AggregateCaseToFilterRule may make inaccurate SUM transformation > ---------------------------------------------------------------- > > Key: CALCITE-5953 > URL: https://issues.apache.org/jira/browse/CALCITE-5953 > Project: Calcite > Issue Type: Bug > Reporter: Zoltan Haindrich > Assignee: Zoltan Haindrich > Priority: Major > > consider: {{sum(case when x = 1 then 2 else 0 end) as b}} > notice that this expression may only be null if there are no rows in the table > {{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(1) > filter (where x=2)}} which broadens when it could be `null` to when there are > no matches to the filter > * *A* is *0* correctly in this case; but I think it will be still *0* in case > there are 0 input rows > * The result for *B* supposed to be *0* but since there are no matches by the > filter it becomes *null* > * *C* is not touched > {code} > # Convert CASE to FILTER without matches > select sum(case when x = 1 then 1 else 0 end) as a, > sum(case when x = 1 then 2 else 0 end) as b, > sum(case when x = 1 then 3 else -1 end) as c > from (values 0, null, 0, 2) as t(x); > +---+---+----+ > | A | B | C | > +---+---+----+ > | 0 | 0 | -4 | > +---+---+----+ > (1 row) > !ok > EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], > B=[$t1], C=[$t2]) > EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER > $3], C=[SUM($0)]) > EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], > expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], > expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6]) > EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]]) > !plan > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)