[jira] [Commented] (CALCITE-6087) EnumerableSortedAggregate returns incorrect result when input is empty
[ https://issues.apache.org/jira/browse/CALCITE-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17784096#comment-17784096 ]

Ruben Q L commented on CALCITE-6087:

I have the impression that both the current ticket and CALCITE-6089 appear because we try to apply EnumerableSortedAggregate on an aggregation with an empty groupSet.

I'm not sure whether the EnumerableSortedAggregate implementation is supposed to support this case, but even if it were, it seems to me that applying a sorted aggregate on an empty groupSet defeats its purpose: there will be no sorted input for the aggregation, so, if I am not mistaken, EnumerableSortedAggregate's behavior would "degenerate" into that of the standard EnumerableAggregate.

Perhaps the simplest solution would be to avoid this case in EnumerableSortedAggregateRule:
{code}
@Override public @Nullable RelNode convert(RelNode rel) {
  final Aggregate agg = (Aggregate) rel;
  if (!Aggregate.isSimple(agg)) {
    return null;
  }
  ...
=>
@Override public @Nullable RelNode convert(RelNode rel) {
  final Aggregate agg = (Aggregate) rel;
  if (!Aggregate.isSimple(agg) || agg.getGroupSet().isEmpty()) { // <-- ***
    return null;
  }
  ...
{code}

[~hyuan], [~amaliujia], you worked on the original implementation of EnumerableSortedAggregate + EnumerableSortedAggregateRule, what do you think?
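To make the reasoning above concrete, here is a minimal sketch of sort-based aggregation in plain Java. It is a simplified model written for illustration, not Calcite's actual EnumerableSortedAggregate code; the class and method names (SortedAggregateSketch, sortedMax, sortedAggregateApplies) are hypothetical. It shows one way a streaming operator that emits a row only when a group closes can end up producing no row at all for empty input, and how a check on the number of grouping keys sidesteps that case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

final class SortedAggregateSketch {
  /**
   * Hypothetical streaming MAX: rows are {key, value} pairs assumed to be
   * sorted by key. A result row is emitted only when a group closes.
   */
  static List<Integer[]> sortedMax(List<Integer[]> rows) {
    List<Integer[]> out = new ArrayList<>();
    boolean groupOpen = false;
    Integer key = null;
    Integer max = null;
    for (Integer[] row : rows) {
      if (groupOpen && !Objects.equals(key, row[0])) {
        out.add(new Integer[] {key, max}); // key changed: close previous group
        groupOpen = false;
      }
      if (!groupOpen) {
        key = row[0];
        max = row[1];
        groupOpen = true;
      } else {
        max = Math.max(max, row[1]);
      }
    }
    if (groupOpen) {
      out.add(new Integer[] {key, max}); // close the last group
    }
    return out; // empty input: no group was ever opened, so no rows
  }

  /** Models the proposed guard: only use the sorted variant with grouping keys. */
  static boolean sortedAggregateApplies(int groupKeyCount) {
    return groupKeyCount > 0;
  }

  public static void main(String[] args) {
    List<Integer[]> rows = Arrays.asList(
        new Integer[] {10, 1}, new Integer[] {10, 5}, new Integer[] {20, 3});
    for (Integer[] r : sortedMax(rows)) {
      System.out.println(Arrays.toString(r)); // one line per group
    }
    // Number of result rows for empty input.
    System.out.println(sortedMax(Collections.<Integer[]>emptyList()).size());
    // Whether the sorted variant would be chosen for an empty groupSet.
    System.out.println(sortedAggregateApplies(0));
  }
}
```

In this model the empty-input case yields zero rows, whereas an aggregation without grouping keys is expected to yield exactly one row; returning false from the guard leaves such aggregations to the standard operator.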
> EnumerableSortedAggregate returns incorrect result when input is empty
> --
>
> Key: CALCITE-6087
> URL: https://issues.apache.org/jira/browse/CALCITE-6087
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.35.0
> Reporter: Ruben Q L
> Priority: Major
>
> Performing a MAX on an empty table (or on a table where we apply a filter
> which does not return any value) should return NULL. We can verify this with
> our "standard" {{EnumerableAggregate}} operator:
> {code:java}
> @Test void enumerableAggregateOnEmptyInput() {
>   tester(false, new HrSchema())
>       .query("select max(deptno) as m from emps where deptno>100")
>       .explainContains(
>           "EnumerableAggregate(group=[{}], m=[MAX($1)])\n"
>           + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
>           + "    EnumerableTableScan(table=[[s, emps]])")
>       .returnsOrdered("m=null");
> }
> {code}
> However, if we use {{Hook.PLANNER}} to force the aggregation to be
> implemented via {{EnumerableSortedAggregate}} on the same query:
> {code:java}
> @Test void enumerableSortedAggregateOnEmptyInput() {
>   tester(false, new HrSchema())
>       .query("select max(deptno) as m from emps where deptno>100")
>       .withHook(Hook.PLANNER, (Consumer<RelOptPlanner>) planner -> {
>         planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
>         planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
>       })
>       .explainContains(
>           "EnumerableSortedAggregate(group=[{}], m=[MAX($1)])\n"
>           + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
>           + "    EnumerableTableScan(table=[[s, emps]])")
>       .returnsOrdered("m=null");
> }
> {code}
> It fails, because the {{EnumerableSortedAggregate}} returns an empty result
> set rather than NULL:
> {noformat}
> java.lang.AssertionError:
> Expected: "m=null"
>      but: was ""
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (CALCITE-6087) EnumerableSortedAggregate returns incorrect result when input is empty
[ https://issues.apache.org/jira/browse/CALCITE-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782234#comment-17782234 ]

Ruben Q L commented on CALCITE-6087:

The same thing happens if we perform a COUNT on an empty input (which should return 0):
{code}
@Test void enumerableAggregateOnEmptyInput() {
  tester(false, new HrSchema())
      .query("select count(1) as c from emps where deptno>100")
      .explainContains(
          "EnumerableAggregate(group=[{}], c=[COUNT()])\n"
          + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
          + "    EnumerableTableScan(table=[[s, emps]])")
      .returnsOrdered("c=0"); // ok
}

@Test void enumerableSortedAggregateOnEmptyInput() {
  tester(false, new HrSchema())
      .query("select count(1) as c from emps where deptno>100")
      .withHook(Hook.PLANNER, (Consumer<RelOptPlanner>) planner -> {
        planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
        planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
      })
      .explainContains(
          "EnumerableSortedAggregate(group=[{}], c=[COUNT()])\n"
          + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
          + "    EnumerableTableScan(table=[[s, emps]])")
      .returnsOrdered("c=0"); // KO! Returns an empty result
}
{code}
The second test (EnumerableSortedAggregate) will fail with:
{noformat}
java.lang.AssertionError:
Expected: "c=0"
     but: was ""
{noformat}
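As a rough illustration of the expected results discussed in this thread (one row holding 0 for COUNT and NULL for MAX when no input rows survive the filter), the following plain-Java sketch mimics an aggregation without grouping keys. It does not use Calcite at all; the class name, the helper methods and the sample deptno values are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class EmptyGlobalAggregateSketch {
  /** Hypothetical stand-in for "select deptno from emps where deptno > 100". */
  static List<Integer> filtered(List<Integer> deptnos) {
    return deptnos.stream().filter(d -> d > 100).collect(Collectors.toList());
  }

  /** COUNT without grouping keys: always one value, 0 for empty input. */
  static long count(List<Integer> values) {
    return values.size();
  }

  /** MAX without grouping keys: always one value, null for empty input. */
  static Integer max(List<Integer> values) {
    return values.stream().max(Integer::compare).orElse(null);
  }

  public static void main(String[] args) {
    // Sample department numbers (assumed data); none is greater than 100.
    List<Integer> rows = filtered(Arrays.asList(10, 20, 10, 10));
    System.out.println("c=" + count(rows)); // prints "c=0"
    System.out.println("m=" + max(rows));   // prints "m=null"
  }
}
```

Both helpers return a value even when the list is empty, which mirrors the single result row that the tests above expect.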