[jira] [Commented] (CALCITE-6087) EnumerableSortedAggregate returns incorrect result when input is empty

2023-11-08 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784096#comment-17784096
 ] 

Ruben Q L commented on CALCITE-6087:


I have the impression that both this ticket and CALCITE-6089 appear 
because we apply EnumerableSortedAggregate to an aggregation with an empty 
groupSet. 

I'm not sure whether the EnumerableSortedAggregate implementation is supposed 
to support this case, but even if it were, applying a sorted aggregate to an 
empty groupSet seems to defeat its purpose: there are no group keys on which 
the input could be sorted, so, if I am not mistaken, 
EnumerableSortedAggregate would "degenerate" into the standard 
EnumerableAggregate.
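
To illustrate the suspicion above, here is a minimal sketch of how a 
sort-based aggregate typically works. It is a toy model, not the actual 
EnumerableSortedAggregate code; the class SortedAggSketch and the method 
sortedCount are hypothetical names. A group is emitted only when the key 
changes, or when the input ends after at least one row, so an empty input 
produces no rows at all. That would match the symptom in this ticket if the 
real operator behaves similarly.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of sort-based aggregation; NOT Calcite's implementation. */
class SortedAggSketch {

  /**
   * Computes COUNT(*) per key over keys that are assumed to be sorted.
   * Returns key -> count in emission order.
   */
  static Map<Integer, Long> sortedCount(List<Integer> sortedKeys) {
    Map<Integer, Long> out = new LinkedHashMap<>();
    boolean seenAny = false;
    int current = 0;
    long count = 0;
    for (int key : sortedKeys) {
      if (seenAny && key != current) {
        out.put(current, count); // key changed: close the previous group
        count = 0;
      }
      current = key;
      seenAny = true;
      count++;
    }
    if (seenAny) {
      out.put(current, count); // close the final group
    }
    return out; // empty input: no group was ever opened, so nothing is emitted
  }

  public static void main(String[] args) {
    System.out.println(sortedCount(List.of(10, 10, 20))); // {10=2, 20=1}
    System.out.println(sortedCount(List.of()));           // {}
  }
}
```

With a non-empty groupSet, returning nothing for an empty input is correct. 
With an empty groupSet, a single row is still expected, which a loop of this 
shape never produces.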

Perhaps the simplest solution would be to skip this case in 
EnumerableSortedAggregateRule:
{code}
@Override public @Nullable RelNode convert(RelNode rel) {
  final Aggregate agg = (Aggregate) rel;
  if (!Aggregate.isSimple(agg)) {
    return null;
  }
  ...
=>
@Override public @Nullable RelNode convert(RelNode rel) {
  final Aggregate agg = (Aggregate) rel;
  if (!Aggregate.isSimple(agg) || agg.getGroupSet().isEmpty()) {  // <-- ***
    return null;
  }
  ...
{code}

[~hyuan], [~amaliujia], you worked on the original implementation of 
EnumerableSortedAggregate + EnumerableSortedAggregateRule; what do you think?

> EnumerableSortedAggregate returns incorrect result when input is empty
> --
>
> Key: CALCITE-6087
> URL: https://issues.apache.org/jira/browse/CALCITE-6087
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.35.0
> Reporter: Ruben Q L
> Priority: Major
>
> Performing a MAX on an empty table (or on a table where a filter leaves no 
> rows) must return NULL. We can verify this with our "standard" 
> {{EnumerableAggregate}} operator:
> {code:java}
>   @Test void enumerableAggregateOnEmptyInput() {
>     tester(false, new HrSchema())
>         .query("select max(deptno) as m from emps where deptno>100")
>         .explainContains(
>             "EnumerableAggregate(group=[{}], m=[MAX($1)])\n"
>             + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], "
>             + "expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
>             + "    EnumerableTableScan(table=[[s, emps]])")
>         .returnsOrdered("m=null");
>   }
> {code}
> However, if we use {{Hook.PLANNER}} to force the aggregation to be 
> implemented via {{EnumerableSortedAggregate}} on the same query:
> {code:java}
>   @Test void enumerableSortedAggregateOnEmptyInput() {
>     tester(false, new HrSchema())
>         .query("select max(deptno) as m from emps where deptno>100")
>         .withHook(Hook.PLANNER, (Consumer<RelOptPlanner>) planner -> {
>           // Force the sorted implementation of the aggregate
>           planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
>           planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
>         })
>         .explainContains(
>             "EnumerableSortedAggregate(group=[{}], m=[MAX($1)])\n"
>             + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], "
>             + "expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
>             + "    EnumerableTableScan(table=[[s, emps]])")
>         .returnsOrdered("m=null");
>   }
> {code}
> It fails because {{EnumerableSortedAggregate}} returns an empty result set 
> rather than a single row containing NULL:
> {noformat}
> java.lang.AssertionError: 
> Expected: "m=null"
>  but: was ""
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-6087) EnumerableSortedAggregate returns incorrect result when input is empty

2023-11-02 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782234#comment-17782234
 ] 

Ruben Q L commented on CALCITE-6087:


The same thing happens if we perform a COUNT on an empty input (which must 
return 0):
{code}
  @Test void enumerableAggregateOnEmptyInput() {
    tester(false, new HrSchema())
        .query("select count(1) as c from emps where deptno>100")
        .explainContains(
            "EnumerableAggregate(group=[{}], c=[COUNT()])\n"
            + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], "
            + "expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
            + "    EnumerableTableScan(table=[[s, emps]])")
        .returnsOrdered("c=0");  // ok
  }

  @Test void enumerableSortedAggregateOnEmptyInput() {
    tester(false, new HrSchema())
        .query("select count(1) as c from emps where deptno>100")
        .withHook(Hook.PLANNER, (Consumer<RelOptPlanner>) planner -> {
          // Force the sorted implementation of the aggregate
          planner.removeRule(EnumerableRules.ENUMERABLE_AGGREGATE_RULE);
          planner.addRule(EnumerableRules.ENUMERABLE_SORTED_AGGREGATE_RULE);
        })
        .explainContains(
            "EnumerableSortedAggregate(group=[{}], c=[COUNT()])\n"
            + "  EnumerableCalc(expr#0..4=[{inputs}], expr#5=[100], "
            + "expr#6=[>($t1, $t5)], proj#0..4=[{exprs}], $condition=[$t6])\n"
            + "    EnumerableTableScan(table=[[s, emps]])")
        .returnsOrdered("c=0");  // KO!!! Returns empty!!
  }
{code}

The second test (EnumerableSortedAggregate) will fail with:
{noformat}
java.lang.AssertionError: 
Expected: "c=0"
 but: was ""
{noformat}
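
For reference, here is a small sketch of the results that SQL requires from 
an aggregate without GROUP BY. It models the input as a plain Java list 
rather than using Calcite, and GlobalAggSketch and its methods are 
hypothetical names. Such an aggregate always yields exactly one row, even 
when the input is empty.

```java
import java.util.List;

/** Toy model of an aggregate with an empty groupSet; not Calcite code. */
class GlobalAggSketch {

  /** COUNT(*) over the whole input: 0 when the input is empty. */
  static long count(List<Integer> rows) {
    return rows.size();
  }

  /** MAX(x) over the whole input: null when the input is empty. */
  static Integer max(List<Integer> rows) {
    return rows.stream().max(Integer::compare).orElse(null);
  }

  public static void main(String[] args) {
    List<Integer> empty = List.of();
    System.out.println("c=" + count(empty)); // c=0
    System.out.println("m=" + max(empty));   // m=null
  }
}
```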
