[ 
https://issues.apache.org/jira/browse/BEAM-8042?focusedWorklogId=375878&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-375878
 ]

ASF GitHub Bot logged work on BEAM-8042:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jan/20 21:05
            Start Date: 22/Jan/20 21:05
    Worklog Time Spent: 10m 
      Work Description: 11moon11 commented on pull request #10649: [BEAM-8042] 
[ZetaSQL] Fix aggregate column reference
URL: https://github.com/apache/beam/pull/10649#discussion_r369802518
 
 

 ##########
 File path: 
sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/translation/AggregateScanConverter.java
 ##########
 @@ -88,8 +88,13 @@ public RelNode convert(ResolvedAggregateScan zetaNode, 
List<RelNode> inputs) {
       // For aggregate calls, their input ref follow after GROUP BY input ref.
       int columnRefoff = groupFieldsListSize;
       for (ResolvedComputedColumn computedColumn : 
zetaNode.getAggregateList()) {
-        aggregateCalls.add(convertAggCall(computedColumn, columnRefoff));
-        columnRefoff++;
+        AggregateCall aggCall = convertAggCall(computedColumn, columnRefoff);
+        aggregateCalls.add(aggCall);
+        if (!aggCall.getArgList().isEmpty()) {
+          // Only increment column reference offset when aggregates use them 
(BEAM-8042).
+          // Ex: COUNT(*) does not have arguments, while COUNT(`field`) does.
+          columnRefoff++;
 
 Review comment:
   My understanding is that the boolean expression for `COUNTIF` aggregate will 
be computed in a precursory `Project`.
   As of right now list of supported aggregates is limited:
   
https://github.com/apache/beam/blob/659d84b4be5bdd36b408359a8c69f4eb12771180/sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/SqlStdOperatorMappingTable.java#L181-L189
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 375878)
    Time Spent: 2h 10m  (was: 2h)

> Parsing of aggregate query fails
> --------------------------------
>
>                 Key: BEAM-8042
>                 URL: https://issues.apache.org/jira/browse/BEAM-8042
>             Project: Beam
>          Issue Type: Sub-task
>          Components: dsl-sql-zetasql
>            Reporter: Rui Wang
>            Assignee: Kirill Kozlov
>            Priority: Critical
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> {code}
>   @Rule
>   public TestPipeline pipeline = 
> TestPipeline.fromOptions(createPipelineOptions());
>   private static PipelineOptions createPipelineOptions() {
>     BeamSqlPipelineOptions opts = 
> PipelineOptionsFactory.create().as(BeamSqlPipelineOptions.class);
>     opts.setPlannerName(ZetaSQLQueryPlanner.class.getName());
>     return opts;
>   }
>   @Test
>   public void testAggregate() {
>     Schema inputSchema = Schema.builder()
>         .addByteArrayField("id")
>         .addInt64Field("has_f1")
>         .addInt64Field("has_f2")
>         .addInt64Field("has_f3")
>         .addInt64Field("has_f4")
>         .addInt64Field("has_f5")
>         .addInt64Field("has_f6")
>         .build();
>     String sql = "SELECT \n" +
>         "  id, \n" +
>         "  COUNT(*) as count, \n" +
>         "  SUM(has_f1) as f1_count, \n" +
>         "  SUM(has_f2) as f2_count, \n" +
>         "  SUM(has_f3) as f3_count, \n" +
>         "  SUM(has_f4) as f4_count, \n" +
>         "  SUM(has_f5) as f5_count, \n" +
>         "  SUM(has_f6) as f6_count  \n" +
>         "FROM PCOLLECTION \n" +
>         "GROUP BY id";
>     pipeline
>         .apply(Create.empty(inputSchema))
>         .apply(SqlTransform.query(sql));
>     pipeline.run();
>   }
> {code}
> {code}
> Caused by: java.lang.RuntimeException: Error while applying rule 
> AggregateProjectMergeRule, args 
> [rel#553:LogicalAggregate.NONE(input=RelSubset#552,group={0},f1=COUNT(),f2=SUM($2),f3=SUM($3),f4=SUM($4),f5=SUM($5),f6=SUM($6),f7=SUM($7)),
>  
> rel#551:LogicalProject.NONE(input=RelSubset#550,key=$0,f1=$1,f2=$2,f3=$3,f4=$4,f5=$5,f6=$6)]
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:232)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:637)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:340)
>       at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.transform(ZetaSQLPlannerImpl.java:168)
>       at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.parseQuery(ZetaSQLQueryPlanner.java:99)
>       at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.parseQuery(ZetaSQLQueryPlanner.java:87)
>       at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel(ZetaSQLQueryPlanner.java:66)
>       at 
> org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery(BeamSqlEnv.java:104)
>       at 
>       ... 39 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
>       at 
> org.apache.beam.repackaged.sql.com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:58)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.rel.rules.AggregateProjectMergeRule.apply(AggregateProjectMergeRule.java:96)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.rel.rules.AggregateProjectMergeRule.onMatch(AggregateProjectMergeRule.java:73)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:205)
>       ... 48 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to