[jira] [Work logged] (BEAM-8042) Parsing of aggregate query fails

ASF GitHub Bot (Jira) Wed, 22 Jan 2020 09:01:19 -0800


     [ 
https://issues.apache.org/jira/browse/BEAM-8042?focusedWorklogId=375737&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-375737
 ]


ASF GitHub Bot logged work on BEAM-8042:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jan/20 17:00
            Start Date: 22/Jan/20 17:00
    Worklog Time Spent: 10m 
      Work Description: kanterov commented on pull request #10649: [BEAM-8042] [ZetaSQL] Fix aggregate column reference
URL: https://github.com/apache/beam/pull/10649#discussion_r369684211
 
 

 ##########
 File path: sdks/java/extensions/sql/zetasql/src/main/java/org/apache/beam/sdk/extensions/sql/zetasql/translation/AggregateScanConverter.java
 ##########
 @@ -88,8 +88,13 @@ public RelNode convert(ResolvedAggregateScan zetaNode, List<RelNode> inputs) {
       // For aggregate calls, their input ref follow after GROUP BY input ref.
       int columnRefoff = groupFieldsListSize;
       for (ResolvedComputedColumn computedColumn : zetaNode.getAggregateList()) {
-        aggregateCalls.add(convertAggCall(computedColumn, columnRefoff));
-        columnRefoff++;
+        AggregateCall aggCall = convertAggCall(computedColumn, columnRefoff);
+        aggregateCalls.add(aggCall);
+        if (!aggCall.getArgList().isEmpty()) {
+          // Only increment column reference offset when aggregates use them (BEAM-8042).
+          // Ex: COUNT(*) does not have arguments, while COUNT(`field`) does.
+          columnRefoff++;
 
 Review comment:
   What happens if there is an aggregate function with more than a single argument, for instance, `COUNTIF`?
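
   To make the concern concrete, here is a minimal standalone sketch of the offset bookkeeping. It is not the code in this pull request and does not use the Calcite or ZetaSQL APIs. The class `ColumnRefOffsetSketch` and the helper `assignInputRefs` are hypothetical names, and the sketch assumes that every aggregate argument occupies its own input column after the GROUP BY columns. Under that assumption the offset would advance by the argument count of each call rather than by at most one.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnRefOffsetSketch {

  /**
   * Hypothetical helper: assigns input column indexes to aggregate calls.
   *
   * @param groupFieldCount number of GROUP BY columns, which come first in the input
   * @param argCounts argCounts[i] is the number of arguments of the i-th aggregate call
   * @return for each aggregate call, the input column indexes it would reference
   */
  static List<List<Integer>> assignInputRefs(int groupFieldCount, int[] argCounts) {
    List<List<Integer>> refs = new ArrayList<>();
    // Aggregate arguments are assumed to follow the GROUP BY columns.
    int columnRefOff = groupFieldCount;
    for (int argCount : argCounts) {
      List<Integer> callRefs = new ArrayList<>();
      for (int i = 0; i < argCount; i++) {
        callRefs.add(columnRefOff + i);
      }
      refs.add(callRefs);
      // Advance by the number of columns actually consumed:
      // 0 for COUNT(*), 1 for SUM(field), 2 for a two-argument aggregate.
      columnRefOff += argCount;
    }
    return refs;
  }

  public static void main(String[] args) {
    // One GROUP BY key, then calls with 0, 1, 2 and 1 arguments.
    System.out.println(assignInputRefs(1, new int[] {0, 1, 2, 1}));
    // prints [[], [1], [2, 3], [4]]
  }
}
```

   With the increment-by-one rule from the diff, the last call in this example would be assigned column 3, which the two-argument call already uses. Whether the converter ever sees more than one argument column per aggregate is an open question here, not something this sketch establishes.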
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 375737)
    Time Spent: 1h 50m  (was: 1h 40m)

> Parsing of aggregate query fails
> --------------------------------
>
>                 Key: BEAM-8042
>                 URL: https://issues.apache.org/jira/browse/BEAM-8042
>             Project: Beam
>          Issue Type: Sub-task
>          Components: dsl-sql-zetasql
>            Reporter: Rui Wang
>            Assignee: Kirill Kozlov
>            Priority: Critical
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {code}
>   @Rule
>   public TestPipeline pipeline = TestPipeline.fromOptions(createPipelineOptions());
>   private static PipelineOptions createPipelineOptions() {
>     BeamSqlPipelineOptions opts = PipelineOptionsFactory.create().as(BeamSqlPipelineOptions.class);
>     opts.setPlannerName(ZetaSQLQueryPlanner.class.getName());
>     return opts;
>   }
>   @Test
>   public void testAggregate() {
>     Schema inputSchema = Schema.builder()
>         .addByteArrayField("id")
>         .addInt64Field("has_f1")
>         .addInt64Field("has_f2")
>         .addInt64Field("has_f3")
>         .addInt64Field("has_f4")
>         .addInt64Field("has_f5")
>         .addInt64Field("has_f6")
>         .build();
>     String sql = "SELECT \n" +
>         "  id, \n" +
>         "  COUNT(*) as count, \n" +
>         "  SUM(has_f1) as f1_count, \n" +
>         "  SUM(has_f2) as f2_count, \n" +
>         "  SUM(has_f3) as f3_count, \n" +
>         "  SUM(has_f4) as f4_count, \n" +
>         "  SUM(has_f5) as f5_count, \n" +
>         "  SUM(has_f6) as f6_count  \n" +
>         "FROM PCOLLECTION \n" +
>         "GROUP BY id";
>     pipeline
>         .apply(Create.empty(inputSchema))
>         .apply(SqlTransform.query(sql));
>     pipeline.run();
>   }
> {code}
> {code}
> Caused by: java.lang.RuntimeException: Error while applying rule 
> AggregateProjectMergeRule, args 
> [rel#553:LogicalAggregate.NONE(input=RelSubset#552,group={0},f1=COUNT(),f2=SUM($2),f3=SUM($3),f4=SUM($4),f5=SUM($5),f6=SUM($6),f7=SUM($7)),
>  
> rel#551:LogicalProject.NONE(input=RelSubset#550,key=$0,f1=$1,f2=$2,f3=$3,f4=$4,f5=$5,f6=$6)]
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:232)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:637)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:340)
>       at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.transform(ZetaSQLPlannerImpl.java:168)
>       at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.parseQuery(ZetaSQLQueryPlanner.java:99)
>       at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.parseQuery(ZetaSQLQueryPlanner.java:87)
>       at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel(ZetaSQLQueryPlanner.java:66)
>       at 
> org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery(BeamSqlEnv.java:104)
>       at 
>       ... 39 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
>       at 
> org.apache.beam.repackaged.sql.com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:58)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.rel.rules.AggregateProjectMergeRule.apply(AggregateProjectMergeRule.java:96)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.rel.rules.AggregateProjectMergeRule.onMatch(AggregateProjectMergeRule.java:73)
>       at 
> org.apache.beam.repackaged.sql.org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:205)
>       ... 48 more
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
