[ 
https://issues.apache.org/jira/browse/CALCITE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033455#comment-16033455
 ] 

Julian Hyde commented on CALCITE-1803:
--------------------------------------

Since Calcite rewrites {{AVG(x)}} to {{SUM(x) / COUNT(x)}}, this change should 
enable

{code}SELECT "store_state", AVG("unit_sales") FROM "foodmart" GROUP BY 
"store_state"{code}

to be pushed down to Druid in its entirety.
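
As a rough illustration only (not the query Calcite actually generates), a fully pushed-down version of that statement might resemble the Druid groupBy query below. The output names {{sum_unit_sales}}, {{row_count}} and {{avg_unit_sales}}, the {{longSum}} aggregator type, and the interval are assumptions made for this sketch. Note also that Druid's {{count}} aggregator counts rows rather than non-null values of a column, and that the arithmetic {{/}} post-aggregator returns 0 on division by zero, so the result matches {{SUM(x) / COUNT(x)}} only if those differences do not matter for the data.

{code:json}
{
  "queryType": "groupBy",
  "dataSource": "foodmart",
  "granularity": "all",
  "dimensions": ["store_state"],
  "aggregations": [
    {"type": "longSum", "name": "sum_unit_sales", "fieldName": "unit_sales"},
    {"type": "count", "name": "row_count"}
  ],
  "postAggregations": [
    {
      "type": "arithmetic",
      "name": "avg_unit_sales",
      "fn": "/",
      "fields": [
        {"type": "fieldAccess", "fieldName": "sum_unit_sales"},
        {"type": "fieldAccess", "fieldName": "row_count"}
      ]
    }
  ],
  "intervals": ["1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"]
}
{code}

Here the Project that follows the Aggregate (the division) becomes the single entry in {{postAggregations}}, so no arithmetic is left for Calcite to do on the client side.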

> Push Project that follows Aggregate down to Druid
> -------------------------------------------------
>
>                 Key: CALCITE-1803
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1803
>             Project: Calcite
>          Issue Type: New Feature
>          Components: druid
>    Affects Versions: 1.11.0
>            Reporter: Junxian Wu
>            Assignee: Julian Hyde
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Druid post-aggregations are not supported when SQL queries are translated 
> to Druid queries. By implementing post-aggregations, we can offload some 
> computation to the Druid cluster rather than performing it on the client side.
> Example usage:
> {{SELECT SUM("column1") - SUM("column2") FROM "table";}}
> This query is translated into two separate Druid aggregations under the 
> current rules, and the results are then subtracted in Calcite. By using the 
> {{postAggregations}} field in the Druid query, the subtraction could be done 
> in the Druid cluster. Although this example is simple, the difference 
> becomes significant when the number of result rows is large. (Multiple 
> result rows occur when GROUP BY is used.)
> Questions:
> After I push the post-aggregation into the Druid query, what should I change 
> in the Project relational expression? In the example above, the 
> {{BindableProject}} holds the expression that represents the 
> subtraction. If I push the post-aggregation into the Druid query, the 
> subtraction expression should be replaced by a reference to the 
> post-aggregation result. For now, it seems that a project expression can only 
> point to aggregation results. Since post-aggregations also have to point to 
> aggregation results, they cannot be placed at the same level as the 
> aggregations. Where should I put post-aggregations?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
