[ 
https://issues.apache.org/jira/browse/PHOENIX-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969516#comment-14969516
 ] 

Maryann Xue commented on PHOENIX-2344:
--------------------------------------

[~jamestaylor], [~julianhyde], I opened this one per our discussion today.
I think we'll do the runtime in the Phoenix master branch and the compilation
in the calcite branch (if it hasn't been merged by then).

> Implement partial stream aggregate
> ----------------------------------
>
>                 Key: PHOENIX-2344
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2344
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Maryann Xue
>            Assignee: Maryann Xue
>
> We now have ordered group-by (stream aggregate) and unordered group-by (hash
> aggregate) in Phoenix. Stream aggregate is usually much better than hash
> aggregate in terms of memory usage and pipelining, but it requires that the
> aggregate's input be ordered on the group-by expressions, i.e. the group-by
> expressions form the leading part of the input's collation (ordering).
> However, we could have something in between: a stream/hash hybrid
> aggregate for when the group-by expressions and the leading part of the
> input collation overlap. For example, if we group table T1 by columns A, B
> and T1 is sorted on columns A, C, the ordered part is A and the hash part
> is B. Within a run of rows with the same A, a hash table collects all the
> distinct Bs; at the point where A changes, we can purge the intermediate
> hash table and feed the results for the previous A to the next operator.
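
The hybrid described above could be sketched roughly as follows. This is
only an illustration in Python, not Phoenix code: the function name, the
dict-based row layout, the column V and the choice of SUM as the aggregate
are all assumptions made for the example.

```python
def partial_stream_aggregate(rows, ordered_key, hashed_key, value_key):
    """Yield (ordered value, hashed value, sum) for rows sorted on ordered_key.

    Hypothetical helper: only the rows for one value of ordered_key are
    held in the hash table at any time.
    """
    table = {}        # hash table for the current run of ordered_key
    current = None    # ordered-key value of the current run
    started = False   # distinguishes "no rows yet" from a None key value

    for row in rows:
        a = row[ordered_key]
        if started and a != current:
            # The ordered key changed: emit the finished groups, then purge.
            for b, total in table.items():
                yield (current, b, total)
            table = {}
        current, started = a, True
        b = row[hashed_key]
        table[b] = table.get(b, 0) + row[value_key]

    # Flush the groups of the last run.
    if started:
        for b, total in table.items():
            yield (current, b, total)


# T1 sorted on (A, C), grouped by (A, B); V is the column being summed.
rows = [
    {"A": 1, "B": "x", "C": 1, "V": 10},
    {"A": 1, "B": "y", "C": 2, "V": 5},
    {"A": 1, "B": "x", "C": 3, "V": 1},
    {"A": 2, "B": "y", "C": 1, "V": 7},
    {"A": 2, "B": "y", "C": 4, "V": 3},
]
result = list(partial_stream_aggregate(rows, "A", "B", "V"))
print(result)
```

In this sketch the memory held at any moment is bounded by the number of
distinct B values within a single A, rather than by the number of distinct
(A, B) pairs in the whole input, and results for one A can be passed
downstream before the next A is read.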



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
