[
https://issues.apache.org/jira/browse/PHOENIX-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14969516#comment-14969516
]
Maryann Xue commented on PHOENIX-2344:
--------------------------------------
[~jamestaylor], [~julianhyde], opened this one per our discussion today.
I think we'll do the runtime in the Phoenix master branch and the compilation
in the calcite branch (if it hasn't been merged by then).
> Implement partial stream aggregate
> ----------------------------------
>
> Key: PHOENIX-2344
> URL: https://issues.apache.org/jira/browse/PHOENIX-2344
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Maryann Xue
> Assignee: Maryann Xue
>
> We now have ordered group-by (stream aggregate) and unordered group-by (hash
> aggregate) in Phoenix. Stream aggregate is usually much better than hash
> aggregate in terms of memory usage and pipelining, but it requires that the
> aggregate's input be ordered on the group-by expressions, i.e. the group-by
> expressions must form the leading part of the input's collation (ordering).
> However, we could have something in between: a stream/hash hybrid aggregate
> for the case where the group-by expressions and the input collation share a
> common leading part. For example, if we group table T1 by columns A and B,
> and T1 is sorted on columns A and C, the ordered part is A and the hash part
> is B. Thus, within a run of rows that share the same A, a hash table collects
> all the distinct Bs; at each point where A changes, we can purge the
> intermediate hash table and feed the results for the previous A to the next
> operator.
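
The hybrid described above can be illustrated with a minimal Python sketch. This is not Phoenix code: the function name `partial_stream_aggregate`, the `(a, b, c)` row layout, and the choice of COUNT(*) as the aggregate are all assumptions made for the example. It only shows the idea of streaming over runs of equal A while hashing on B within each run.

```python
from itertools import groupby
from operator import itemgetter


def partial_stream_aggregate(rows):
    """Yield ((a, b), count) pairs for rows that are already sorted on A.

    `rows` is an iterable of (a, b, c) tuples ordered by a; this layout and
    the COUNT(*) aggregate are hypothetical, chosen only for illustration.
    """
    # Stream part: groupby() walks consecutive runs of equal A lazily, which
    # is only correct because the input is sorted on A.
    for a, run in groupby(rows, key=itemgetter(0)):
        # Hash part: B is unordered within a run, so collect it in a dict.
        counts = {}
        for _, b, _ in run:
            counts[b] = counts.get(b, 0) + 1
        # A is about to change: emit this run's groups, then let the
        # intermediate hash table go out of scope.
        for b, n in counts.items():
            yield (a, b), n


# T1 sorted on (A, C); B appears in arbitrary order within each A.
t1 = [(1, "x", 10), (1, "y", 20), (1, "x", 30), (2, "y", 5), (2, "y", 7)]
for group, n in partial_stream_aggregate(t1):
    print(group, n)
```

In this sketch the hash table never holds more than the distinct Bs of a single A, and the results for one A are available to the consumer before the rows of the next A are read.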
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)