liyang created CALCITE-853:
------------------------------

             Summary: EnumerableAggregate should take advantage of input collation
                 Key: CALCITE-853
                 URL: https://issues.apache.org/jira/browse/CALCITE-853
             Project: Calcite
          Issue Type: Improvement
            Reporter: liyang
            Assignee: Julian Hyde


Li Yang <[email protected]>
Aug 20 (2 days ago)
                
I encountered an out-of-memory exception when a huge result set was passed into 
EnumerableAggregate and aggregated in memory. I think that if the input is 
sorted by the group-by key, then groupBy() no longer has to hold all the data 
in memory.
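
To illustrate the idea, here is a minimal sketch of aggregating input that is 
already sorted by the group-by key. It is not Calcite code: the class 
SortedGroupCount and the method countSorted are hypothetical names, and the 
aggregate is a plain count. Only the current key and its running count are 
held in memory; each group is emitted as soon as the key changes.

```java
import java.util.Iterator;
import java.util.Objects;
import java.util.function.BiConsumer;

/** Hypothetical helper for illustration; not part of Calcite. */
final class SortedGroupCount {
  /**
   * Counts rows per key, assuming the input is already sorted by key.
   * Emits each (key, count) pair as soon as the key changes, so only
   * one group is held in memory at a time.
   */
  static <K> void countSorted(Iterator<K> sortedKeys, BiConsumer<K, Long> emit) {
    if (!sortedKeys.hasNext()) {
      return; // empty input, nothing to emit
    }
    K current = sortedKeys.next();
    long count = 1;
    while (sortedKeys.hasNext()) {
      K next = sortedKeys.next();
      if (Objects.equals(next, current)) {
        count++;
      } else {
        emit.accept(current, count); // key changed: the group is complete
        current = next;
        count = 1;
      }
    }
    emit.accept(current, count); // emit the last group
  }
}
```

If the input is not actually sorted, this sketch would emit the same key more 
than once, so the sortedness assumption has to be guaranteed by the caller.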

Julian Hyde <[email protected]>
2:20 PM (16 hours ago)
                
Yes, that would be useful. Please log a jira.

Enumerable.groupBy doesn't know its input's collation, so it can't make that 
decision, but EnumerableAggregate does. I think that EnumerableAggregate should 
have a "trigger key", a subset of its group key, and when the trigger key 
changes it should emit its current groups and flush its hash table.
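
A rough sketch of how a trigger key might behave, under stated assumptions: 
the class TriggerKeyAggregate and its sum method are hypothetical and not part 
of Calcite, the aggregate is a simple sum, and the input is assumed to be 
sorted on the trigger key only. The hash table holds groups for one trigger 
key value at a time and is emitted and cleared whenever that value changes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/** Hypothetical illustration of a "trigger key"; not Calcite's implementation. */
final class TriggerKeyAggregate {
  /**
   * Sums a value per group key. The input is assumed to be sorted on the
   * trigger key, which is derived from a subset of the group key columns.
   */
  static <R, G, T> void sum(
      Iterable<R> rows,
      Function<R, G> groupKey,
      Function<R, T> triggerKey,
      ToLongFunction<R> value,
      BiConsumer<G, Long> emit) {
    Map<G, Long> table = new LinkedHashMap<>();
    T currentTrigger = null;
    boolean first = true;
    for (R row : rows) {
      T trigger = triggerKey.apply(row);
      if (!first && !Objects.equals(trigger, currentTrigger)) {
        table.forEach(emit); // trigger key changed: emit finished groups
        table.clear();       // and flush the hash table
      }
      currentTrigger = trigger;
      first = false;
      table.merge(groupKey.apply(row), value.applyAsLong(row), Long::sum);
    }
    table.forEach(emit); // emit the groups of the final trigger key value
  }
}
```

With this shape, memory use is bounded by the number of distinct groups that 
share one trigger key value rather than by the whole input.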

As well as for your use case, it will be useful for streaming queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
