[ 
https://issues.apache.org/jira/browse/DRILL-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514321#comment-16514321
 ] 

ASF GitHub Bot commented on DRILL-6310:
---------------------------------------

ppadma opened a new pull request #1324: DRILL-6310: limit batch size for hash 
aggregate
URL: https://github.com/apache/drill/pull/1324
 
 
   Batch sizing for hash aggregate is done by changing the sizes of the batches 
we hold in the hash aggregate partitions for aggregate values and in the hash 
table for keys. Earlier, the batch size was always 64K rows. Now, these batches 
are sized for 16MB based on actual data. 
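
   As a rough illustration of the arithmetic only (not the code in this PR), 
the row count for a batch could be derived from an estimated row width as 
below. The class name, method name, the 64K-row cap and the floor of one row 
are assumptions made for the sketch.

```java
// Hypothetical sketch: BatchSizer and rowsPerBatch are illustrative names,
// not identifiers from the Drill code base.
final class BatchSizer {
  // Previous fixed batch size; kept here as an assumed upper bound.
  static final int MAX_ROWS = 64 * 1024;
  // Memory target per batch mentioned in the description.
  static final long TARGET_BYTES = 16L * 1024 * 1024;

  // Number of rows that fit in the target, given an estimated row width.
  static int rowsPerBatch(long estimatedRowWidthBytes) {
    if (estimatedRowWidthBytes <= 0) {
      throw new IllegalArgumentException("row width must be positive");
    }
    long rows = TARGET_BYTES / estimatedRowWidthBytes;
    // Clamp to [1, MAX_ROWS] so very wide rows still get a one-row batch.
    return (int) Math.max(1, Math.min(MAX_ROWS, rows));
  }
}
```

   With these assumptions, a 1024-byte row gives 16384 rows per batch, while 
rows of 256 bytes or less stay at the 64K-row limit.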
   This means that the way we index the batches and the rows within a batch 
has to change. This is done by saving the size of the batch in batchHolder. For 
given keys, based on the hash value, the sizes of the batches saved in the batch 
holders are used to figure out the batch number and the row within the batch.  
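
   One way to picture the lookup, once batches no longer share a fixed row 
count, is the sketch below. It is a simplified illustration and not the 
batchHolder code from this PR: the BatchLocator class, its methods and the 
linear scan are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: BatchLocator is an illustrative name, not a Drill class.
final class BatchLocator {
  // Row capacity saved for each batch, in creation order.
  private final List<Integer> batchSizes = new ArrayList<>();

  // Record the row count chosen for a newly allocated batch.
  void addBatch(int rowCount) {
    if (rowCount <= 0) {
      throw new IllegalArgumentException("rowCount must be positive");
    }
    batchSizes.add(rowCount);
  }

  // Map a global row index to {batchNumber, rowWithinBatch}.
  int[] locate(int globalIndex) {
    if (globalIndex < 0) {
      throw new IndexOutOfBoundsException("negative index: " + globalIndex);
    }
    int remaining = globalIndex;
    for (int batch = 0; batch < batchSizes.size(); batch++) {
      int size = batchSizes.get(batch);
      if (remaining < size) {
        return new int[] {batch, remaining};
      }
      // Skip past this batch and keep looking in the next one.
      remaining -= size;
    }
    throw new IndexOutOfBoundsException("index beyond last batch: " + globalIndex);
  }
}
```

   With batches of 4, 2 and 3 rows, global index 5 resolves to batch 1, row 1. 
A real implementation would likely avoid the linear scan, for example by 
keeping cumulative sizes.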
   
   Also, to figure out sizing information for the outgoing value columns, a new 
map that maintains the mapping between input and output columns is added. 
   
   This PR also includes a fix for 
   DRILL-6499: No need to calculate stdRowWidth for every batch by default.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> limit batch size for hash aggregate
> -----------------------------------
>
>                 Key: DRILL-6310
>                 URL: https://issues.apache.org/jira/browse/DRILL-6310
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.13.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>            Priority: Major
>             Fix For: 1.14.0
>
>
> limit batch size for hash aggregate based on memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
