[ https://issues.apache.org/jira/browse/DRILL-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379721#comment-16379721 ]
ASF GitHub Bot commented on DRILL-6032:
---------------------------------------

Github user ilooner commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1101#discussion_r171133616

    --- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
    @@ -427,8 +427,8 @@ drill.exec.options: {
         exec.enable_union_type: false,
         exec.errors.verbose: false,
         exec.hashagg.mem_limit: 0,
    -    exec.hashagg.min_batches_per_partition: 2,
    -    exec.hashagg.num_partitions: 32,
    +    exec.hashagg.min_batches_per_partition: 1,
    --- End diff --

    @Ben-Zvi This setting controls the minimum number of batches kept in memory per partition. Making it larger causes us to consume more memory; making it smaller causes us to consume less. Also, in general, the purpose of this PR was to make the memory calculations more precise and deterministic, and it passes all regression tests.


> Use RecordBatchSizer to estimate size of columns in HashAgg
> -----------------------------------------------------------
>
>                 Key: DRILL-6032
>                 URL: https://issues.apache.org/jira/browse/DRILL-6032
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>            Priority: Major
>             Fix For: 1.13.0
>
>
> We need to use the RecordBatchSizer to estimate the size of columns in the
> Partition batches created by HashAgg.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
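
As a rough illustration of the review comment on `exec.hashagg.min_batches_per_partition`, the sketch below models the in-memory floor as partitions times minimum batches times an estimated batch size. This is a simplified, hypothetical model and not Drill's actual calculation: the function name `min_reserved_bytes`, the 8 MiB batch estimate, and the use of 32 partitions for both cases are assumptions made only for the arithmetic.

```python
def min_reserved_bytes(num_partitions: int,
                       min_batches_per_partition: int,
                       est_batch_bytes: int) -> int:
    """Hypothetical lower bound on memory held by in-memory partition batches.

    Assumes every partition keeps at least `min_batches_per_partition`
    batches of roughly `est_batch_bytes` each. Not Drill's real formula.
    """
    if min(num_partitions, min_batches_per_partition, est_batch_bytes) < 0:
        raise ValueError("all arguments must be non-negative")
    return num_partitions * min_batches_per_partition * est_batch_bytes


if __name__ == "__main__":
    batch = 8 * 1024 * 1024  # assumed batch size estimate: 8 MiB
    old = min_reserved_bytes(32, 2, batch)  # old default: 2 batches per partition
    new = min_reserved_bytes(32, 1, batch)  # new default: 1 batch per partition
    print(old, new)  # 536870912 268435456
```

Under these assumptions, lowering the setting from 2 to 1 halves the floor, which matches the direction described in the comment: a larger value consumes more memory, a smaller one less.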