[ https://issues.apache.org/jira/browse/DRILL-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354230#comment-16354230 ]
ASF GitHub Bot commented on DRILL-6123:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1107#discussion_r166384067

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
    @@ -77,7 +77,7 @@ private ExecConstants() {
       public static final String SPILL_DIRS = "drill.exec.spill.directories";

       public static final String OUTPUT_BATCH_SIZE = "drill.exec.memory.operator.output_batch_size";
    -  public static final LongValidator OUTPUT_BATCH_SIZE_VALIDATOR = new RangeLongValidator(OUTPUT_BATCH_SIZE, 1024, 512 * 1024 * 1024);
    +  public static final LongValidator OUTPUT_BATCH_SIZE_VALIDATOR = new RangeLongValidator(OUTPUT_BATCH_SIZE, 1, 512 * 1024 * 1024);
    --- End diff --

    Maybe add a comment to explain the units here. Bytes? MB? A minimum batch size of 1 byte seems small, but a max size of 512 GB seems large, so not sure of the limits...


> Limit batch size for Merge Join based on memory
> -----------------------------------------------
>
>                 Key: DRILL-6123
>                 URL: https://issues.apache.org/jira/browse/DRILL-6123
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Flow
>    Affects Versions: 1.12.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>            Priority: Major
>             Fix For: 1.13.0
>
>
> Merge join limits output batch size to 32K rows irrespective of row size.
> This can create very large or very small batches (in terms of memory),
> depending upon average row width. Change this to figure out output row count
> based on memory specified with the new outputBatchSize option and average row
> width of incoming left and right batches. Output row count will be minimum of
> 1 and max of 64k.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
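
For reference, the row-count calculation described in the issue summary could be sketched roughly as follows. This is not the code from the pull request: the class and method names are hypothetical, the option value is assumed to be in bytes (the unit the review comment asks about), the output row width is assumed to be the sum of the average left and right row widths, and "64k" is assumed to mean 65536.

```java
// Hypothetical sketch; not taken from the Drill source tree.
final class MergeJoinBatchSizing {
  // Bounds stated in the issue description: minimum of 1 row, maximum of 64k rows.
  static final int MIN_ROWS = 1;
  static final int MAX_ROWS = 64 * 1024; // assumption: "64k" means 65536

  /**
   * Derives an output row count from a memory budget and average row widths.
   *
   * @param outputBatchSizeBytes memory budget for one output batch (assumed bytes)
   * @param avgLeftRowWidth      average row width of the incoming left batch, in bytes
   * @param avgRightRowWidth     average row width of the incoming right batch, in bytes
   */
  static int outputRowCount(long outputBatchSizeBytes, int avgLeftRowWidth, int avgRightRowWidth) {
    // Assumption: a joined row is about as wide as the left and right rows combined.
    long avgOutputRowWidth = (long) avgLeftRowWidth + avgRightRowWidth;
    if (avgOutputRowWidth <= 0) {
      // No width information available; fall back to the row-count cap.
      return MAX_ROWS;
    }
    long rows = outputBatchSizeBytes / avgOutputRowWidth;
    // Clamp to the [1, 64k] range from the issue description.
    return (int) Math.max(MIN_ROWS, Math.min(MAX_ROWS, rows));
  }
}
```

Under these assumptions, a 1 MiB budget with 100-byte left rows and 156-byte right rows yields 4096 rows per batch, while a 1-byte budget (the new lower bound accepted by the validator in the diff) still yields the minimum of 1 row.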