[ https://issues.apache.org/jira/browse/DRILL-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453013#comment-16453013 ]
ASF GitHub Bot commented on DRILL-6307:
---------------------------------------

Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1228#discussion_r184192443

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java ---
    @@ -50,7 +50,7 @@ public class RecordBatchSizer {
       private static final int OFFSET_VECTOR_WIDTH = UInt4Vector.VALUE_WIDTH;
       private static final int BIT_VECTOR_WIDTH = UInt1Vector.VALUE_WIDTH;
    -  private static final int STD_REPETITION_FACTOR = 10;
    +  public static final int STD_REPETITION_FACTOR = 10;
    --- End diff --

    Done. Using 5 in both places now.

> Handle empty batches in record batch sizer correctly
> ----------------------------------------------------
>
>                 Key: DRILL-6307
>                 URL: https://issues.apache.org/jira/browse/DRILL-6307
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.13.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>            Priority: Major
>             Fix For: 1.14.0
>
>
> When we get an empty batch, the record batch sizer calculates the row width
> as zero. In that case, we do not do accounting and memory allocation
> correctly for outgoing batches.
> For example, in merge join, for a left outer join, if the right side batch
> is empty, we still have to include the right side columns as null in the
> outgoing batch.
> Say the first batch is empty. Then, for outgoing, we allocate empty vectors
> with zero capacity. When we read the next batch with data, we will end up
> going through the realloc loop. If we use a right side row width of 0 in the
> outgoing row width calculation, the number of rows we calculate will be
> higher, and later, when we get a non-empty batch, we might exceed the memory
> limits.
> One possible workaround/solution: allocate memory based on the std size for
> an empty input batch, and use the allocation width as the width of the batch
> in the number-of-rows calculation.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
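
For readers following the issue description, the proposed workaround can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the pull request: the class name, both method names, and the numbers in the usage note (a 16 MiB budget, 64-byte rows) are hypothetical, and Drill's actual RecordBatchSizer may compute the standard width differently.

```java
/**
 * Illustrative sketch only. The names and the arithmetic are assumptions
 * made for this example; they are not Drill's RecordBatchSizer API.
 */
final class EmptyBatchSizingSketch {

    private EmptyBatchSizingSketch() {
    }

    /**
     * Width to plan with for one input side: the observed row width when the
     * batch had rows, otherwise a standard (estimated) width, so that an
     * empty batch does not contribute zero bytes per row.
     */
    static int allocationWidth(int observedRowWidth, int stdRowWidth) {
        return observedRowWidth > 0 ? observedRowWidth : stdRowWidth;
    }

    /**
     * Number of outgoing rows that fit in the memory budget when each
     * outgoing row carries the columns of both sides. Returns at least 1.
     */
    static int outputRowCount(long memoryLimitBytes, int leftRowWidth, int rightRowWidth) {
        int outgoingRowWidth = leftRowWidth + rightRowWidth;
        if (outgoingRowWidth <= 0) {
            throw new IllegalArgumentException("outgoing row width must be positive");
        }
        long rows = memoryLimitBytes / outgoingRowWidth;
        return (int) Math.max(1L, Math.min(rows, Integer.MAX_VALUE));
    }
}
```

With a 16 MiB budget and a 64-byte left side, planning with a right side width of 0 gives 262144 rows, while planning with an assumed 64-byte standard width for the empty right side gives 131072 rows. The second figure leaves room for the right side columns once a non-empty batch arrives.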