Github user ppadma commented on a diff in the pull request:
https://github.com/apache/drill/pull/1228#discussion_r184202508
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/record/RecordBatchSizer.java
---
@@ -536,6 +556,11 @@ public ColumnSize getColumn(String name) {
*/
private int netRowWidth;
private int netRowWidthCap50;
+
+ /**
+ * actual row size if input is not empty. Otherwise, standard size.
+ */
+ private int rowAllocSize;
--- End diff --
This is not just a problem with size estimation for vector memory
allocation. Suppose one side of the join receives an empty batch as its
first batch. If we use a row width of 0 for that side in the outgoing row
width calculation, the number of rows we compute for the outgoing batch
will be too high, and later, when we get a non-empty batch, we might
exceed the memory limits.
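
To make the arithmetic concrete, here is a minimal sketch of the scenario. It is not the actual RecordBatchSizer or join code: the class name, both methods, the 16 MiB limit, the row widths, and the "standard" width of 50 bytes are all hypothetical values chosen for illustration.

```java
// Illustrative sketch only; names and numbers are assumptions, not Drill APIs.
final class OutgoingRowCountSketch {

    // Hypothetical "standard" width to fall back on when a side has no rows yet.
    static final int STD_ROW_WIDTH = 50;

    // Rows that fit in the outgoing batch, given the widths of both join sides.
    public static int rowsThatFit(long batchMemoryLimit, int leftRowWidth, int rightRowWidth) {
        int outgoingRowWidth = leftRowWidth + rightRowWidth;
        if (outgoingRowWidth <= 0) {
            return Integer.MAX_VALUE; // no width information at all
        }
        return (int) Math.min(Integer.MAX_VALUE, batchMemoryLimit / outgoingRowWidth);
    }

    // Actual width if the input has rows; otherwise the standard width.
    public static int allocWidth(int actualRowWidth, int rowCount) {
        return rowCount > 0 ? actualRowWidth : STD_ROW_WIDTH;
    }

    public static void main(String[] args) {
        long limit = 16L * 1024 * 1024; // assumed outgoing batch memory limit
        int leftWidth = 100;            // assumed width of the non-empty side
        int rightWidthLater = 40;       // assumed width once the right side has data

        // The right side's first batch is empty, so its measured width is 0.
        int optimistic = rowsThatFit(limit, leftWidth, 0);
        int guarded = rowsThatFit(limit, leftWidth, allocWidth(0, 0));

        // Memory needed once the right side starts delivering real rows.
        long optimisticBytes = (long) optimistic * (leftWidth + rightWidthLater);
        long guardedBytes = (long) guarded * (leftWidth + rightWidthLater);

        System.out.println("width 0:        rows=" + optimistic
            + " bytes=" + optimisticBytes + " overLimit=" + (optimisticBytes > limit));
        System.out.println("standard width: rows=" + guarded
            + " bytes=" + guardedBytes + " overLimit=" + (guardedBytes > limit));
    }
}
```

With these assumed numbers, the row count computed from a width of 0 overshoots the limit once real rows arrive, while the count computed from the fallback width stays under it. The fallback is still only an estimate: if the real rows turn out wider than the standard width, the limit can still be exceeded, just by less.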
---