Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/906#discussion_r135351183 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java --- @@ -152,97 +157,75 @@ public void kill(boolean sendUpstream) { } } - private void releaseAssets() { - container.zeroVectors(); - } - - private void clearFieldVectorMap() { - for (final ValueVector v : mutator.fieldVectorMap().values()) { - v.clear(); - } - } - @Override public IterOutcome next() { if (done) { return IterOutcome.NONE; } oContext.getStats().startProcessing(); try { - try { - injector.injectChecked(context.getExecutionControls(), "next-allocate", OutOfMemoryException.class); - - currentReader.allocate(mutator.fieldVectorMap()); - } catch (OutOfMemoryException e) { - clearFieldVectorMap(); - throw UserException.memoryError(e).build(logger); - } - while ((recordCount = currentReader.next()) == 0) { + while (true) { --- End diff -- It's not entirely true that the original logic allocates a batch once per call to next(). There are two places where allocate were called. The first one happens outside of the while loop [1], while the second happens inside the loop for the case where we want to skip certain record readers [2]. The new logic essentially combine the two into one, by putting the first one inside the loop as well. In that sense, it's not doing a completely job than the previous logic. Regarding where the buffers are allocated if there are two passes, each value vector releases its own buffer by calling clear(), before it allocate new buffer. That's why the previous/new logic did not run into memory leak problem. 1. https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java#L175 2. https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java#L214
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---