Github user jinfengni commented on a diff in the pull request:
https://github.com/apache/drill/pull/906#discussion_r135351183
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java
---
@@ -152,97 +157,75 @@ public void kill(boolean sendUpstream) {
}
}
- private void releaseAssets() {
- container.zeroVectors();
- }
-
- private void clearFieldVectorMap() {
- for (final ValueVector v : mutator.fieldVectorMap().values()) {
- v.clear();
- }
- }
-
@Override
public IterOutcome next() {
if (done) {
return IterOutcome.NONE;
}
oContext.getStats().startProcessing();
try {
- try {
- injector.injectChecked(context.getExecutionControls(),
"next-allocate", OutOfMemoryException.class);
-
- currentReader.allocate(mutator.fieldVectorMap());
- } catch (OutOfMemoryException e) {
- clearFieldVectorMap();
- throw UserException.memoryError(e).build(logger);
- }
- while ((recordCount = currentReader.next()) == 0) {
+ while (true) {
--- End diff --
It's not entirely true that the original logic allocates a batch once per
call to next(). There are two places where allocate were called. The first one
happens outside of the while loop [1], while the second happens inside the loop
for the case where we want to skip certain record readers [2].
The new logic essentially combine the two into one, by putting the first
one inside the loop as well. In that sense, it's not doing a completely job
than the previous logic.
Regarding where the buffers are allocated if there are two passes, each
value vector releases its own buffer by calling clear(), before it allocate new
buffer. That's why the previous/new logic did not run into memory leak
problem.
1.
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java#L175
2.
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java#L214
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---