Github user StevenMPhillips commented on a diff in the pull request:
https://github.com/apache/drill/pull/317#discussion_r48902637
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
---
@@ -588,36 +571,29 @@ public BatchGroup
mergeAndSpill(LinkedList<BatchGroup> batchGroups) throws Schem
} finally {
hyperBatch.clear();
}
- long bufSize = getBufferSize(c1);
- totalSizeInMemory += bufSize;
- logger.debug("mergeAndSpill: final total size in memory = {}",
totalSizeInMemory);
+ logger.debug("mergeAndSpill: final total size in memory = {}",
oAllocator.getAllocatedMemory());
logger.info("Completed spilling to {}", outputFile);
return newGroup;
}
- private long getBufferSize(VectorAccessible batch) {
- long size = 0;
- for (VectorWrapper<?> w : batch) {
- DrillBuf[] bufs = w.getValueVector().getBuffers(false);
- for (DrillBuf buf : bufs) {
- size += buf.getPossibleMemoryConsumed();
- }
- }
- return size;
- }
-
private SelectionVector2 newSV2() throws OutOfMemoryException,
InterruptedException {
- SelectionVector2 sv2 = new SelectionVector2(oContext.getAllocator());
+ SelectionVector2 sv2 = new SelectionVector2(oAllocator);
if (!sv2.allocateNewSafe(incoming.getRecordCount())) {
try {
- // Not being able to allocate sv2 means this operator's allocator
reached it's maximum capacity.
- // Spilling this.batchGroups won't help here as they are owned by
upstream operator,
- // but spilling spilledBatchGroups may free enough memory
- final BatchGroup merged = mergeAndSpill(spilledBatchGroups);
+ final BatchGroup merged = mergeAndSpill(batchGroups);
if (merged != null) {
- spilledBatchGroups.addFirst(merged);
+ spilledBatchGroups.add(merged);
--- End diff --
It depends on which queue we are merging from. When batches first come in
to the ExternalSort, they are added to the batchGroup queue. Usually when
spilling, we spill some number of batches from the batchGroup queue, and then
put the result at the end of the spilledBatchGroup queue. In the cases where we
need to re-merge the batchGroups, we take from the end of the spilledBatchGroup
queue, and add them to the front of the queue.
If you are seeing a place where we mergeAndSpill the spilledBatchGroup
queue and add to the end of the queue, that is a mistake and should be fixed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---