[GitHub] drill pull request #1101: DRILL-6032: Made the batch sizing for HashAgg more...

ppadma Mon, 05 Feb 2018 15:15:06 -0800

Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1101#discussion_r166141274
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java
 ---
    @@ -733,28 +780,32 @@ private void restoreReservedMemory() {
        * @param records
        */
       private void allocateOutgoing(int records) {
    -    // Skip the keys and only allocate for outputting the workspace values
    -    // (keys will be output through splitAndTransfer)
    -    Iterator<VectorWrapper<?>> outgoingIter = outContainer.iterator();
    -    for (int i = 0; i < numGroupByOutFields; i++) {
    -      outgoingIter.next();
    -    }
    -
         // try to preempt an OOM by using the reserved memory
         useReservedOutgoingMemory();
         long allocatedBefore = allocator.getAllocatedMemory();
     
    -    while (outgoingIter.hasNext()) {
    +    for (int columnIndex = numGroupByOutFields; columnIndex < 
outContainer.getNumberOfColumns(); columnIndex++) {
    +      final VectorWrapper wrapper = 
outContainer.getValueVector(columnIndex);
           @SuppressWarnings("resource")
    -      ValueVector vv = outgoingIter.next().getValueVector();
    +      final ValueVector vv = wrapper.getValueVector();
     
    -      AllocationHelper.allocatePrecomputedChildCount(vv, records, 
maxColumnWidth, 0);
    +      final RecordBatchSizer.ColumnSize columnSizer = new 
RecordBatchSizer.ColumnSize(wrapper.getValueVector());
    +      int columnSize;
    +
    +      if (columnSizer.hasKnownSize()) {
    +        // For fixed width vectors we know the size of each record
    +        columnSize = columnSizer.getKnownSize();
    +      } else {
    +        // For var chars we need to use the input estimate
    +        columnSize = varcharValueSizes.get(columnIndex);
    +      }
    +
    +      AllocationHelper.allocatePrecomputedChildCount(vv, records, 
columnSize, 0);
    --- End diff --
    
    I think we should also get elementCount from sizer and use that instead of 
passing 0.

---

[GitHub] drill pull request #1101: DRILL-6032: Made the batch sizing for HashAgg more...

Reply via email to