Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1058#discussion_r154477573
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/SpillSet.java
 ---
    @@ -331,6 +337,43 @@ public long getReadBytes(InputStream inputStream) {
         }
       }
     
    +  private static class WritableByteChannelImpl implements 
WritableByteChannel
    +  {
    +    private static final int TRANSFER_SIZE = 32 * 1024;
    +
    +    private OutputStream out;
    +    private final byte buffer[] = new byte[TRANSFER_SIZE];
    --- End diff --
    
    Unfortunately, this is one of the problems that the current design was 
intended to solve. As we can see in the hash agg code; multiple of these 
objects are active in memory at any one time. But, no more than one is ever 
writing. By putting the buffer here, we create one per (potentially thousands) 
of retained channels. But, by putting the buffer on the allocator, we hold one 
buffer per operator.
    
    Being the guy who wrote the current allocator code, I realize it is clunky. 
But, it is clunky because we favored memory frugality over elegance. (If there 
is a cleaner way to have one buffer per thread, I'm all for it!)
    
    Isn't the purpose of this change to avoid memory buffers? Seems, for the 
above reasons, we're moving backwards...


---

Reply via email to