[ 
https://issues.apache.org/jira/browse/DRILL-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275249#comment-16275249
 ] 

ASF GitHub Bot commented on DRILL-6002:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1058#discussion_r154475785
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/cache/VectorSerializer.java 
---
    @@ -62,27 +72,65 @@ public Writer write(VectorAccessible va) throws 
IOException {
     
         @SuppressWarnings("resource")
         public Writer write(VectorAccessible va, SelectionVector2 sv2) throws 
IOException {
    +      checkNotNull(va);
           WritableBatch batch = WritableBatch.getBatchNoHVWrap(
               va.getRecordCount(), va, sv2 != null);
           return write(batch, sv2);
         }
     
         public Writer write(WritableBatch batch, SelectionVector2 sv2) throws 
IOException {
    -      VectorAccessibleSerializable vas;
    -      if (sv2 == null) {
    -        vas = new VectorAccessibleSerializable(batch, allocator);
    -      } else {
    -        vas = new VectorAccessibleSerializable(batch, sv2, allocator);
    -      }
    -      if (retain) {
    -        vas.writeToStreamAndRetain(stream);
    -      } else {
    -        vas.writeToStream(stream);
    +      return write(batch, sv2, false);
    +    }
    +
    +    public Writer write(WritableBatch batch, SelectionVector2 sv2, boolean 
retain) throws IOException {
    +      checkNotNull(batch);
    +      checkNotNull(channel);
    +      final Timer.Context timerContext = 
metrics.timer(WRITER_TIMER).time();
    +
    +      final DrillBuf[] incomingBuffers = batch.getBuffers();
    +      final UserBitShared.RecordBatchDef batchDef = batch.getDef();
    +
    +      try {
    +        /* Write the metadata to the file */
    +        batchDef.writeDelimitedTo(output);
    +
    +
    +        /* If we have a selection vector, dump it to file first */
    --- End diff --
    
    In the master branch, the `VectorSerializer` classes were slapped together 
as a simpler API on top of the existing `VectorAccessibleSerializable` classes. 
(That's why the code is so clunky.)
    
    Here, we turn the `VectorSerializer` into the means to do the writing, 
which seems like a good improvement.
    
    Does this mean we now have two implementations? The one here and the old 
one in `VectorAccessibleSerializable`? Do we need to clean up that split 
somehow? Else, should we leave comments to point people between the two copies 
so they know to make changes in both places?
    
    Looking at the other changes, it seems we leave the old "unmanaged" 
external sort to use the old implementation.


> Avoid memory copy from direct buffer to heap while spilling to local disk
> -------------------------------------------------------------------------
>
>                 Key: DRILL-6002
>                 URL: https://issues.apache.org/jira/browse/DRILL-6002
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Vlad Rozov
>            Assignee: Vlad Rozov
>
> When spilling to a local disk or to any file system that supports 
> WritableByteChannel it is preferable to avoid copy from off-heap to java heap 
> as WritableByteChannel can work directly with the off-heap memory.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to