[ https://issues.apache.org/jira/browse/HBASE-15077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088782#comment-15088782 ]

ramkrishna.s.vasudevan commented on HBASE-15077:
------------------------------------------------

bq. ByteBufferSupportedDataOutputStream may not be backed by a BB; it might be 
backed by an array only. But it supports the BB write APIs. Am I making a point?
I went through the patch and the existing code. So previously the code 
retrieved the data byte by byte and wrote each byte to the stream, and now we 
avoid that and let the copy happen as a whole byte[] instead of one byte at a 
time? But the contents are still brought on heap, correct?
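
To make the question concrete, here is a minimal sketch of the chunked copy as I understand it. It is not the code from the patch: the class name ChunkedCopySketch, the helper's body and the 4 KB chunk size are assumptions for illustration only. It shows that the per-byte stream writes go away, but every byte of a direct buffer still passes through an on-heap byte[].

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical illustration, not the HBase implementation.
class ChunkedCopySketch {
  // Assumed chunk size; any reasonably small on-heap scratch buffer would do.
  private static final int CHUNK = 4096;

  public static void copyBufferToStream(OutputStream out, ByteBuffer in,
      int offset, int length) throws IOException {
    if (in.hasArray()) {
      // Heap buffer: hand the backing array to the stream in one call.
      out.write(in.array(), in.arrayOffset() + offset, length);
      return;
    }
    // Direct (off-heap) buffer: work on a duplicate so the caller's
    // position and limit are left untouched.
    ByteBuffer dup = in.duplicate();
    dup.limit(offset + length);
    dup.position(offset);
    byte[] chunk = new byte[Math.min(length, CHUNK)];
    while (dup.hasRemaining()) {
      int n = Math.min(chunk.length, dup.remaining());
      dup.get(chunk, 0, n);    // bulk copy off heap -> on heap
      out.write(chunk, 0, n);  // one stream call per chunk, not per byte
    }
  }

  public static void main(String[] args) throws IOException {
    ByteBuffer direct = ByteBuffer.allocateDirect(10000);
    for (int i = 0; i < 10000; i++) {
      direct.put(i, (byte) i);
    }
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    copyBufferToStream(sink, direct, 100, 9000);
    System.out.println("copied bytes: " + sink.size());
  }
}
```

So the gain here is fewer calls, not fewer copies; the data is still materialized on heap once per chunk.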
bq.userDataStream = new ByteBufferSupportedDataOutputStream(baosInMemory);
This is still backed by a ByteArrayOutputStream.
{code}
BufferGrabbingByteArrayOutputStream baosInMemoryCopy =
    new BufferGrabbingByteArrayOutputStream();
baosInMemory.writeTo(baosInMemoryCopy);
{code}
Was this removed because the new ByteArrayOutputStream has getBuffer?
In the case of a non-DBE block, when all the cells are off heap, can we not 
create a BBStream and write the BB underlying the BBStream directly to the 
FSOutputStream? Maybe that is future HDFS work: once FSOutputStream has a 
write() that accepts a BB as a parameter, we could call that?
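
A rough sketch of what I have in mind, under stated assumptions: the interface ByteBufferSink, the class DirectBufferOutputStream and the helper OffheapWriteSketch are hypothetical names invented for this example, not existing HBase or HDFS APIs. The only real API relied on for the final step is the JDK's WritableByteChannel#write(ByteBuffer), which stands in for the FSOutputStream write() that does not exist yet.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

/** Hypothetical capability: a sink that can take a ByteBuffer range directly. */
interface ByteBufferSink {
  void write(ByteBuffer src, int offset, int length) throws IOException;
}

/** Hypothetical stream that accumulates into an off-heap buffer, not a byte[]. */
class DirectBufferOutputStream extends OutputStream implements ByteBufferSink {
  private ByteBuffer buf;

  public DirectBufferOutputStream(int capacity) {
    buf = ByteBuffer.allocateDirect(capacity);
  }

  // Grow the off-heap buffer when fewer than 'extra' bytes remain.
  private void ensure(int extra) {
    if (buf.remaining() >= extra) {
      return;
    }
    int newCap = Math.max(buf.capacity() * 2, buf.position() + extra);
    ByteBuffer bigger = ByteBuffer.allocateDirect(newCap);
    buf.flip();
    bigger.put(buf);
    buf = bigger;
  }

  @Override
  public void write(int b) {
    ensure(1);
    buf.put((byte) b);
  }

  @Override
  public void write(byte[] b, int off, int len) {
    ensure(len);
    buf.put(b, off, len);
  }

  @Override
  public void write(ByteBuffer src, int offset, int length) {
    ensure(length);
    ByteBuffer dup = src.duplicate();  // leave the caller's buffer state alone
    dup.limit(offset + length);
    dup.position(offset);
    buf.put(dup);  // buffer-to-buffer bulk copy, no intermediate byte[]
  }

  /** Read-only view of the bytes written so far. */
  public ByteBuffer written() {
    ByteBuffer view = buf.duplicate();
    view.flip();
    return view.asReadOnlyBuffer();
  }

  /** Hands the accumulated buffer to a channel without copying to a byte[] here. */
  public void drainTo(WritableByteChannel ch) throws IOException {
    ByteBuffer view = written();
    while (view.hasRemaining()) {
      ch.write(view);
    }
  }
}

class OffheapWriteSketch {
  public static void copyBufferToStream(OutputStream out, ByteBuffer in,
      int offset, int length) throws IOException {
    if (out instanceof ByteBufferSink) {
      // The stream understands ByteBuffers: no on-heap staging needed.
      ((ByteBufferSink) out).write(in, offset, length);
    } else if (in.hasArray()) {
      out.write(in.array(), in.arrayOffset() + offset, length);
    } else {
      // Fallback for plain streams: one on-heap copy of the whole range.
      ByteBuffer dup = in.duplicate();
      dup.limit(offset + length);
      dup.position(offset);
      byte[] tmp = new byte[length];
      dup.get(tmp);
      out.write(tmp, 0, length);
    }
  }
}
```

Whether the last hop stays off heap depends entirely on what the channel (or a future FSOutputStream write) does with the buffer; a channel wrapped around an ordinary OutputStream would still copy internally.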


> Support OffheapKV write in compaction with out copying data on heap
> -------------------------------------------------------------------
>
>                 Key: HBASE-15077
>                 URL: https://issues.apache.org/jira/browse/HBASE-15077
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15077.patch
>
>
> HBASE-14832 is not enough to handle this. The remaining work is done here.
> {code}
> if (cell instanceof ByteBufferedCell) {
>   out.writeShort(rowLen);
>   ByteBufferUtils.copyBufferToStream(out, ((ByteBufferedCell) cell).getRowByteBuffer(),
>     ((ByteBufferedCell) cell).getRowPosition(), rowLen);
>   out.writeByte(fLen);
>   ByteBufferUtils.copyBufferToStream(out, ((ByteBufferedCell) cell).getFamilyByteBuffer(),
>     ((ByteBufferedCell) cell).getFamilyPosition(), fLen);
>   ByteBufferUtils.copyBufferToStream(out, ((ByteBufferedCell) cell).getQualifierByteBuffer(),
>     ((ByteBufferedCell) cell).getQualifierPosition(), qLen);
> {code}
> We have done this but it is not really helping us!  
> In ByteBufferUtils#copyBufferToStream
> {code}
> public static void copyBufferToStream(OutputStream out, ByteBuffer in,
>     int offset, int length) throws IOException {
>   if (in.hasArray()) {
>     out.write(in.array(), in.arrayOffset() + offset,
>         length);
>   } else {
>     for (int i = 0; i < length; ++i) {
>       out.write(toByte(in, offset + i));
>     }
>   }
> }
> {code}
> So for a DBB (direct ByteBuffer) this is a costly operation: each byte is 
> read onto the heap and written to the stream one at a time.
> Even if we use writeByteBuffer(OutputStream out, ByteBuffer b, int offset, 
> int length), it won't help us, because the underlying stream is a 
> ByteArrayOutputStream and so we still end up copying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
