[ 
https://issues.apache.org/jira/browse/HBASE-21879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack updated HBASE-21879:
----------------------------------
    Release Note: 
Before this issue, the read path was already 100% offheap when the block was in 
the BucketCache, but on a cache miss the RegionServer had to read the block 
through an on-heap API, which caused high young-GC pressure.

This issue makes the read offheap even when the block has to be read directly 
from the filesystem. It requires Hadoop >= 2.9.3, but it also works with older 
Hadoop versions (everything still functions; blocks are simply read on-heap as 
before). We have written a detailed doc about the implementation, performance, 
and practice here: 
https://docs.google.com/document/d/1xSy9axGxafoH-Qc17zbD2Bd--rWjjI00xTWQZ8ZwI_E/edit#heading=h.nch5d72p27ex
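
For illustration, here is a minimal sketch of the fallback behaviour the note 
describes. This is not the patch's actual code: the class and method names are 
hypothetical, and it assumes a Hadoop client that ships 
org.apache.hadoop.fs.ByteBufferPositionedReadable (the >= 2.9.3 requirement).

{code}
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

import org.apache.hadoop.fs.ByteBufferPositionedReadable;
import org.apache.hadoop.fs.FSDataInputStream;

public final class OffheapPreadSketch {
  private OffheapPreadSketch() {}

  /**
   * Positional read of buf.remaining() bytes at the given offset. Goes through
   * the ByteBuffer pread interface when the underlying stream provides it
   * (Hadoop >= 2.9.3); otherwise falls back to the old on-heap byte[] read,
   * so older Hadoop versions keep working.
   */
  public static void preadFully(FSDataInputStream in, long offset, ByteBuffer buf)
      throws IOException {
    InputStream wrapped = in.getWrappedStream();
    if (wrapped instanceof ByteBufferPositionedReadable) {
      ByteBufferPositionedReadable r = (ByteBufferPositionedReadable) wrapped;
      long pos = offset;
      while (buf.hasRemaining()) {
        int n = r.read(pos, buf); // fills buf in place; buf may be offheap
        if (n < 0) {
          throw new IOException("Premature EOF at offset " + pos);
        }
        pos += n;
      }
    } else {
      // Fallback: on-heap read plus one copy, as before this issue.
      byte[] heap = new byte[buf.remaining()];
      in.readFully(offset, heap, 0, heap.length);
      buf.put(heap);
    }
  }
}
{code}

On older clients the instanceof check fails and the read degrades to the 
on-heap copy, which matches the compatibility behaviour described above.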



> Read HFile's block to ByteBuffer directly instead of to byte[] to reduce 
> young GC
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21879
>                 URL: https://issues.apache.org/jira/browse/HBASE-21879
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.0
>
>         Attachments: HBASE-21879.v1.patch, HBASE-21879.v1.patch, 
> QPS-latencies-before-HBASE-21879.png, gc-data-before-HBASE-21879.png
>
>
> In HFileBlock#readBlockDataInternal, we have the following:
> {code}
> @VisibleForTesting
> protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
>     long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum,
>     boolean updateMetrics) throws IOException {
>   // .....
>   // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS
>   // with BBPool (offheap).
>   byte[] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize];
>   int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize,
>       onDiskSizeWithHeader - preReadHeaderSize, true,
>       offset + preReadHeaderSize, pread);
>   if (headerBuf != null) {
>     // ...
>   }
>   // ...
> }
> {code}
> On the read path, we still read the block from the HFile into an on-heap 
> byte[], and then copy that byte[] into the offheap bucket cache 
> asynchronously. In my 100% get performance test, I also observed frequent 
> young GCs; the largest memory footprint in the young gen should be the 
> on-heap block byte[].
> In fact, we can read the HFile's block into a ByteBuffer directly instead of 
> into a byte[] to reduce young GC. We did not implement this before because 
> the older HDFS client had no ByteBuffer reading interface, but 2.7+ supports 
> one now, so we can fix this.
> Will provide a patch and some perf comparisons for this.
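>
> A minimal sketch of the ByteBuffer-based read the TODO above points at, 
> assuming a Hadoop client whose FSDataInputStream supports 
> ByteBufferReadable#read(ByteBuffer) (available in 2.7+). The helper name 
> readBlockToBuffer is illustrative, not the patch's actual API:
> {code}
> import java.io.IOException;
> import java.nio.ByteBuffer;
> import org.apache.hadoop.fs.FSDataInputStream;
>
> // Illustrative sketch only: read one on-disk block into a direct ByteBuffer,
> // so no large byte[] is allocated in the young gen; the later copy into the
> // offheap BucketCache is then offheap-to-offheap. (For brevity this skips the
> // real code's trick of over-reading the next block's header, and uses the
> // stateful seek+read path instead of pread.)
> static ByteBuffer readBlockToBuffer(FSDataInputStream is, long offset,
>     int onDiskSizeWithHeader) throws IOException {
>   ByteBuffer block = ByteBuffer.allocateDirect(onDiskSizeWithHeader);
>   is.seek(offset);
>   while (block.hasRemaining()) {
>     // FSDataInputStream#read(ByteBuffer) comes from ByteBufferReadable.
>     if (is.read(block) < 0) {
>       throw new IOException("Premature EOF reading block at offset " + offset);
>     }
>   }
>   block.flip();
>   return block;
> }
> {code}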



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
