[ https://issues.apache.org/jira/browse/HBASE-21879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766953#comment-16766953 ]
Zheng Hu edited comment on HBASE-21879 at 2/13/19 9:10 AM:
-----------------------------------------------------------
IMO, we can read the block into a ByteBuffer first for all kinds of IOEngine, and the ByteBuffer will be allocated from the BB pool as you said. Then the writer thread of BucketCache will persist the ByteBuffer to its corresponding engine, and finally the HFile block will free its ByteBuffer for reuse. This way, we can share the same code path. Yeah, use a MultiByteBuffer if the block is larger than one pooled buffer. Currently, we pre-allocate 2MB * 2 for each RPC handler; if we put the block in the ByteBufferPool, maybe we need a larger byte buffer for each handler. Another problem: the DFS client provides the DFSInputStream#read(ByteBuffer) interface, while we will allocate a ByteBuff (which is defined in our hbase package) if we use the ByteBufferPool. How should I read the bytes from a ByteBuffer into a ByteBuff? Maybe we can wrap a ByteBuff which implements the ByteBuffer interface?
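One wrinkle with the DFSInputStream#read(ByteBuffer) interface discussed above is that a single call may fill only part of the buffer, so a ByteBuffer-based read path needs a read-fully loop. Below is a minimal, self-contained sketch of that loop; it uses a plain java.nio ReadableByteChannel over an in-memory stream as a stand-in for DFSInputStream, and a direct buffer standing in for one leased from the pool. This is illustrative only, not the actual HBase/HDFS code:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public class BlockReadSketch {

    // Read exactly buf.remaining() bytes from the channel into buf,
    // looping because a single read(ByteBuffer) may return fewer bytes.
    static void readFully(ReadableByteChannel ch, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            if (ch.read(buf) < 0) {
                throw new IOException("Premature EOF: " + buf.remaining() + " bytes still expected");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Fake "on-disk block" bytes; a real reader would come from DFSInputStream.
        byte[] onDisk = new byte[8192];
        for (int i = 0; i < onDisk.length; i++) {
            onDisk[i] = (byte) i;
        }

        // A direct buffer stands in for one leased from the ByteBufferPool.
        ByteBuffer pooled = ByteBuffer.allocateDirect(onDisk.length);
        readFully(Channels.newChannel(new ByteArrayInputStream(onDisk)), pooled);
        pooled.flip();
        System.out.println("read=" + pooled.remaining()
            + " first=" + pooled.get(0)
            + " last=" + (pooled.get(8191) & 0xFF));
    }
}
```

The same loop works whether the destination is one pooled buffer or one segment of a multi-buffer block; the caller just hands each segment to readFully in turn.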
> Read HFile's block to ByteBuffer directly instead of to byte[] for reducing
> young gc purpose
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21879
>                 URL: https://issues.apache.org/jira/browse/HBASE-21879
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.3.0, 2.1.4
>
>         Attachments: QPS-latencies-before-HBASE-21879.png,
>                      gc-data-before-HBASE-21879.png
>
>
> In HFileBlock#readBlockDataInternal, we have the following:
> {code}
>   @VisibleForTesting
>   protected HFileBlock readBlockDataInternal(FSDataInputStream is, long offset,
>       long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum,
>       boolean updateMetrics) throws IOException {
>     // .....
>     // TODO: Make this ByteBuffer-based. Will make it easier to go to HDFS with BBPool (offheap).
>     byte[] onDiskBlock = new byte[onDiskSizeWithHeader + hdrSize];
>     int nextBlockOnDiskSize = readAtOffset(is, onDiskBlock, preReadHeaderSize,
>         onDiskSizeWithHeader - preReadHeaderSize, true, offset + preReadHeaderSize, pread);
>     if (headerBuf != null) {
>       // ...
>     }
>     // ...
>   }
> {code}
> In the read path, we still read the block from the HFile into an on-heap byte[], then copy the on-heap byte[] to the offheap bucket cache asynchronously, and in my 100% get performance test, I also observed some frequent young GC. The largest memory footprint in the young gen should be the on-heap block byte[].
> In fact, we can read HFile's block to a ByteBuffer directly instead of to a byte[] for reducing young gc purpose. We did not implement this before because there was no ByteBuffer reading interface in the older HDFS client, but 2.7+ has supported this now, so we can fix this now, I think.
> Will provide a patch and some perf-comparison for this.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)