Sanjeet Malhotra created HBASE-29398:
----------------------------------------

             Summary: Server side scan metrics for bytes read from FS vs Block 
cache vs memstore
                 Key: HBASE-29398
                 URL: https://issues.apache.org/jira/browse/HBASE-29398
             Project: HBase
          Issue Type: Improvement
            Reporter: Sanjeet Malhotra
            Assignee: Sanjeet Malhotra


Currently, HBase has no server-side metric that counts how many bytes were 
read from FS vs block cache vs memstore. Latencies can vary drastically 
depending on whether cells are read from memory (block cache or memstore) or 
from FS.

Separate metrics for bytes scanned from block cache vs memstore are beneficial 
for use cases that read data immediately (e.g. within 5 sec) after writing it. 
There, the expectation is that bytes scanned from FS or block cache are zero 
unless a flush happened (which can be checked from the logs).
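
As a rough illustration of that expectation, the sketch below checks per-scan 
byte counts for a read-after-write workload. The class, field, and method names 
are hypothetical and do not come from the HBase code base; it only assumes that 
the three byte counts are available as plain numbers.

```java
// Hypothetical sketch: the class and field names are illustrative, not HBase APIs.
final class ReadAfterWriteCheck {

  /** Per-scan byte counts, split by where the bytes were read from. */
  static final class ScanBytes {
    final long fromFs;
    final long fromBlockCache;
    final long fromMemstore;

    ScanBytes(long fromFs, long fromBlockCache, long fromMemstore) {
      this.fromFs = fromFs;
      this.fromBlockCache = fromBlockCache;
      this.fromMemstore = fromMemstore;
    }
  }

  /** True when no bytes came from FS or block cache. */
  static boolean avoidedFsAndBlockCache(ScanBytes bytes) {
    return bytes.fromFs == 0 && bytes.fromBlockCache == 0;
  }

  public static void main(String[] args) {
    // Scan issued right after the write, before any flush.
    ScanBytes beforeFlush = new ScanBytes(0, 0, 2048);
    // Scan issued after a flush moved the data out of the memstore.
    ScanBytes afterFlush = new ScanBytes(65536, 0, 0);

    System.out.println("before flush ok: " + avoidedFsAndBlockCache(beforeFlush));
    System.out.println("after flush ok: " + avoidedFsAndBlockCache(afterFlush));
  }
}
```

A `false` result for a scan issued shortly after a write would be the cue to 
check the logs for a flush.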

Currently, HBase has a server-side scan metric `countOfBlockBytesScanned` which 
aims to capture the block bytes scanned by a read request. But there are a few 
gaps in the metric:


 * It doesn't account for block bytes scanned as part of 
KeyValueHeap#pollRealKV().
 * It doesn't account for index block bytes scanned or bloom filter bytes 
scanned.
 * It doesn't differentiate between bytes scanned from block cache vs FS.

The proposal is to add 3 new server side scan metrics, one each for: bytes 
scanned from FS, bytes scanned from block cache and bytes scanned from 
memstore. 
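
A minimal sketch of how three per-source byte counters could be kept, assuming 
one counter per source that is incremented as bytes are read. All names here 
(`ScanBytesBySource`, `Source`, `add`, `get`) are hypothetical and are not the 
metric names or APIs used by the actual change.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these names come from the HBase code base.
class ScanBytesBySource {

  /** Where the bytes backing a scan came from. */
  enum Source { FS, BLOCK_CACHE, MEMSTORE }

  // One counter per source; the map itself is never modified after construction.
  private final Map<Source, AtomicLong> counters = new EnumMap<>(Source.class);

  ScanBytesBySource() {
    for (Source source : Source.values()) {
      counters.put(source, new AtomicLong());
    }
  }

  /** Adds the given number of bytes to the counter for one source. */
  void add(Source source, long bytes) {
    counters.get(source).addAndGet(bytes);
  }

  /** Returns the bytes recorded so far for one source. */
  long get(Source source) {
    return counters.get(source).get();
  }

  public static void main(String[] args) {
    ScanBytesBySource metrics = new ScanBytesBySource();
    metrics.add(Source.FS, 65536);          // a block read from the file system
    metrics.add(Source.BLOCK_CACHE, 65536); // a block served from the block cache
    metrics.add(Source.BLOCK_CACHE, 4096);  // e.g. a cached index block
    metrics.add(Source.MEMSTORE, 120);      // cells read from the memstore
    for (Source source : Source.values()) {
      System.out.println(source + "=" + metrics.get(source));
    }
  }
}
```

Keeping the source as an explicit argument means index and bloom filter block 
reads can be attributed to FS or block cache the same way data block reads are.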

 

Currently, the aim is just to add these 3 new metrics and expose them via 
ServerSide scan metrics. Replacing `countOfBlockBytesScanned` with bytes 
scanned from FS and bytes scanned from block cache, and integrating the new 
metrics with HBase Quotas code, can be taken up separately.

 

I intend to cherry-pick this change to HBase 3 and HBase 2 (till HBase 2.5).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
