Jeongdae Kim created HBASE-21418:
------------------------------------

             Summary: Reduce a number of reseek operations in MemstoreScanner 
when seek point is close to the current row.
                 Key: HBASE-21418
                 URL: https://issues.apache.org/jira/browse/HBASE-21418
             Project: HBase
          Issue Type: Improvement
          Components: scan, Scanners
    Affects Versions: 1.2.5
            Reporter: Jeongdae Kim
            Assignee: Jeongdae Kim


We observed “responseTooSlow” logs for Get requests in our production clusters. 
Some Get requests were even answered only after 10 seconds.
The affected Get requests specified a time range, and the target rows have many 
columns, each with several versions.
We reproduced the issue and found that it happens only when scanning the 
memstore. After flushing the HStore, the slow responses disappeared and the 
same Get requests were answered very quickly.
 
We investigated this case and found that the performance difference between the 
memstore scanner and the HFile scanner is caused by the number of reseek 
operations executed while scanning. When a store scanner needs to reseek to the 
next column, the HFile scanner decides whether it really has to reseek by 
checking whether the seek point is in the current block, whereas the memstore 
scanner always reseeks without any such check. In our case, almost all columns 
in the memstore have timestamps older than the time range of the scan (Get), so 
roughly one reseek occurs per column. This sporadically increases the response 
time of Get requests.
 
To improve the reseek operation of the memstore scanner, I think it is better 
to skip than to seek when a reseek is requested and the seek point is quite 
close to the cell the scanner currently points to. (In fact, when I changed 
MatchCode.SEEK_NEXT_COL to MatchCode.SKIP in our case, the response time of 
Get was 6x faster than before.) However, we can't tell whether the seek point 
is close to the current cell, because the memstore scanner has no information 
such as a next block index.
Before HBASE-13109, Scan.HINT_LOOKAHEAD was introduced to handle cases like 
this, and it may be deprecated someday. But I think that hint is still 
useful for the memstore scanner: it can try skipping first, before reseeking, 
and with this option we can make the memstore scanner's reseek smarter.
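
The skip-first idea can be sketched roughly as follows. This is only a toy 
model, not the patch and not HBase code: cells are plain longs in a 
ConcurrentSkipListSet, and the class name, the lookahead parameter and the 
seeks/skips counters are hypothetical names made up for illustration. The 
lookahead value plays the role that the Scan.HINT_LOOKAHEAD hint would play.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListSet;

/** Toy stand-in for a memstore scanner (hypothetical; not an HBase class). */
final class LookaheadReseekSketch {
    private final ConcurrentSkipListSet<Long> cells;
    private final int lookahead;   // max cells to step over before falling back to a seek
    private Iterator<Long> it;
    private Long current;          // cell the scanner points at, or null when exhausted
    int seeks = 0;                 // number of skip-list lookups (the expensive path)
    int skips = 0;                 // number of single-cell steps (the cheap path)

    LookaheadReseekSketch(ConcurrentSkipListSet<Long> cells, int lookahead) {
        this.cells = cells;
        this.lookahead = lookahead;
        this.it = cells.iterator();
        this.current = it.hasNext() ? it.next() : null;
    }

    Long peek() {
        return current;
    }

    /** Moves forward to the first cell >= target; returns false if there is none. */
    boolean reseek(long target) {
        // Cheap path: step over at most `lookahead` cells, hoping the target is near.
        for (int i = 0; i < lookahead && current != null && current < target; i++) {
            current = it.hasNext() ? it.next() : null;
            skips++;
        }
        if (current == null) {
            return false;          // ran off the end, so no cell >= target exists
        }
        if (current >= target) {
            return true;           // reached the target by skipping alone
        }
        // Expensive path: the target is farther away, so do a real skip-list lookup.
        it = cells.tailSet(target, true).iterator();
        current = it.hasNext() ? it.next() : null;
        seeks++;
        return current != null;
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<Long> cells = new ConcurrentSkipListSet<>();
        for (long i = 0; i < 1000; i++) {
            cells.add(i);
        }
        // Reseek to every following cell in turn, with and without lookahead.
        for (int lookahead : new int[] {0, 4}) {
            LookaheadReseekSketch scanner = new LookaheadReseekSketch(cells, lookahead);
            for (long target = 1; target < 1000; target++) {
                scanner.reseek(target);
            }
            System.out.println("lookahead=" + lookahead
                + " seeks=" + scanner.seeks + " skips=" + scanner.skips);
        }
    }
}
```

With a lookahead of 0 the sketch behaves like the current memstore scanner and 
performs a lookup on every reseek. With a small positive lookahead, nearby 
seek points are reached by stepping, and the lookup is only paid when the 
target is farther away than the lookahead.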
 
I tested this patch in our case and got the same result as when I changed the 
match code (mentioned above).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
