Jeongdae Kim created HBASE-21418:
------------------------------------
Summary: Reduce a number of reseek operations in MemstoreScanner
when seek point is close to the current row.
Key: HBASE-21418
URL: https://issues.apache.org/jira/browse/HBASE-21418
Project: HBase
Issue Type: Improvement
Components: scan, Scanners
Affects Versions: 1.2.5
Reporter: Jeongdae Kim
Assignee: Jeongdae Kim
We observed “responseTooSlow” logs for Get requests in our production clusters.
Some Get requests even took more than 10 seconds to respond.
The affected Get requests specified a time range, and the target rows have many
columns, each with several versions.
We reproduced this issue and found that it happens only when scanning the
memstore. After flushing the HStore, the slow responses disappeared and the
same Get requests were answered very quickly.
We investigated this case and found that the performance difference between
the memstore scanner and the HFile scanner is caused by the number of reseek
operations executed while scanning. When a store scanner needs to reseek to the
next column, the HFile scanner decides whether it really has to reseek by
checking whether the seek point is in the current block, whereas the memstore
scanner always reseeks without making any such decision. In our case, almost
all columns in the memstore have timestamps older than the time range of the
scan (Get), so roughly one reseek occurs per column. This sporadically
increases the response time of Get requests.
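The reseek count can be illustrated with a small stand-alone sketch. This is
not HBase code: the Cell class, the ordering (column ascending, newest
timestamp first) and countReseeks are hypothetical names made up for the
illustration, and a plain ConcurrentSkipListSet stands in for the memstore.
It only shows that when every version of every column is older than the time
range, a scanner that always reseeks performs about one reseek per column.

```java
import java.util.Comparator;
import java.util.concurrent.ConcurrentSkipListSet;

public class ReseekCountSketch {
    // Hypothetical toy cell: a column index plus a timestamp (one row only).
    static final class Cell {
        final int column;
        final long timestamp;

        Cell(int column, long timestamp) {
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    // Assumed ordering: column ascending, then newest timestamp first.
    static final Comparator<Cell> ORDER = (a, b) -> {
        int byColumn = Integer.compare(a.column, b.column);
        return byColumn != 0 ? byColumn : Long.compare(b.timestamp, a.timestamp);
    };

    // Counts reseeks for a scan that only accepts timestamps >= minTimestamp.
    // Every cell written here has a timestamp in 1..versionsPerColumn.
    static int countReseeks(int columns, int versionsPerColumn, long minTimestamp) {
        ConcurrentSkipListSet<Cell> cells = new ConcurrentSkipListSet<>(ORDER);
        for (int c = 0; c < columns; c++) {
            for (long ts = 1; ts <= versionsPerColumn; ts++) {
                cells.add(new Cell(c, ts));
            }
        }

        int reseeks = 0;
        Cell current = cells.isEmpty() ? null : cells.first();
        while (current != null) {
            if (current.timestamp < minTimestamp) {
                // The newest remaining version is already too old, so the rest
                // of this column cannot match: jump to the next column.
                current = cells.ceiling(new Cell(current.column + 1, Long.MAX_VALUE));
                reseeks++;
            } else {
                // Cell is inside the time range; move on to the next cell.
                current = cells.higher(current);
            }
        }
        return reseeks;
    }

    public static void main(String[] args) {
        // 1000 columns, 3 versions each, all older than the time range.
        System.out.println(countReseeks(1000, 3, 100L)); // prints 1000
    }
}
```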
To improve the reseek operation of the memstore scanner, I think it is better
to skip than to seek when a reseek is requested and the seek point is quite
close to the cell the scanner is currently pointing at. (In fact, when I
changed MatchCode.SEEK_NEXT_COL to MatchCode.SKIP in our case, Get responses
became 6x faster than before.) However, we cannot decide whether the seek point
is close to the current cell, because the memstore scanner has no information
such as the next block index.
Scan.HINT_LOOKAHEAD was introduced before HBASE-13109 to handle cases like
this, and it may be deprecated someday. But I think that hint is still useful
for the memstore scanner: it lets the scanner try to skip first before
reseeking, and with this option we can make the memstore scanner's reseek
operations smarter.
I tested this patch in our case and got the same result as when I changed the
match code (mentioned above).
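The "skip first, then reseek" idea can be sketched roughly as follows. This is
only an illustration under assumptions, not the attached patch and not the
HBase API: Cell, ToyScanner, countReseeks and the lookahead parameter are
hypothetical names, and a ConcurrentSkipListSet stands in for the memstore.
On a reseek request the toy scanner first steps forward up to lookahead cells
and falls back to a real reseek only if the seek point has not been reached.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class LookaheadReseekSketch {
    // Hypothetical toy cell: a column index plus a timestamp (one row only).
    static final class Cell {
        final int column;
        final long timestamp;

        Cell(int column, long timestamp) {
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    // Assumed ordering: column ascending, then newest timestamp first.
    static final Comparator<Cell> ORDER = (a, b) -> {
        int byColumn = Integer.compare(a.column, b.column);
        return byColumn != 0 ? byColumn : Long.compare(b.timestamp, a.timestamp);
    };

    static final class ToyScanner {
        private final NavigableSet<Cell> cells;
        private Iterator<Cell> iterator;
        private Cell current;
        int reseeks = 0; // number of real reseeks performed

        ToyScanner(NavigableSet<Cell> cells) {
            this.cells = cells;
            this.iterator = cells.iterator();
            next();
        }

        Cell peek() {
            return current;
        }

        void next() {
            current = iterator.hasNext() ? iterator.next() : null;
        }

        // Try up to `lookahead` cheap skips before doing a real reseek.
        void reseek(Cell seekKey, int lookahead) {
            for (int i = 0; i < lookahead && current != null; i++) {
                next();
                if (current == null || ORDER.compare(current, seekKey) >= 0) {
                    return; // reached the seek point (or the end) by skipping
                }
            }
            if (current != null) {
                reseeks++;
                iterator = cells.tailSet(seekKey, true).iterator();
                next();
            }
        }
    }

    // Every cell written here has a timestamp in 1..versionsPerColumn.
    static int countReseeks(int columns, int versionsPerColumn,
                            long minTimestamp, int lookahead) {
        NavigableSet<Cell> cells = new ConcurrentSkipListSet<>(ORDER);
        for (int c = 0; c < columns; c++) {
            for (long ts = 1; ts <= versionsPerColumn; ts++) {
                cells.add(new Cell(c, ts));
            }
        }
        ToyScanner scanner = new ToyScanner(cells);
        while (scanner.peek() != null) {
            Cell cell = scanner.peek();
            if (cell.timestamp < minTimestamp) {
                // Column cannot match any more: ask for the next column.
                scanner.reseek(new Cell(cell.column + 1, Long.MAX_VALUE), lookahead);
            } else {
                scanner.next();
            }
        }
        return scanner.reseeks;
    }

    public static void main(String[] args) {
        System.out.println(countReseeks(1000, 3, 100L, 0)); // prints 1000
        System.out.println(countReseeks(1000, 3, 100L, 2)); // prints 1000
        System.out.println(countReseeks(1000, 3, 100L, 3)); // prints 0
    }
}
```

In this toy model the lookahead only helps once it is at least the number of
versions per column; a smaller value pays for the skips and still reseeks.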
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)