[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332476#comment-14332476 ]
Lars Hofhansl commented on HBASE-13082:
---------------------------------------

So at least we have established that locking in the StoreScanner is bad :)

I now remember issues we have seen with timerange scans, where in unlucky circumstances it takes almost 20 minutes to finish scanning a single region (with all of that time spent inside a *single* RegionScanner.next() call, as in this case no Cells matched the timerange).
So that would be 20 minutes(!) during which we would not be able to commit a flush or finish a compaction.

So now, I do not think that is acceptable. The RegionScanner lock is too coarse. We need something in between. Hmmm....

> Coarsen StoreScanner locks to RegionScanner
> -------------------------------------------
>
>                 Key: HBASE-13082
>                 URL: https://issues.apache.org/jira/browse/HBASE-13082
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 13082.txt
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking
> contract. For example Phoenix does not lock RegionScanner.nextRaw() as
> required in the documentation (not picking on Phoenix, this one is my fault
> as I told them it's OK)
> * possible starving of flushes and compaction with heavy read load.
> RegionScanner operations would keep getting the locks and the
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
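
For readers following the locking discussion above, here is a minimal sketch of the contract being described, written as a toy model and not as HBase code. ToyStoreScanner, ToyRegionScanner, commitFlush() and drain() are hypothetical names invented for illustration; only the idea (the inner scanner takes no lock and relies on the monitor of the outer scanner, so every nextRaw() caller must synchronize on the scanner itself) comes from the issue text.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical stand-ins, not HBase's RegionScanner/StoreScanner.
public class CoarseLockSketch {

    /** Inner scanner: no locking of its own, it relies on the caller's lock. */
    public static final class ToyStoreScanner {
        private List<Integer> cells; // guarded by the owning ToyRegionScanner's monitor
        private int pos = 0;

        public ToyStoreScanner(List<Integer> cells) { this.cells = cells; }

        public Integer next() {
            return pos < cells.size() ? cells.get(pos++) : null;
        }

        public void swapFiles(List<Integer> newCells) { this.cells = newCells; }
    }

    /** Outer scanner: its monitor is the single (coarse) lock. */
    public static final class ToyRegionScanner {
        private final ToyStoreScanner store;

        public ToyRegionScanner(List<Integer> cells) {
            this.store = new ToyStoreScanner(cells);
        }

        // Locked entry point for ordinary callers.
        public synchronized Integer next() { return nextRaw(); }

        // Unlocked entry point: the caller must already hold this object's monitor.
        public Integer nextRaw() { return store.next(); }

        // A flush/compaction commit needs the same monitor, so it waits for
        // as long as any caller keeps the scanner locked.
        public synchronized void commitFlush(List<Integer> newCells) {
            store.swapFiles(newCells);
        }
    }

    // What a coprocessor-style caller has to do to honor the contract.
    public static int drain(ToyRegionScanner scanner) {
        int count = 0;
        synchronized (scanner) { // held for the whole loop
            while (scanner.nextRaw() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        ToyRegionScanner scanner = new ToyRegionScanner(Arrays.asList(1, 2, 3));
        System.out.println(drain(scanner)); // prints 3
    }
}
```

In this toy model commitFlush() cannot run while drain() holds the monitor, which mirrors the concern in the comment: a single long-running call under the coarse lock would keep a flush or compaction from committing until it returns.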