[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334488#comment-14334488
 ] 

stack commented on HBASE-13082:
-------------------------------

bq. So that would be 20 minutes during which we would not be able to commit a 
flush or finish a compaction.

1. In the above case, how many column families were there and, if > 1, how much 
of the 20 minutes was spent in each CF? If CF == 1, then there were probably no 
flushes or compactions going on anyway.  If CF > 1, were there even any 
flushes/compactions going on (were they needed)? I'd argue the patch proposed 
here probably makes the situation no worse when we have a scanner stuck down 
deep inside an HRegion for 20 minutes at a time.
2. Ain't a scanner stuck for 20 minutes a different issue altogether from the 
one being solved here?  If a scan disappears for 20 minutes trying to pull out 
a row, can't we do something like the [~jonathan.lawlor] chunking patch, only 
time-based?  We would return a partial result -- even if empty -- after 
scanning for a full minute, say?
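
To make the time-based idea in point 2 concrete, here is a minimal sketch. It is 
not HBase code and not the chunking patch: TimeBoundedScanSketch, Batch, 
nextBatch and the injected clock are all hypothetical names, and the "cells" 
are plain strings. It only shows the shape of it: scan until a time budget is 
spent, then hand back whatever was collected (possibly nothing) along with a 
flag telling the caller to come back for more.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical sketch; none of these names are HBase APIs. */
class TimeBoundedScanSketch {

    /** One batch of results plus a flag saying whether the scan has more to do. */
    static final class Batch {
        final List<String> cells;
        final boolean moreToScan;

        Batch(List<String> cells, boolean moreToScan) {
            this.cells = cells;
            this.moreToScan = moreToScan;
        }
    }

    /**
     * Pulls cells from the source until it is exhausted or the time budget is
     * spent. On timeout the partial (possibly empty) batch is returned with
     * moreToScan == true so the caller can call again.
     */
    static Batch nextBatch(Iterator<String> source, long budgetNanos, LongSupplier clock) {
        long deadline = clock.getAsLong() + budgetNanos;
        List<String> cells = new ArrayList<>();
        while (source.hasNext()) {
            // Overflow-safe deadline check, same idiom as with System.nanoTime().
            if (clock.getAsLong() - deadline >= 0) {
                return new Batch(cells, true);
            }
            cells.add(source.next());
        }
        return new Batch(cells, false);
    }

    public static void main(String[] args) {
        Iterator<String> source = Arrays.asList("a", "b", "c", "d", "e").iterator();
        // Fake clock for the demo: every reading advances time by one tick.
        long[] now = {0};
        LongSupplier clock = () -> now[0]++;
        Batch batch;
        do {
            batch = nextBatch(source, 3, clock);
            System.out.println(batch.cells + " more=" + batch.moreToScan);
        } while (batch.moreToScan);
    }
}
```

In real use the clock would be System::nanoTime and the budget something like 
the one minute floated above. The point is that the scanner gives up the lock 
between batches, so a flush or compaction gets a chance to commit.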

The region was probably really big. In HBase 2.0 we want to move to a realm 
where regions are small.  This patch is therefore good for 2.0 anyway?

It would be cool if we could do the lock on a Store-basis, especially given we 
can now flush at the Store level.
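
A rough sketch of what Store-basis locking could look like, purely as an 
illustration. The attached patch does not do this, and PerStoreLockSketch, 
beginScan, endScan and tryCommitFlush are made-up names. It assumes one 
read/write lock per column family: scanners hold the read lock of each Store 
they touch, and a flush commit takes the write lock of just the one Store whose 
file set it is changing.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of per-Store locking; not how HBase implements it. */
class PerStoreLockSketch {
    // One lock per column family (Store), created lazily.
    private final Map<String, ReentrantReadWriteLock> storeLocks = new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String family) {
        return storeLocks.computeIfAbsent(family, f -> new ReentrantReadWriteLock());
    }

    /** A scanner touching this Store takes the shared (read) lock. */
    void beginScan(String family) {
        lockFor(family).readLock().lock();
    }

    void endScan(String family) {
        lockFor(family).readLock().unlock();
    }

    /**
     * Tries to commit a flush for one Store only. Returns false without
     * running swapFiles if any scanner currently holds that Store's read lock.
     */
    boolean tryCommitFlush(String family, Runnable swapFiles) {
        ReentrantReadWriteLock.WriteLock writeLock = lockFor(family).writeLock();
        if (!writeLock.tryLock()) {
            return false;
        }
        try {
            swapFiles.run();
            return true;
        } finally {
            writeLock.unlock();
        }
    }

    public static void main(String[] args) {
        PerStoreLockSketch region = new PerStoreLockSketch();
        region.beginScan("cf1"); // a long-running scan over cf1 only
        System.out.println("flush cf2: " + region.tryCommitFlush("cf2", () -> { }));
        System.out.println("flush cf1: " + region.tryCommitFlush("cf1", () -> { }));
        region.endScan("cf1");
        System.out.println("flush cf1 after scan: " + region.tryCommitFlush("cf1", () -> { }));
    }
}
```

The trade-off is that this brings back per-Store lock acquisition on the scan 
path, which is the cost the issue is trying to remove, so it would have to be 
measured against the coarse RegionScanner lock.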



> Coarsen StoreScanner locks to RegionScanner
> -------------------------------------------
>
>                 Key: HBASE-13082
>                 URL: https://issues.apache.org/jira/browse/HBASE-13082
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>         Attachments: 13082.txt
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example, Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * possible starving of flushes and compactions under heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
