[ 
https://issues.apache.org/jira/browse/HBASE-15716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15716:
--------------------------
    Attachment: Screen Shot 2016-04-27 at 9.49.35 AM.png
                hits.png
                15716.prune.synchronizations.v3.patch

This patch plugs the 'hole' identified in the above scenario: we get the mvcc 
readpoint at p1 during scanner creation, but before we can add ourselves to 
the region scannerReadPoints map, the readpoint moves forward to p2; then a 
call to getSmallestReadpoint comes in, and Cells between p1 and p2 are purged, 
corrupting our scan 'view'.

We plug the hole by doing a check and put and not progressing with the scanner 
creation until we are sure that what is registered in scannerReadPoints is the 
current readpoint. If it is not, we go around until what is in 
scannerReadPoints matches the current state of the mvcc read point.

The two reads of an atomic long (mvcc#getReadPoint) take the place of a lock: 
they are what gives us synchronization across the atomic long read and the 
scannerReadPoints.put update of the Map.
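
A minimal sketch of the check-and-put loop described above, for illustration 
only; it is not the code in the attached patch. The class ReadPointRegistry 
and the methods register, unregister and advanceTo are hypothetical names, the 
AtomicLong stands in for the mvcc read point, and a ConcurrentHashMap stands 
in for the region-scoped scannerReadPoints map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the region state discussed in this issue.
final class ReadPointRegistry {
    // Plays the role of the mvcc read point (mvcc#getReadPoint).
    private final AtomicLong readPoint = new AtomicLong();
    // Plays the role of the region-scoped scannerReadPoints map.
    private final ConcurrentHashMap<Object, Long> scannerReadPoints =
        new ConcurrentHashMap<>();

    // Register a scanner without taking a lock: read, put, then read again.
    // If the read point moved between the two reads, go around and retry.
    long register(Object scanner) {
        while (true) {
            long p1 = readPoint.get();
            scannerReadPoints.put(scanner, p1);
            if (readPoint.get() == p1) {
                return p1; // what is registered matches the current read point
            }
        }
    }

    void unregister(Object scanner) {
        scannerReadPoints.remove(scanner);
    }

    // The read point only ever moves forward.
    void advanceTo(long p) {
        readPoint.accumulateAndGet(p, Math::max);
    }

    // Smallest read point any registered scanner may still need.
    // The current read point is read before the map is walked.
    long getSmallestReadPoint() {
        long min = readPoint.get();
        for (long p : scannerReadPoints.values()) {
            min = Math.min(min, p);
        }
        return min;
    }

    public static void main(String[] args) {
        ReadPointRegistry region = new ReadPointRegistry();
        region.advanceTo(5);
        Object scanner = new Object();
        long view = region.register(scanner);
        region.advanceTo(9);
        System.out.println("scanner view: " + view);
        System.out.println("smallest read point: " + region.getSmallestReadPoint());
    }
}
```

In this sketch a purge that slips in between the first read and the put is 
caught by the second read, so the scanner retries with the newer read point 
instead of keeping a view whose Cells may already be gone.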

The difference in throughput is pretty dramatic: 220k ops/second vs 290k 
ops/second (about 30%). See the attached hits.png. I also include the flight 
recorder recording, which shows the lock incidence is gone.

Let me check my work by doing a few more runs. [~lhofhansl] what do you think 
of the latest patch? Can you find a hole in it?

> HRegion#RegionScannerImpl scannerReadPoints synchronization costs
> -----------------------------------------------------------------
>
>                 Key: HBASE-15716
>                 URL: https://issues.apache.org/jira/browse/HBASE-15716
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: stack
>         Attachments: 15716.prune.synchronizations.patch, 
> 15716.prune.synchronizations.v3.patch, Screen Shot 2016-04-26 at 2.05.45 
> PM.png, Screen Shot 2016-04-26 at 2.06.14 PM.png, Screen Shot 2016-04-26 at 
> 2.07.06 PM.png, Screen Shot 2016-04-26 at 2.25.26 PM.png, Screen Shot 
> 2016-04-26 at 6.02.29 PM.png, Screen Shot 2016-04-27 at 9.49.35 AM.png, 
> hits.png, remove_cslm.patch
>
>
> Here is a [~lhofhansl] special.
> When we construct the region scanner, we get our read point and then store it 
> with the scanner instance in a Region scoped CSLM. This is done under a 
> synchronize on the CSLM.
> This synchronize on a region-scoped Map when creating region scanners is the 
> outstanding point of lock contention according to flight recorder (my 
> workload is workload c, random reads).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
