[ 
https://issues.apache.org/jira/browse/HBASE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-3789:
--------------------------------------

    Attachment: HBASE-3789-v3-0.90.patch

With the previous patch all the tests passed except for hbck. Looking deeper, I 
see hbck creates its own znodes so now the master doesn't see that. It's not 
clear to me why it's not using HBA.assign instead of the trickery with the 
HBCK_CODE_NAME.

This patch modifies hbck so that it uses "normal" tools provided by the master 
instead of bypassing it.

I'm also working on porting this to trunk. I got the previous patch I posted 
working there, but haven't done the hbck changes yet because hbck is different 
in trunk.

I also still haven't touched the splitting code in trunk.

> Cleanup the locking contention in the master
> --------------------------------------------
>
>                 Key: HBASE-3789
>                 URL: https://issues.apache.org/jira/browse/HBASE-3789
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.2
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3789-v2-0.90.patch, HBASE-3789-v3-0.90.patch, 
> HBASE-3789.patch
>
>
> The new master uses a lot of synchronized blocks to be safe, but it only 
> takes a few jstacks to see that there are multiple layers of lock contention 
> when a bunch of regions are moving (like when the balancer runs). The main 
> culprits are regionInTransition in AssignmentManager, ZKAssign that uses 
> ZKW.getZNnodes (basically another set of regions in transition), and locking 
> at the RegionState level.
> My understanding is that even though we have multiple threads to handle 
> regions in transition, everything is actually serialized. Most of the time, 
> lock holders are talking to ZK or a region server, which can take a few 
> milliseconds.
> A simple example is when AssignmentManager wants to update the timers for all 
> the regions on a RS: it will usually be waiting on another thread that's 
> holding the lock while talking to ZK.
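
The serialization described above can be reproduced with a small, self-contained 
sketch. This is not HBase code and not what the attached patches do. Every name 
in it (LockContentionSketch, CoarseTracker, NarrowTracker, slowRemoteCall) is 
hypothetical, and Thread.sleep stands in for a ZK or region server round trip. 
It only illustrates why worker threads end up running one at a time when a 
shared lock is held across a slow call, and how the same work overlaps when the 
slow call happens outside the lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LockContentionSketch {

    // Hypothetical stand-in for a ZooKeeper or region server round trip.
    static void slowRemoteCall(long millis) throws InterruptedException {
        Thread.sleep(millis);
    }

    // Coarse locking: the remote call runs while the shared monitor is held,
    // so concurrent callers are forced to run one after another.
    static class CoarseTracker {
        private final Map<String, Long> regionsInTransition = new HashMap<>();

        synchronized void transition(String region, long rpcMillis)
                throws InterruptedException {
            slowRemoteCall(rpcMillis);
            regionsInTransition.put(region, System.nanoTime());
        }

        synchronized int size() {
            return regionsInTransition.size();
        }
    }

    // Narrow locking: the remote call runs outside any lock and only the
    // map update is thread-safe, so concurrent callers can overlap.
    static class NarrowTracker {
        private final Map<String, Long> regionsInTransition = new ConcurrentHashMap<>();

        void transition(String region, long rpcMillis) throws InterruptedException {
            slowRemoteCall(rpcMillis);
            regionsInTransition.put(region, System.nanoTime());
        }

        int size() {
            return regionsInTransition.size();
        }
    }

    interface Transition {
        void run(String region) throws InterruptedException;
    }

    // Runs one transition per thread and returns the elapsed wall time in ms.
    static long runConcurrently(int threads, Transition transition) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            final String region = "region-" + i;
            pool.execute(() -> {
                try {
                    transition.run(region);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        CoarseTracker coarse = new CoarseTracker();
        NarrowTracker narrow = new NarrowTracker();
        long coarseMs = runConcurrently(8, r -> coarse.transition(r, 20));
        long narrowMs = runConcurrently(8, r -> narrow.transition(r, 20));
        System.out.println("coarse lock: " + coarseMs + " ms");
        System.out.println("narrow lock: " + narrowMs + " ms");
    }
}
```

With eight threads and a 20 ms simulated call, the coarse version should take 
roughly eight times the call latency, while the narrow version should finish in 
about one call latency plus scheduling overhead. Exact numbers depend on the 
machine.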

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
