[ https://issues.apache.org/jira/browse/HBASE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans updated HBASE-3789:
--------------------------------------

    Attachment: HBASE-3789-v3-0.90.patch

With the previous patch, all the tests passed except for hbck. Looking deeper, I see that hbck creates its own znodes, so now the master doesn't see them. It's not clear to me why it isn't using HBA.assign instead of the trickery with the HBCK_CODE_NAME. This patch modifies hbck so that it uses the "normal" tools provided by the master instead of bypassing it.

I'm also working on porting this to trunk. I got the previous patch I posted working, but I haven't done the hbck part yet because it's different. I also still haven't touched the splitting code in trunk.

> Cleanup the locking contention in the master
> --------------------------------------------
>
>                 Key: HBASE-3789
>                 URL: https://issues.apache.org/jira/browse/HBASE-3789
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.2
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3789-v2-0.90.patch, HBASE-3789-v3-0.90.patch,
>                      HBASE-3789.patch
>
>
> The new master uses a lot of synchronized blocks to be safe, but it only
> takes a few jstacks to see that there are multiple layers of lock contention
> when a bunch of regions are moving (like when the balancer runs). The main
> culprits are regionInTransition in AssignmentManager, ZKAssign that uses
> ZKW.getZNnodes (basically another set of regions in transition), and locking
> at the RegionState level.
> My understanding is that even though we have multiple threads to handle
> regions in transition, everything is actually serialized. Most of the time,
> lock holders are talking to ZK or a region server, which can take a few
> milliseconds.
> A simple example is when AssignmentManager wants to update the timers for all
> the regions on a RS: it will usually be waiting on another thread that's
> holding the lock while talking to ZK.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
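
For readers who want a concrete picture of the contention described in the issue, here is a minimal Java sketch. It is not taken from the patch or from the HBase code base: the class `TransitionTracker` and all of its methods are hypothetical names invented for illustration. It contrasts a handler that holds a shared lock while making a slow remote call (standing in for a trip to ZK or a region server) with one that makes the call first and takes the lock only to update in-memory state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not the AssignmentManager implementation.
public class TransitionTracker {
    private final ReentrantLock lock = new ReentrantLock();
    // Region name -> last timer value, guarded by `lock`.
    private final Map<String, Long> timers = new HashMap<>();

    /** Contended pattern: the slow remote call runs while the lock is held. */
    public void assignHoldingLock(String region, Runnable slowRemoteCall, long now) {
        lock.lock();
        try {
            slowRemoteCall.run();      // every other thread waits here
            timers.put(region, now);
        } finally {
            lock.unlock();
        }
    }

    /** Narrower pattern: the slow call runs first, the lock covers only the map update. */
    public void assignNarrowLock(String region, Runnable slowRemoteCall, long now) {
        slowRemoteCall.run();          // no lock held during the remote call
        lock.lock();
        try {
            timers.put(region, now);
        } finally {
            lock.unlock();
        }
    }

    /** Resets the timer of every tracked region; returns how many were updated. */
    public int updateTimers(long now) {
        lock.lock();
        try {
            timers.replaceAll((region, old) -> now);
            return timers.size();
        } finally {
            lock.unlock();
        }
    }

    /** Returns the current timer for a region, or null if it is not tracked. */
    public Long timerFor(String region) {
        lock.lock();
        try {
            return timers.get(region);
        } finally {
            lock.unlock();
        }
    }

    /** True if the calling thread currently holds the lock. */
    public boolean isLockedByCaller() {
        return lock.isHeldByCurrentThread();
    }
}
```

In the first variant, a concurrent `updateTimers` call has to wait for the whole remote round trip; in the second, it waits only for a map update. Narrowing the lock this way also changes what is atomic, so whether it is safe depends on the invariants the surrounding code relies on.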