[ https://issues.apache.org/jira/browse/HBASE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans updated HBASE-3789:
--------------------------------------

    Attachment:     (was: HBASE-3789-trunk-wip.patch)

> Cleanup the locking contention in the master
> --------------------------------------------
>
>                 Key: HBASE-3789
>                 URL: https://issues.apache.org/jira/browse/HBASE-3789
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.2
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3789-trunk.patch, HBASE-3789-v3-0.90.patch
>
>
> The new master uses a lot of synchronized blocks to be safe, but it only
> takes a few jstacks to see that there are multiple layers of lock contention
> when a bunch of regions are moving (like when the balancer runs). The main
> culprits are regionInTransition in AssignmentManager, ZKAssign that uses
> ZKW.getZNnodes (basically another set of regions in transition), and locking
> at the RegionState level.
> My understanding is that even though we have multiple threads to handle
> regions in transition, everything is actually serialized. Most of the time,
> lock holders are talking to ZK or a region server, which can take a few
> milliseconds.
> A simple example: when AssignmentManager wants to update the timers for all
> the regions on a RS, it will usually be waiting on another thread that's
> holding the lock while talking to ZK.
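
To illustrate the pattern described above, here is a minimal sketch in Java. It is not taken from the attached patches or from the HBase source: RegionTimerSketch, RemoteCall and every method name are hypothetical, and the callback only stands in for a ZK or region server round trip. The first method makes the slow call while holding the monitor, so any other thread that needs the map (for example the timer update) has to wait. The second copies the keys under the lock and makes the slow calls afterwards, which is one possible way to shorten the critical section, not necessarily the approach the fix takes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from HBase. */
class RegionTimerSketch {
    // Stand-in for a regions-in-transition map: region name -> last timer stamp.
    private final Map<String, Long> regionsInTransition = new HashMap<>();

    /** Stand-in for a ZK or region server round trip (assumed to be slow). */
    interface RemoteCall {
        void run(String region);
    }

    void add(String region, long stamp) {
        synchronized (regionsInTransition) {
            regionsInTransition.put(region, stamp);
        }
    }

    /** Contended pattern: the slow call runs while the monitor is held. */
    void notifyAllHoldingLock(RemoteCall call) {
        synchronized (regionsInTransition) {
            for (String region : regionsInTransition.keySet()) {
                call.run(region);
            }
        }
    }

    /** Alternative: copy the keys under the lock, make the slow calls outside it. */
    void notifyAllOutsideLock(RemoteCall call) {
        List<String> snapshot;
        synchronized (regionsInTransition) {
            snapshot = new ArrayList<>(regionsInTransition.keySet());
        }
        for (String region : snapshot) {
            call.run(region);
        }
    }

    /** Timer update: needs the same monitor, so it waits on any current holder. */
    int updateTimers(long now) {
        synchronized (regionsInTransition) {
            for (Map.Entry<String, Long> entry : regionsInTransition.entrySet()) {
                entry.setValue(now);
            }
            return regionsInTransition.size();
        }
    }

    /** Returns the stamp for a region, or null if the region is not tracked. */
    Long stampOf(String region) {
        synchronized (regionsInTransition) {
            return regionsInTransition.get(region);
        }
    }

    /** True if the calling thread currently holds the map's monitor. */
    boolean holdsLock() {
        return Thread.holdsLock(regionsInTransition);
    }

    public static void main(String[] args) {
        RegionTimerSketch sketch = new RegionTimerSketch();
        sketch.add("region-a", 1L);
        sketch.add("region-b", 2L);
        sketch.notifyAllHoldingLock(region ->
            System.out.println(region + " called with lock held: " + sketch.holdsLock()));
        sketch.notifyAllOutsideLock(region ->
            System.out.println(region + " called with lock held: " + sketch.holdsLock()));
    }
}
```

With the second variant, a concurrent updateTimers call only waits for the key copy rather than for every remote round trip. The trade-off is that the snapshot can be stale by the time the remote calls run, so the caller has to tolerate regions that have since left the map.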