[ https://issues.apache.org/jira/browse/HBASE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100645#comment-13100645 ]
Hudson commented on HBASE-4015:
-------------------------------

Integrated in HBase-TRUNK #2189 (See [https://builds.apache.org/job/HBase-TRUNK/2189/])
    HBASE-4015 Refactor the TimeoutMonitor to make it less racy
HBASE-4015 Refactor the TimeoutMonitor to make it less racy
HBASE-4015 Refactor the TimeoutMonitor to make it less racy

stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/TimeOutManagerCallable.java

stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/TimeOutManagerCallable.java

stack :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenMetaHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRootHandler.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java

> Refactor the TimeoutMonitor to make it less racy
> ------------------------------------------------
>
>                 Key: HBASE-4015
>                 URL: https://issues.apache.org/jira/browse/HBASE-4015
>             Project: HBase
>          Issue Type: Sub-task
>   Affects Versions: 0.90.3
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4015_1_trunk.patch, HBASE-4015_2_trunk.patch,
>                      HBASE-4015_reprepared_trunk_2.patch, Timeoutmonitor with state diagrams.pdf
>
>
> The current implementation of the TimeoutMonitor acts like a race condition
> generator, mostly making things worse rather than better.
> It does its own
> thing for a while without caring about what's happening in the rest of the
> master.
> The first thing that needs to happen is that the regions should not be
> processed in one big batch, because that can sometimes take minutes to
> process (meanwhile a region that timed out opening might have opened, and
> the TimeoutMonitor will then reassign it, generating the
> never-ending PENDING_OPEN situation).
> Those operations should also be done more atomically, although I'm not sure
> how to do it in a scalable way in this case.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
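
The race described in the issue (a region opens while a long batch is still being processed, and is then reassigned anyway) can be illustrated with a small sketch. This is not the code from the attached patches: the class, the map, and the method names below are hypothetical stand-ins, and the only real API used is java.util.concurrent.ConcurrentHashMap. The idea shown is one possible way to make the per-region decision atomic: the monitor acts only if the entry it observed is still the entry in the map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not taken from HBase or the HBASE-4015 patches.
public class TimeoutCheckSketch {
    enum State { PENDING_OPEN, OPEN }

    // Immutable snapshot of a region's transition state (assumed shape).
    static final class RegionState {
        final State state;
        final long stampMs;
        RegionState(State state, long stampMs) {
            this.state = state;
            this.stampMs = stampMs;
        }
    }

    // Stand-in for the master's regions-in-transition bookkeeping.
    final Map<String, RegionState> inTransition = new ConcurrentHashMap<>();

    // Re-arms the region only if it is timed out AND nobody changed it
    // since 'observed' was read. Returns true when a reassign should happen.
    boolean reassignIfStillTimedOut(String region, RegionState observed,
                                    long nowMs, long timeoutMs) {
        if (observed.state != State.PENDING_OPEN
                || nowMs - observed.stampMs <= timeoutMs) {
            return false;
        }
        RegionState retry = new RegionState(State.PENDING_OPEN, nowMs);
        // Atomic compare-and-replace: fails if the entry is no longer the
        // observed instance (RegionState uses identity equality).
        return inTransition.replace(region, observed, retry);
    }

    public static void main(String[] args) {
        TimeoutCheckSketch monitor = new TimeoutCheckSketch();
        RegionState r1 = new RegionState(State.PENDING_OPEN, 0L);
        RegionState r2 = new RegionState(State.PENDING_OPEN, 0L);
        monitor.inTransition.put("r1", r1);
        monitor.inTransition.put("r2", r2);

        // r1 finishes opening while the monitor still holds the old snapshot.
        monitor.inTransition.put("r1", new RegionState(State.OPEN, 500L));

        System.out.println(monitor.reassignIfStillTimedOut("r1", r1, 2000L, 1000L));
        System.out.println(monitor.reassignIfStillTimedOut("r2", r2, 2000L, 1000L));
    }
}
```

In this sketch the stale snapshot of r1 is rejected because the map entry changed underneath it, while r2 is still re-armed. Whether a check of this kind scales to a real master with ZooKeeper-backed state is exactly the open question raised in the description.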