[ https://issues.apache.org/jira/browse/HBASE-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123569#comment-13123569 ]
jirapos...@reviews.apache.org commented on HBASE-4540:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2251/#review2469
-----------------------------------------------------------

Ship it!

I'm good on commit. I have some suggestions for future handler tests below. I'm OK if we commit without addressing them here. Nice fix, Ram.

http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
<https://reviews.apache.org/r/2251/#comment5578>

    Don't we already have this method in our ZK* classes?

http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5579>

    Do you have to spin up the cluster twice? Could you start it once in @BeforeClass and shut it down in @AfterClass, so it's run only once?

http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java
<https://reviews.apache.org/r/2251/#comment5580>

    Good test. Would it be possible to test the handler without spinning up the cluster? See TestOpenRegionHandler under regionserver.handler in the tests; it doesn't spin up a cluster, just ZooKeeper. The test can run faster without DFS and HBase. Not important; something for the future.

- Michael

On 2011-10-08 05:13:32, ramkrishna vasudevan wrote:
bq.
bq. (Updated 2011-10-08 05:13:32)
bq.
bq. Review request for hbase, Ted Yu, Michael Stack, and Jonathan Gray.
bq.
bq. Summary
bq. -------
bq.
bq. Fix for handling HBASE-4539 and HBASE-4540.
bq. Ran all the testcases.
bq. Added one new testcase to verify OpenedRegionHandler scenarios.
bq. Also addresses Ted's comments.
bq.
bq. This addresses bug HBASE-4540.
bq. https://issues.apache.org/jira/browse/HBASE-4540
bq.
bq. Diffs
bq. -----
bq.
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 1179945
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java 1179945
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1179945
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 1179945
bq. http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestOpenedRegionHandler.java PRE-CREATION
bq.
bq. Diff: https://reviews.apache.org/r/2251/diff
bq.
bq. Testing
bq. -------
bq.
bq. Yes
bq.
bq. Thanks,
bq.
bq. ramkrishna

> OpenedRegionHandler is not enforcing atomicity of the operation it is performing
> --------------------------------------------------------------------------------
>
> Key: HBASE-4540
> URL: https://issues.apache.org/jira/browse/HBASE-4540
> Project: HBase
> Issue Type: Bug
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE-4540_1.patch
>
>
> -> OpenedRegionHandler has not yet deleted the znode of the region R1 opened by RS1.
> -> RS1 goes down.
> -> ServerShutdownHandler assigns the region R1 to RS2.
> -> The znode of R1 is moved to OFFLINE state by the master, or to OPENING state by RS2 if RS2 has started opening the region.
> -> Now the first OpenedRegionHandler tries to delete the znode, thinking it's in OPENED state, but fails.
> -> Though the delete fails, the handler still removes the region from RIT and adds RS1 as the owner of R1 in the master's memory.
> -> Now when RS2 completes opening the region, the master is not able to open the region because the region has already been deleted from RIT.
> {code}
> Master
> ======
> 2011-10-05 20:49:45,301 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Finished processing of shutdown of linux146,60020,1317827727647
> 2011-10-05 20:49:54,177 DEBUG org.apache.hadoop.hbase.master.HMaster: Not running balancer because 1 region(s) in transition: {3e69d628a8bd8e9b7c5e7a2a6e03aad9=t1,,1317827883842.3e69d628a8bd8e9b7c5e7a2a6e03aad9. state=PENDING_OPEN, ts=1317827985272, server=linux76,60020,1317827746847}
> 2011-10-05 20:49:57,720 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=linux76,60000,1317827742012, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:14,501 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Deleting existing unassigned node for 3e69d628a8bd8e9b7c5e7a2a6e03aad9 that is in expected state RS_ZK_REGION_OPENED
> 2011-10-05 20:50:14,505 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x132d3dc13090023 Attempting to delete unassigned node 3e69d628a8bd8e9b7c5e7a2a6e03aad9 in RS_ZK_REGION_OPENED state but node is in RS_ZK_REGION_OPENING state
>
> After the region is opened in RS2
> =================================
> 2011-10-05 20:50:48,066 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9, which is more than 15 seconds late
> 2011-10-05 20:50:48,290 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in the state null and not in expected PENDING_OPEN or OPENING states
> 2011-10-05 20:50:53,743 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=linux76,60020,1317827746847, region=3e69d628a8bd8e9b7c5e7a2a6e03aad9
> 2011-10-05 20:50:54,182 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
> 2011-10-05 20:50:54,397 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received OPENING for region 3e69d628a8bd8e9b7c5e7a2a6e03aad9 from server linux76,60020,1317827746847 but region was in the state null and not in expected PENDING_OPEN or OPENING states
> {code}
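
The ordering problem in the issue description can be sketched roughly as below. This is only an illustration, not the attached patch and not HBase code: the class and method names (OpenedRegionSketch, deleteOpenedNode, handleOpened) are hypothetical, and plain in-memory maps stand in for the ZooKeeper znodes, the RIT set, and the master's assignment map. The point is that the handler should touch the master's in-memory state only when the compare-and-delete of the OPENED znode actually succeeded.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class OpenedRegionSketch {
    // Stand-in for the unassigned znodes in ZooKeeper: region name -> znode state.
    final Map<String, String> znodes = new ConcurrentHashMap<>();
    // Stand-in for the master's regions-in-transition (RIT) set.
    final Set<String> regionsInTransition = ConcurrentHashMap.newKeySet();
    // Stand-in for the master's in-memory region -> server assignment.
    final Map<String, String> owners = new ConcurrentHashMap<>();

    /** Deletes the znode only if it is still in OPENED state; returns whether it did. */
    boolean deleteOpenedNode(String region) {
        // Map.remove(key, value) removes the entry only when the current value matches.
        return znodes.remove(region, "OPENED");
    }

    /** Hypothetical handler body: update in-memory state only after a successful delete. */
    boolean handleOpened(String region, String server) {
        if (!deleteOpenedNode(region)) {
            // The znode was moved to OFFLINE or OPENING by a reassignment,
            // so this handler is stale: leave RIT and the owner map untouched.
            return false;
        }
        regionsInTransition.remove(region);
        owners.put(region, server);
        return true;
    }

    public static void main(String[] args) {
        OpenedRegionSketch master = new OpenedRegionSketch();
        master.regionsInTransition.add("R1");
        // RS1 died and RS2 has already moved the znode to OPENING.
        master.znodes.put("R1", "OPENING");

        // The stale handler for RS1 runs late and must not change anything.
        System.out.println("stale handler applied: " + master.handleOpened("R1", "RS1"));
        System.out.println("R1 still in RIT: " + master.regionsInTransition.contains("R1"));

        // RS2 finishes opening; its handler finds the znode in OPENED state.
        master.znodes.put("R1", "OPENED");
        System.out.println("RS2 handler applied: " + master.handleOpened("R1", "RS2"));
        System.out.println("owner of R1: " + master.owners.get("R1"));
    }
}
```

In the failure reported above, the equivalent of handleOpened went on to update RIT and the owner even though the delete had failed, which is why the later OPENING and OPENED events from RS2 found the region in state null.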