[ https://issues.apache.org/jira/browse/HBASE-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542356#comment-13542356 ]
Jimmy Xiang commented on HBASE-7407:
------------------------------------

Instead of calling sendRegionClose/sendRegionOpen directly, should we call unassign/open instead? They handle more error cases, such as ServerNotRunningYetException when the regionserver is online but not yet ready for RPC.

Another thing: RegionStates#regionOnline should be used to bring the region back online when we revert to the original state after sendRegionClose returns false. #updateRegionState will keep the region in transition.

> TestMasterFailover under tests some cases and over tests some others
> --------------------------------------------------------------------
>
>                 Key: HBASE-7407
>                 URL: https://issues.apache.org/jira/browse/HBASE-7407
>             Project: HBase
>          Issue Type: Bug
>          Components: master, Region Assignment, test
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>         Attachments: 7407.v1.patch, 7407.v2.patch
>
>
> The tests are run with these settings:
>     conf.setInt("hbase.master.assignment.timeoutmonitor.period", 2000);
>     conf.setInt("hbase.master.assignment.timeoutmonitor.timeout", 4000);
> As a result:
> 1) Some tests seem to work, but in real life the recovery would take 5
> minutes or more, because in production these values are always higher. So we
> don't see the real issues.
> 2) The tests include specific cases that should not happen in production. They
> pass because the timeout catches everything, but these scenarios do not need
> to be optimized, as they cannot happen.
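
To illustrate the error case raised in the comment (a regionserver that is online but not yet ready for RPC), here is a minimal, self-contained sketch of retrying a call while the server reports it is not ready. It is not HBase code: `RetryOnNotReady`, `callWithRetry`, and `ServerNotReadyException` are hypothetical names, with the exception standing in for ServerNotRunningYetException, and the retry count and pause are arbitrary assumptions.

```java
import java.util.concurrent.Callable;

public class RetryOnNotReady {

    /** Hypothetical stand-in for HBase's ServerNotRunningYetException. */
    public static class ServerNotReadyException extends Exception {
        public ServerNotReadyException(String msg) {
            super(msg);
        }
    }

    /**
     * Invokes the given call, retrying only when the server reports that it
     * is not ready yet. Any other exception propagates immediately.
     */
    public static <T> T callWithRetry(Callable<T> rpc, int maxAttempts, long pauseMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ServerNotReadyException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return rpc.call();
            } catch (ServerNotReadyException e) {
                last = e;
                // Pause before the next attempt, but not after the final one.
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        // All attempts failed with "not ready": surface the last failure.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated close request: the server is not ready for the first two calls.
        Boolean closed = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new ServerNotReadyException("server not ready for RPC");
            }
            return Boolean.TRUE;
        }, 5, 10L);
        System.out.println("closed=" + closed + " after " + calls[0] + " attempts");
    }
}
```

A direct call without this kind of handling would fail on the first attempt, and the test would then depend on the timeout monitor to recover.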