[ https://issues.apache.org/jira/browse/HBASE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569798#comment-13569798 ]
Himanshu Vashishtha commented on HBASE-7607:
--------------------------------------------

Interestingly, with this patch, the aborted regionserver is processed normally, and the test passes its normal phase. The problem is in the cluster shutdown process: sometimes the master is not able to process the death of the other regionserver, yet JVMClusterUtil considers the cluster shut down.

{code}
2013-01-30 19:40:14,048 INFO [RegionServer:0;localhost,49074,1359600001555] regionserver.HRegionServer(851): stopping server localhost,49074,1359600001555; zookeeper connection closed.
2013-01-30 19:40:14,048 INFO [RegionServer:0;localhost,49074,1359600001555] regionserver.HRegionServer(854): RegionServer:0;localhost,49074,1359600001555 exiting
2013-01-30 19:40:14,048 INFO [localhost,35387,1359600001393.timerUpdater] hbase.Chore(80): localhost,35387,1359600001393.timerUpdater exiting
2013-01-30 19:40:14,048 INFO [Shutdown of org.apache.hadoop.hbase.fs.HFileSystem@32d35f5f] hbase.MiniHBaseCluster$SingleFileSystemShutdownThread(182): Hook closing fs=org.apache.hadoop.hbase.fs.HFileSystem@32d35f5f
2013-01-30 19:40:14,049 INFO [main] util.JVMClusterUtil(262): Shutdown of 1 master(s) and 2 regionserver(s) complete
{code}

{code}
2013-01-30 19:40:14,168 INFO [RegionServer:0;localhost,49074,1359600001555.leaseChecker] regionserver.Leases(132): RegionServer:0;localhost,49074,1359600001555.leaseChecker closed leases
2013-01-30 19:40:14,227 INFO [Master:0;localhost,35387,1359600001393] master.ServerManager(357): Waiting on regionserver(s) to go down localhost,49074,1359600001555
{code}

But the master thread is still looping in its ServerManager#letRegionServersShutdown method, waiting to process the dead regionserver, an event it never receives. I am looking into why this happens only with this patch (the failure frequency is around 1/5).
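To illustrate the kind of wait described above, here is a minimal sketch of a poll loop that waits for a set of online servers to drain. It is not the actual ServerManager#letRegionServersShutdown code: the class name ShutdownWaitSketch, the method waitForServersDown, and the timeout parameters are all hypothetical, and the timeout is added here only so the sketch terminates. If the entry for the dying regionserver is never removed, the loop keeps polling, which matches the "Waiting on regionserver(s) to go down" line logged after the cluster was already reported as shut down.

```java
import java.util.Set;

// Hypothetical model of the wait described above; NOT HBase's ServerManager code.
final class ShutdownWaitSketch {

    /**
     * Polls until onlineServers is empty or timeoutMs elapses.
     * Returns true if every server was removed, false on timeout or interrupt.
     */
    static boolean waitForServersDown(Set<String> onlineServers,
                                      long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!onlineServers.isEmpty()) {
            if (System.nanoTime() - deadline >= 0) {
                // The server's death was never processed within the time limit.
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

Calling waitForServersDown with a set that still contains "localhost,49074,1359600001555" returns false once the timeout passes; calling it after the entry is removed returns true immediately. Without a bound like this, a missed expiry event would leave the caller looping indefinitely.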
> Fix TestRegionServerCoprocessorExceptionWithAbort flakiness in 0.94
> -------------------------------------------------------------------
>
>                 Key: HBASE-7607
>                 URL: https://issues.apache.org/jira/browse/HBASE-7607
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, test
>    Affects Versions: 0.94.4
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>             Fix For: 0.94.6
>
>         Attachments: HBASE-7607-v2.patch
>
>
> TestRegionServerCoprocessorExceptionWithAbort fails sometimes both on trunk
> and 0.94.X. The codebase is different in both.
> In 0.94.x, client retries to look at the root region, while the cluster is
> down and /hbase znode is no longer present.
> "Check the value configured in 'zookeeper.znode.parent'. There could be a
> mismatch with the one configured in the master."
> I will file a separate jira for the trunk as the code is different there.