[ https://issues.apache.org/jira/browse/HBASE-7502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545505#comment-13545505 ]
Ted Yu edited comment on HBASE-7502 at 1/6/13 10:25 PM:
--------------------------------------------------------

Nice finding, Himanshu.

Some minor comments:
{code}
+  public List<RegionServerThread> getLiveRegionServers() {
+    return hbaseCluster.getLiveRegionServerThreads();
{code}
Since a List of RegionServerThread is returned, I think the new method should be named getLiveRegionServerThreads().
{code}
+          regionServerAlive = true;
+          Thread.sleep(100);
{code}
The sleep() is used to wait for the rs to drop out of the live region server list. I think 10 ms can be used instead of 100 ms.

      was (Author: yuzhih...@gmail.com):
    Nice finding, Himanshu.

Some minor comments:
{code}
+  public List<RegionServerThread> getLiveRegionServers() {
+    return hbaseCluster.getLiveRegionServerThreads();
{code}
Since a List of RegionServerThread is returned, I think the new method should be named getLiveRegionServerThreads().
{code}
+          regionServerAlive = true;
+          Thread.sleep(100);
{code}
Why is the sleep() needed?

> TestScannerTimeout fails on snapshot branch
> -------------------------------------------
>
>                 Key: HBASE-7502
>                 URL: https://issues.apache.org/jira/browse/HBASE-7502
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: hbase-7290
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>             Fix For: hbase-7290
>
>         Attachments: HBASE-7502-v1.patch
>
>
> TestScannerTimeout#test3686a fails consistently on the snapshot branch. This is
> because the number of watches on the rs znode has increased, so its
> deletion now takes longer. The repercussion is that when test3686a starts,
> it ensures that there are two regionservers and counts the aborted
> regionserver as a live one. While processing, it kills one of its servers, and
> the znode of the previously aborted server also expires. The overall effect is
> that there are no regionservers left, and the client hangs.
> {code}
> Error Message
> test timed out after 300000 milliseconds
> Stacktrace
> java.lang.Exception: test timed out after 300000 milliseconds
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.close(HConnectionManager.java:1769)
> 	at org.apache.hadoop.hbase.client.HTable.close(HTable.java:961)
> 	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:180)
> 	at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:54)
> 	at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:133)
> 	at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
> 	at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:360)
> {code}
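
For illustration only: the review comment above is about sleeping in a loop until the aborted rs is no longer in the live region server list. A minimal sketch of that kind of bounded polling wait is shown here. The class LiveServerWait, the method waitFor, and the plain string list standing in for the live server list are all hypothetical; this is not the code in HBASE-7502-v1.patch, and it does not use any HBase API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BooleanSupplier;

public class LiveServerWait {

    /**
     * Polls the condition every intervalMs until it holds or timeoutMs elapses.
     * Returns true if the condition became true in time, false on timeout
     * or if the waiting thread was interrupted.
     */
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical stand-in for the cluster's live region server list.
        List<String> liveServers = new CopyOnWriteArrayList<>(Arrays.asList("rs1", "rs2"));

        // Simulate the aborted server dropping out of the list a little later.
        Thread remover = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            liveServers.remove("rs1");
        });
        remover.start();

        // Poll every 10 ms, but never wait longer than 5 seconds in total.
        boolean gone = waitFor(() -> !liveServers.contains("rs1"), 5_000, 10);
        remover.join();
        System.out.println("rs1 removed from live list: " + gone);
    }
}
```

A shorter poll interval such as 10 ms mainly reduces how long the test waits after the server disappears; the overall timeout is what keeps the loop from hanging if it never does.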