[ https://issues.apache.org/jira/browse/HBASE-6958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13470874#comment-13470874 ]
Ted Yu commented on HBASE-6958:
-------------------------------

Making timeout longer, I found that there was trouble scanning .META.
Here is jstack:
{code}
"RunAmJoinCluster" prio=5 tid=0x00007ffc32041000 nid=0x6d03 waiting on condition [0x0000000113fd6000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:194)
	at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:371)
	at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:218)
	at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:127)
	at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:668)
	at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:567)
	at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:181)
	at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:141)
	at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:2163)
	at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:320)
	at org.apache.hadoop.hbase.master.TestAssignmentManager$2.run(TestAssignmentManager.java:1087)
{code}

> TestAssignmentManager fails in trunk
> ------------------------------------
>
>                 Key: HBASE-6958
>                 URL: https://issues.apache.org/jira/browse/HBASE-6958
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>             Fix For: 0.96.0
>
>
> From https://builds.apache.org/job/HBase-TRUNK/3432/testReport/junit/org.apache.hadoop.hbase.master/TestAssignmentManager/testBalanceOnMasterFailoverScenarioWithOpenedNode/ :
> {code}
> Stacktrace
> java.lang.Exception: test timed out after 5000 milliseconds
> 	at java.lang.System.arraycopy(Native Method)
> 	at java.lang.ThreadGroup.remove(ThreadGroup.java:969)
> 	at java.lang.ThreadGroup.threadTerminated(ThreadGroup.java:942)
> 	at java.lang.Thread.exit(Thread.java:732)
> ...
> 2012-10-06 00:46:12,521 DEBUG [MASTER_CLOSE_REGION-mockedAMExecutor-0] zookeeper.ZKUtil(1141): mockedServer-0x13a33892de7000e Retrieved 81 byte(s) of data from znode /hbase/unassigned/dc01abf9cd7fd0ea256af4df02811640 and set watcher; region=t,,1349484359011.dc01abf9cd7fd0ea256af4df02811640., state=M_ZK_REGION_OFFLINE, servername=master,1,1, createTime=1349484372509, payload.length=0
> 2012-10-06 00:46:12,522 ERROR [MASTER_CLOSE_REGION-mockedAMExecutor-0] executor.EventHandler(205): Caught throwable while processing event RS_ZK_REGION_CLOSED
> java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.master.TestAssignmentManager$MockedLoadBalancer.randomAssignment(TestAssignmentManager.java:773)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:1709)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:1666)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1435)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1155)
> 	at org.apache.hadoop.hbase.master.TestAssignmentManager$AssignmentManagerWithExtrasForTesting.assign(TestAssignmentManager.java:1035)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1130)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1125)
> 	at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:106)
> 	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:202)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> 	at java.lang.Thread.run(Thread.java:722)
> 2012-10-06 00:46:12,522 DEBUG [pool-1-thread-1-EventThread] master.AssignmentManager(670): Handling transition=M_ZK_REGION_OFFLINE, server=master,1,1,
> region=dc01abf9cd7fd0ea256af4df02811640, current state from region state map ={t,,1349484359011.dc01abf9cd7fd0ea256af4df02811640. state=OFFLINE, ts=1349484372508, server=null}
> {code}
> Looks like NPE happened on this line:
> {code}
>       this.gate.set(true);
> {code}