[ https://issues.apache.org/jira/browse/HBASE-12072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-12072:
----------------------------------
    Attachment: hbase-12072_v3.patch

Rebased patch. Let's see what Hadoop QA says.

> We are doing 35 x 35 retries for master operations
> --------------------------------------------------
>
>                 Key: HBASE-12072
>                 URL: https://issues.apache.org/jira/browse/HBASE-12072
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.6
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 0.99.2
>
>         Attachments: 12072-v1.txt, 12072-v2.txt, hbase-12072_v1.patch, hbase-12072_v2.patch, hbase-12072_v2.patch, hbase-12072_v3.patch
>
>
> For master requests, there are two retry mechanisms in effect. The first one is from HBaseAdmin.executeCallable():
> {code}
>   private <V> V executeCallable(MasterCallable<V> callable) throws IOException {
>     RpcRetryingCaller<V> caller = rpcCallerFactory.newCaller();
>     try {
>       return caller.callWithRetries(callable);
>     } finally {
>       callable.close();
>     }
>   }
> {code}
> And inside, the other one is from StubMaker.makeStub():
> {code}
>     /**
>      * Create a stub against the master. Retry if necessary.
>      * @return A stub to do <code>intf</code> against the master
>      * @throws MasterNotRunningException
>      */
>     @edu.umd.cs.findbugs.annotations.SuppressWarnings (value="SWL_SLEEP_WITH_LOCK_HELD")
>     Object makeStub() throws MasterNotRunningException {
> {code}
> The tests will just hang for 10 min * 35 ~= 6 hours.
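
To make the "10 min * 35" arithmetic concrete, here is a rough sketch of how nesting the two retry loops multiplies the wait. The class name `NestedRetryEstimate`, the 100 ms base pause, and the backoff multipliers are assumptions read off the rounded sleep values in the log for this issue, not values taken from the HBase source. The sketch ignores jitter, the time each failed attempt itself takes, and any sleeps added by the outer caller.

```java
import java.util.concurrent.TimeUnit;

public class NestedRetryEstimate {
    // Assumed base pause in milliseconds (the first sleep in the log is 100).
    static final long PAUSE_MS = 100;

    // Assumed backoff multipliers, inferred from the rounded sleeps in the log
    // (100, 200, 300, 500, 1000, 2000, 4000, then 10000 four times, then 20000).
    static final int[] BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

    /** Sleep after the given zero-based failed attempt, without jitter. */
    static long sleepMillis(int failedAttempt) {
        // Past the end of the table, keep using the last multiplier.
        int idx = Math.min(failedAttempt, BACKOFF.length - 1);
        return PAUSE_MS * BACKOFF[idx];
    }

    /** Total sleep for one inner cycle; the final attempt gives up instead of sleeping. */
    static long innerCycleMillis(int maxAttempts) {
        long total = 0;
        for (int attempt = 0; attempt < maxAttempts - 1; attempt++) {
            total += sleepMillis(attempt);
        }
        return total;
    }

    /** Worst case when an outer retry loop reruns the whole inner cycle on every try. */
    static long nestedMillis(int outerAttempts, int innerAttempts) {
        return outerAttempts * innerCycleMillis(innerAttempts);
    }

    public static void main(String[] args) {
        long inner = innerCycleMillis(35);
        long nested = nestedMillis(35, 35);
        System.out.println("one inner cycle: " + TimeUnit.MILLISECONDS.toSeconds(inner) + " s");
        System.out.println("35 x 35 worst case: " + TimeUnit.MILLISECONDS.toMinutes(nested) + " min");
    }
}
```

Under these assumptions one inner cycle sleeps for about 8.5 minutes, which is consistent with the 16:19:05 to 16:27:36 span of the first cycle in the log, and 35 such cycles come to roughly 5 hours, the same order of magnitude as the estimate above.
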
> {code}
> 2014-09-23 16:19:05,151 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 1 of 35 failed; retrying after sleep of 100, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:05,253 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 2 of 35 failed; retrying after sleep of 200, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:05,456 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 3 of 35 failed; retrying after sleep of 300, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:05,759 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 4 of 35 failed; retrying after sleep of 500, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:06,262 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 5 of 35 failed; retrying after sleep of 1008, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:07,273 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 6 of 35 failed; retrying after sleep of 2011, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:09,286 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 7 of 35 failed; retrying after sleep of 4012, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:13,303 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 8 of 35 failed; retrying after sleep of 10033, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:23,343 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 9 of 35 failed; retrying after sleep of 10089, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:33,439 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 10 of 35 failed; retrying after sleep of 10027, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:43,473 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 11 of 35 failed; retrying after sleep of 10004, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:19:53,485 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 12 of 35 failed; retrying after sleep of 20160, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:20:13,656 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 13 of 35 failed; retrying after sleep of 20006, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:20:33,675 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 14 of 35 failed; retrying after sleep of 20076, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:20:53,762 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 15 of 35 failed; retrying after sleep of 20077, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:21:13,852 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 16 of 35 failed; retrying after sleep of 20103, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:21:33,967 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 17 of 35 failed; retrying after sleep of 20136, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:21:54,115 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 18 of 35 failed; retrying after sleep of 20147, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:22:14,274 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 19 of 35 failed; retrying after sleep of 20131, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:22:34,417 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 20 of 35 failed; retrying after sleep of 20171, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:22:54,601 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 21 of 35 failed; retrying after sleep of 20177, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:23:14,790 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 22 of 35 failed; retrying after sleep of 20193, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:23:34,996 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 23 of 35 failed; retrying after sleep of 20195, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:23:55,203 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 24 of 35 failed; retrying after sleep of 20107, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:24:15,322 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 25 of 35 failed; retrying after sleep of 20186, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:24:35,520 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 26 of 35 failed; retrying after sleep of 20106, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:24:55,638 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 27 of 35 failed; retrying after sleep of 20173, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:25:15,824 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 28 of 35 failed; retrying after sleep of 20136, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:25:35,973 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 29 of 35 failed; retrying after sleep of 20188, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:25:56,174 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 30 of 35 failed; retrying after sleep of 20144, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:26:16,330 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 31 of 35 failed; retrying after sleep of 20106, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:26:36,448 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 32 of 35 failed; retrying after sleep of 20003, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:26:56,463 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 33 of 35 failed; retrying after sleep of 20114, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:16,590 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 34 of 35 failed; retrying after sleep of 20154, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:36,756 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 35 of 35 failed; no more retrying.
> java.io.IOException: Can't get master address from ZooKeeper; znode data == null
>       at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:114)
>       at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1554)
>       at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1599)
>       at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1653)
>       at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1860)
>       at org.apache.hadoop.hbase.client.HBaseAdmin$MasterCallable.prepare(HBaseAdmin.java:3359)
>       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:122)
>       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:92)
>       at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3386)
>       at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:2201)
>       at org.apache.hadoop.hbase.DistributedHBaseCluster.getClusterStatus(DistributedHBaseCluster.java:74)
>       at org.apache.hadoop.hbase.DistributedHBaseCluster.<init>(DistributedHBaseCluster.java:57)
>       at org.apache.hadoop.hbase.IntegrationTestingUtility.createDistributedHBaseCluster(IntegrationTestingUtility.java:140)
>       at org.apache.hadoop.hbase.IntegrationTestingUtility.initializeCluster(IntegrationTestingUtility.java:75)
>       at org.apache.hadoop.hbase.IntegrationTestManyRegions.setUp(IntegrationTestManyRegions.java:80)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>       at org.junit.runners.Suite.runChild(Suite.java:127)
>       at org.junit.runners.Suite.runChild(Suite.java:26)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
>       at org.apache.hadoop.hbase.IntegrationTestsDriver.doWork(IntegrationTestsDriver.java:110)
>       at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:112)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>       at org.apache.hadoop.hbase.IntegrationTestsDriver.main(IntegrationTestsDriver.java:46)
> 2014-09-23 16:27:37,061 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 1 of 35 failed; retrying after sleep of 100, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:37,163 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 2 of 35 failed; retrying after sleep of 200, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:37,365 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 3 of 35 failed; retrying after sleep of 301, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:37,669 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 4 of 35 failed; retrying after sleep of 504, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:38,176 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 5 of 35 failed; retrying after sleep of 1008, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:39,185 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 6 of 35 failed; retrying after sleep of 2018, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:41,207 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 7 of 35 failed; retrying after sleep of 4019, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:45,231 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 8 of 35 failed; retrying after sleep of 10004, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:27:55,241 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 9 of 35 failed; retrying after sleep of 10005, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:28:05,253 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 10 of 35 failed; retrying after sleep of 10099, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:28:15,359 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 11 of 35 failed; retrying after sleep of 10059, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:28:25,425 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 12 of 35 failed; retrying after sleep of 20069, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:28:45,507 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 13 of 35 failed; retrying after sleep of 20006, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:29:05,525 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 14 of 35 failed; retrying after sleep of 20186, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:29:25,723 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 15 of 35 failed; retrying after sleep of 20080, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:29:45,814 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 16 of 35 failed; retrying after sleep of 20001, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:30:05,826 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 17 of 35 failed; retrying after sleep of 20019, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:30:25,857 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 18 of 35 failed; retrying after sleep of 20159, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:30:46,028 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 19 of 35 failed; retrying after sleep of 20170, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:31:06,211 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 20 of 35 failed; retrying after sleep of 20146, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:31:26,368 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 21 of 35 failed; retrying after sleep of 20138, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:31:46,518 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 22 of 35 failed; retrying after sleep of 20140, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:32:06,670 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 23 of 35 failed; retrying after sleep of 20196, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:32:26,878 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 24 of 35 failed; retrying after sleep of 20123, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> 2014-09-23 16:32:47,013 INFO [main] client.ConnectionManager$HConnectionImplementation: getMaster attempt 25 of 35 failed; retrying after sleep of 20033, exception=java.io.IOException: Can't get master address from ZooKeeper; znode data == null
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)