[ https://issues.apache.org/jira/browse/HBASE-9602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans updated HBASE-9602:
--------------------------------------
    Status: Patch Available  (was: Open)

Heh, kinda forgot to hit Submit Patch.

bq. When can this happen? If it can happen, we would get an NPE.

When you try to open the web UI as the master is just starting. FWIW, that part of the namespace code is very prone to NPEs from what I can tell, and it can easily hang code paths that didn't hang before (because of retries, like in this jira).

> Cluster can't start when log splitting at startup time and the master's web UI is refreshed a few times
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9602
>                 URL: https://issues.apache.org/jira/browse/HBASE-9602
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.96.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.98.0, 0.96.0
>
>         Attachments: HBASE-9602.jstack.rtf, HBASE-9602.patch
>
>
> It looks like we cannot show the master's web UI at start time when there are logs to split, because we can't reach the namespace regions.
> This means you can't see how things are progressing without tailing the log while waiting for your cluster to boot up.
> This wasn't the case in 0.94.
> See this jstack:
> {noformat}
> "606214580@qtp-2001431298-3" prio=10 tid=0x00007f6ac8040000 nid=0x7b1 in Object.wait() [0x00007f6aa82bf000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       - waiting on <0x00000000bc0c1460> (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
>       at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1416)
>       - locked <0x00000000bc0c1460> (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
>       at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1634)
>       at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1691)
>       at org.apache.hadoop.hbase.protobuf.generated.MasterAdminProtos$MasterAdminService$BlockingStub.listTableDescriptorsByNamespace(MasterAdminProtos.java:35031)
>       at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.listTableDescriptorsByNamespace(HConnectionManager.java:2181)
>       at org.apache.hadoop.hbase.client.HBaseAdmin$22.call(HBaseAdmin.java:2265)
>       at org.apache.hadoop.hbase.client.HBaseAdmin$22.call(HBaseAdmin.java:2262)
>       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:116)
>       - locked <0x00000000c09baf20> (a org.apache.hadoop.hbase.client.RpcRetryingCaller)
>       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:94)
>       - locked <0x00000000c09baf20> (a org.apache.hadoop.hbase.client.RpcRetryingCaller)
>       at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3155)
>       at org.apache.hadoop.hbase.client.HBaseAdmin.listTableDescriptorsByNamespace(HBaseAdmin.java:2261)
>       at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmplImpl.__jamon_innerUnit__catalogTables(MasterStatusTmplImpl.java:461)
>       at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmplImpl.renderNoFlush(MasterStatusTmplImpl.java:270)
>       at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.renderNoFlush(MasterStatusTmpl.java:382)
>       at org.apache.hadoop.hbase.tmpl.master.MasterStatusTmpl.render(MasterStatusTmpl.java:372)
>       at org.apache.hadoop.hbase.master.MasterStatusServlet.doGet(MasterStatusServlet.java:95)
>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>       at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>       at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>       at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>       at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:850)
>       at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>       at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>       at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>       at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>       at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>       at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>       at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>       at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>       at org.mortbay.jetty.Server.handle(Server.java:326)
>       at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>       at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>       at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>       at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>       at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>       at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
>       at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> ...
> "RpcServer.handler=28,port=60000" daemon prio=10 tid=0x00007f6ad08a5800 nid=0x77e waiting on condition [0x00007f6aa97d5000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>       at java.lang.Thread.sleep(Native Method)
>       at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:148)
>       - locked <0x00000000c0909178> (a org.apache.hadoop.hbase.client.RpcRetryingCaller)
>       at org.apache.hadoop.hbase.client.HTable.get(HTable.java:760)
>       at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:171)
>       at org.apache.hadoop.hbase.master.TableNamespaceManager.getNamespaceTable(TableNamespaceManager.java:119)
>       - locked <0x00000000c1958890> (a org.apache.hadoop.hbase.master.TableNamespaceManager)
>       at org.apache.hadoop.hbase.master.TableNamespaceManager.get(TableNamespaceManager.java:155)
>       - locked <0x00000000c1958890> (a org.apache.hadoop.hbase.master.TableNamespaceManager)
>       at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3088)
>       at org.apache.hadoop.hbase.master.HMaster.listTableDescriptorsByNamespace(HMaster.java:3102)
>       at org.apache.hadoop.hbase.master.HMaster.listTableDescriptorsByNamespace(HMaster.java:3012)
>       at org.apache.hadoop.hbase.protobuf.generated.MasterAdminProtos$MasterAdminService$2.callBlockingMethod(MasterAdminProtos.java:32908)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2146)
>       at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1851)
> {noformat}
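
For illustration only, here is a minimal sketch of the kind of guard that would keep a status page from blocking in the situation the jstack shows (the servlet thread waiting on an RPC while the handler thread sleeps in a retry loop). This is not the attached HBASE-9602.patch and not HBase code: the class `StartupSafeStatus`, the method `tablesForStatusPage`, and the `masterInitialized` flag are all hypothetical names, and it assumes the caller can cheaply tell whether startup has finished.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names exist in HBase.
public class StartupSafeStatus {

    /**
     * Returns the table names to show on a status page, or an empty list
     * while startup is unfinished or no lookup is available.
     */
    static List<String> tablesForStatusPage(boolean masterInitialized,
                                            Supplier<List<String>> namespaceLookup) {
        if (!masterInitialized || namespaceLookup == null) {
            // Skip the lookup entirely: in the scenario described above it
            // would retry and block the page render, and a missing lookup
            // would otherwise cause a NullPointerException.
            return Collections.emptyList();
        }
        return namespaceLookup.get();
    }

    public static void main(String[] args) {
        Supplier<List<String>> lookup =
            () -> Arrays.asList("hbase:meta", "hbase:namespace");

        // While "starting up", the lookup is never invoked.
        System.out.println(tablesForStatusPage(false, lookup));
        // Once "initialized", the lookup result is returned as-is.
        System.out.println(tablesForStatusPage(true, lookup));
    }
}
```

The point of the sketch is only that the page renders something (here, an empty table list) instead of waiting on a call that cannot succeed until log splitting finishes.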