[ https://issues.apache.org/jira/browse/HBASE-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080088#comment-13080088 ]
Anirudh Todi commented on HBASE-4168:
-------------------------------------

When the experiment described in the Description was failing, I inspected the
master's logs. The master had finished splitting the logs and had opened region
.META. on a new regionserver. It logged that it had detected the completed
assignment of META and was notifying the catalog tracker, and then it threw an
NPE while processing event M_META_SERVER_SHUTDOWN. Below is the stack trace:

java.lang.NullPointerException
	at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRegionLocation(CatalogTracker.java:434)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:271)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:323)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:363)
	at org.apache.hadoop.hbase.catalog.MetaReader.getServerUserRegions(MetaReader.java:566)
	at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:125)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:151)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)

> A client continues to try and connect to a powered down regionserver
> --------------------------------------------------------------------
>
>                 Key: HBASE-4168
>                 URL: https://issues.apache.org/jira/browse/HBASE-4168
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Anirudh Todi
>            Assignee: Anirudh Todi
>            Priority: Minor
>
> Experiment-1
> Started a dev cluster - META is on the same regionserver as my key-value. I
> kill the regionserver process but do not power down the machine.
> The META is able to migrate to a new regionserver and the regions are also
> able to reopen elsewhere.
> The client is able to talk to the META and find the new kv location and get
> it.
>
> Experiment-2
> Started a dev cluster - META is on a different regionserver from my key-value.
> I kill the regionserver process but do not power down the machine.
> The META remains where it is and the regions are also able to reopen
> elsewhere.
> The client is able to talk to the META and find the new kv location and get
> it.
>
> Experiment-3
> Started a dev cluster - META is on a different regionserver from my key-value.
> I power down the machine hosting this regionserver.
> The META remains where it is and the regions are also able to reopen
> elsewhere.
> The client is able to talk to the META and find the new kv location and get
> it.
>
> Experiment-4 (This is the problematic one)
> Started a dev cluster - META is on the same regionserver as my key-value. I
> power down the machine hosting this regionserver.
> The META is able to migrate to a new regionserver - however - it takes a
> really long time (~30 minutes).
> The regions on that regionserver DO NOT reopen (I waited for 1 hour).
> The client is able to find the new location of the META; however, the META
> keeps redirecting the client to the powered-down regionserver as the location
> of the key-value it is trying to get. Thus the client's get is unsuccessful.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira