[ https://issues.apache.org/jira/browse/HBASE-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080088#comment-13080088 ]
Anirudh Todi commented on HBASE-4168:
-------------------------------------
While the failing experiment from the Description (Experiment-4) was running, I
inspected the master's logs.
The master had finished splitting the logs and had opened region .META. on a new
regionserver. It logged that it had detected the completed assignment of .META. and
was notifying the catalog tracker, and then it threw an NPE while processing event
M_META_SERVER_SHUTDOWN. The stack trace was:
java.lang.NullPointerException
    at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRegionLocation(CatalogTracker.java:434)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:271)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:323)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:363)
    at org.apache.hadoop.hbase.catalog.MetaReader.getServerUserRegions(MetaReader.java:566)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:125)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:151)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
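The trace does not show which reference was null at CatalogTracker.java:434. Purely
as an illustration of the failure pattern (this is not the HBase code; the class
NullGuardSketch, the interface ServerConnection, and both methods are hypothetical
names), the sketch below contrasts a check that dereferences a connection it
assumes is present with one that treats a missing connection as "location not
verified":

```java
public class NullGuardSketch {
    /** Hypothetical stand-in for a connection to the server hosting .META. */
    interface ServerConnection {
        boolean isReachable();
    }

    /** Unguarded check: throws NullPointerException if no connection was obtained. */
    static boolean verifyUnguarded(ServerConnection conn) {
        return conn.isReachable();
    }

    /** Guarded check: a missing connection is reported as "not verified". */
    static boolean verifyGuarded(ServerConnection conn) {
        if (conn == null) {
            return false;
        }
        return conn.isReachable();
    }

    public static void main(String[] args) {
        // Assumption for illustration: the connection lookup yielded null,
        // for example because the host had been powered down.
        ServerConnection missing = null;

        System.out.println("guarded result: " + verifyGuarded(missing));
        try {
            verifyUnguarded(missing);
        } catch (NullPointerException e) {
            System.out.println("unguarded check threw NullPointerException");
        }
    }
}
```

Whether a guard of this kind is the right fix depends on what is actually null at
that line, which would need to be confirmed against the source.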
> A client continues to try and connect to a powered down regionserver
> --------------------------------------------------------------------
>
> Key: HBASE-4168
> URL: https://issues.apache.org/jira/browse/HBASE-4168
> Project: HBase
> Issue Type: Bug
> Reporter: Anirudh Todi
> Assignee: Anirudh Todi
> Priority: Minor
>
> Experiment-1
> Started a dev cluster - META is on the same regionserver as my key-value. I
> kill the regionserver process but do not power down the machine.
> The META is able to migrate to a new regionserver and the regions are also
> able to reopen elsewhere.
> The client is able to talk to the META and find the new kv location and get
> it.
> Experiment-2
> Started a dev cluster - META is on a different regionserver than my key-value.
> I kill the regionserver process but do not power down the machine.
> The META remains where it is and the regions are also able to reopen
> elsewhere.
> The client is able to talk to the META and find the new kv location and get
> it.
> Experiment-3
> Started a dev cluster - META is on a different regionserver than my key-value.
> I power down the machine hosting this regionserver.
> The META remains where it is and the regions are also able to reopen
> elsewhere.
> The client is able to talk to the META and find the new kv location and get
> it.
> Experiment-4 (This is the problematic one)
> Started a dev cluster - META is on the same regionserver as my key-value. I
> power down the machine hosting this regionserver.
> The META is able to migrate to a new regionserver; however, it takes a
> really long time (~30 minutes).
> The regions on that regionserver DO NOT reopen (I waited for 1 hour).
> The client is able to find the new location of the META; however, the META
> keeps redirecting the client to the powered-down regionserver as the
> location of the key-value it is trying to get. Thus the client's get is
> unsuccessful.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira