[ 
https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171765#comment-13171765
 ] 

Jonathan Hsieh commented on HBASE-5063:
---------------------------------------

Here's the exception -- unfortunately it doesn't say which master it is unable 
to connect to.

{code}
11/12/17 18:50:24 WARN regionserver.HRegionServer: Unable to connect to master. Retrying. Error was:
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
        at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1024)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:876)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
        at $Proxy8.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
        at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
        at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
        at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
        at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1616)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:787)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:674)
        at java.lang.Thread.run(Thread.java:619)
{code}
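
For illustration only, here is a minimal sketch of a warning that names the master being contacted, which is the detail the log line above lacks. This is not the HBase code and not the attached patch: the class MasterConnectWarning, the helper describeFailure, and the host and port used in main are all hypothetical.

```java
import java.net.ConnectException;
import java.net.InetSocketAddress;

public class MasterConnectWarning {
    // Hypothetical helper (not part of HBase): formats a retry warning
    // that includes the host and port of the master we tried to reach.
    static String describeFailure(InetSocketAddress master, Exception cause) {
        return "Unable to connect to master at "
            + master.getHostString() + ":" + master.getPort()
            + ". Retrying. Error was: " + cause;
    }

    public static void main(String[] args) {
        // Placeholder address; createUnresolved avoids a DNS lookup.
        InetSocketAddress master =
            InetSocketAddress.createUnresolved("hm1.example.com", 60000);
        System.out.println(
            describeFailure(master, new ConnectException("Connection refused")));
    }
}
```

With the address in the message, it would be possible to tell from the log alone whether a regionserver is still retrying against the old master.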
                
> RegionServers fail to report to backup HMaster after primary goes down.
> -----------------------------------------------------------------------
>
>                 Key: HBASE-5063
>                 URL: https://issues.apache.org/jira/browse/HBASE-5063
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>            Priority: Critical
>         Attachments: HBASE-5063.patch
>
>
> # Setup cluster with two HMasters
> # Observe that HM1 is up and that all RS's are in the RegionServer list on 
> web page.
> # Kill (not even -9) the active HMaster
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active.  Tables may show up but RegionServers never 
> report on web page.  Existing connections are fine.  New connections cannot 
> find regionservers.
> Note: 
> * If we start a new HM1 in the same place and kill HM2, the cluster 
> functions normally again after recovery.  This seems to indicate that 
> regionservers are stuck trying to talk to the old HM1.

