[ https://issues.apache.org/jira/browse/HBASE-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-1921:
--------------------------------------

    Attachment: HBASE-1921.patch

Attached a patch that does what I described. Here is what you will see in the Master log when the session expires:

{code}2009-10-20 10:53:38,708 DEBUG org.apache.hadoop.hbase.master.HMaster: Got event None with path null
2009-10-20 10:53:39,997 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server /10.10.1.58:2181
2009-10-20 10:53:39,998 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/10.10.1.58:56099 remote=/10.10.1.58:2181]
2009-10-20 10:53:39,998 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2009-10-20 10:53:40,000 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x12472fd41f10004 to sun.nio.ch.selectionkeyi...@2afb6c5f
java.io.IOException: Session Expired
        at org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:589)
        at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:709)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
2009-10-20 10:53:40,000 DEBUG org.apache.hadoop.hbase.master.HMaster: Got event None with path null
2009-10-20 10:53:40,000 INFO org.apache.hadoop.hbase.master.HMaster: Master lost its znode, trying to get a new one
2009-10-20 10:53:40,000 INFO org.apache.zookeeper.ZooKeeper: Closing session: 0x12472fd41f10004
2009-10-20 10:53:40,000 INFO org.apache.zookeeper.ClientCnxn: Closing ClientCnxn for session: 0x12472fd41f10004
2009-10-20 10:53:40,001 INFO org.apache.zookeeper.ClientCnxn: Disconnecting ClientCnxn for session: 0x12472fd41f10004
2009-10-20 10:53:40,001 INFO org.apache.zookeeper.ZooKeeper: Session: 0x12472fd41f10004 closed
2009-10-20 10:53:40,001 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Closed connection with ZooKeeper
2009-10-20 10:53:40,003 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=10.10.1.58:2181 sessionTimeout=60000 watcher=Thread[HMaster,5,main]
2009-10-20 10:53:40,003 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server /10.10.1.58:2181
2009-10-20 10:53:40,005 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/10.10.1.58:56100 remote=/10.10.1.58:2181]
2009-10-20 10:53:40,006 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2009-10-20 10:53:40,009 DEBUG org.apache.hadoop.hbase.master.HMaster: Got event None with path null
2009-10-20 10:53:40,012 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Wrote master address 10.10.1.58:60000 to ZooKeeper
2009-10-20 10:53:40,016 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got 10.10.1.58:60000
2009-10-20 10:53:40,017 DEBUG org.apache.hadoop.hbase.master.HMaster: Checking cluster state...
2009-10-20 10:53:40,017 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/root-region-server got 10.10.1.58:60020
2009-10-20 10:53:40,019 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/rs/1256061062528 got 10.10.1.58:60020
2009-10-20 10:53:40,019 INFO org.apache.hadoop.hbase.master.HMaster: This is a failover, ZK inspection begins...
2009-10-20 10:53:40,020 DEBUG org.apache.hadoop.hbase.master.HMaster: Inspection found server 10.10.1.58
2009-10-20 10:53:40,022 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Updated ZNode /hbase/rs/1256061062528 with data 10.10.1.58:60020
2009-10-20 10:53:40,028 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: SetData of ZNode /hbase/root-region-server with 10.10.1.58:60020
2009-10-20 10:53:40,029 INFO org.apache.hadoop.hbase.master.HMaster: Inspection found 3 regions, with -ROOT-
2009-10-20 10:53:40,029 INFO org.apache.hadoop.hbase.master.HMaster: Found log folder : 10.10.1.58,60020,1256061062528
2009-10-20 10:53:40,029 INFO org.apache.hadoop.hbase.master.HMaster: Log folder belongs to an existing region server
2009-10-20 10:53:40,029 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2009-10-20 10:54:38,601 INFO org.apache.hadoop.hbase.master.ServerManager: 1 region servers, 0 dead, average load 3.0
2009-10-20 10:54:38,602 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {server: 10.10.1.58:60020, regionname: -ROOT-,,0, startKey: <>}
2009-10-20 10:54:38,607 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scanning meta region {server: 10.10.1.58:60020, regionname: .META.,,1, startKey: <>}
2009-10-20 10:54:38,611 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of 1 row(s) of meta region {server: 10.10.1.58:60020, regionname: -ROOT-,,0, startKey: <>} complete
2009-10-20 10:54:38,615 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of 1 row(s) of meta region {server: 10.10.1.58:60020, regionname: .META.,,1, startKey: <>} complete
2009-10-20 10:54:38,615 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
{code}
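
For anyone reading the log without the patch open, the recovery sequence it shows (close the expired session, open a new one, write the master address to /hbase/master again) can be sketched roughly as below. This is an illustrative sketch only, not the code in HBASE-1921.patch: Session, SessionFactory, FakeEnsemble and MasterZNodeRecovery are hypothetical names, and the in-memory fake stands in for a real ZooKeeper ensemble so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only. All type names here are hypothetical and chosen for
// illustration; they are not the classes touched by the patch.
class MasterZNodeRecovery {

    /** Hypothetical stand-in for a ZooKeeper session. */
    interface Session {
        void close();
        /** Creates an ephemeral node; returns false if the path already exists. */
        boolean createEphemeral(String path, String value);
        /** Returns the node's data, or null if the node does not exist. */
        String read(String path);
    }

    /** Hypothetical factory that opens a fresh session. */
    interface SessionFactory {
        Session connect();
    }

    static final String MASTER_ZNODE = "/hbase/master"; // path as seen in the log

    private final SessionFactory factory;
    private final String masterAddress;
    private Session session;

    MasterZNodeRecovery(SessionFactory factory, String masterAddress) {
        this.factory = factory;
        this.masterAddress = masterAddress;
        this.session = factory.connect();
    }

    /** Tries to register as master; true if this process created the znode. */
    boolean register() {
        return session.createEphemeral(MASTER_ZNODE, masterAddress);
    }

    /** Reaction to a "session expired" event: drop the old session, then re-register. */
    boolean onSessionExpired() {
        session.close();             // an expired session cannot be reused
        session = factory.connect(); // open a new session
        return register();           // false means another process holds the znode
    }

    /** In-memory fake so the sketch runs without a ZooKeeper ensemble. */
    static class FakeEnsemble implements SessionFactory {
        final Map<String, String> data = new HashMap<>();
        final Map<String, Session> owner = new HashMap<>();

        public Session connect() {
            return new Session() {
                public void close() {
                    // Closing a session deletes the ephemeral nodes it owns.
                    owner.entrySet().removeIf(
                        e -> e.getValue() == this && data.remove(e.getKey()) != null);
                }
                public boolean createEphemeral(String path, String value) {
                    if (data.putIfAbsent(path, value) != null) {
                        return false;
                    }
                    owner.put(path, this);
                    return true;
                }
                public String read(String path) {
                    return data.get(path);
                }
            };
        }

        /** Simulates server-side expiry of every session: all ephemeral nodes vanish. */
        void expireAll() {
            data.clear();
            owner.clear();
        }
    }

    public static void main(String[] args) {
        FakeEnsemble zk = new FakeEnsemble();
        MasterZNodeRecovery master = new MasterZNodeRecovery(zk, "10.10.1.58:60000");
        master.register();
        zk.expireAll(); // the Master's session times out
        boolean recovered = master.onSessionExpired();
        System.out.println("recovered=" + recovered + " znode=" + zk.data.get(MASTER_ZNODE));
    }
}
```

In this sketch a false return from onSessionExpired() means some other process already created the znode. What the Master should do in that case is not something the log above shows.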

> When the Master's session times out and there's only one, cluster is wedged
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-1921
>                 URL: https://issues.apache.org/jira/browse/HBASE-1921
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.2, 0.21.0
>
>         Attachments: HBASE-1921.patch
>
>
> On IRC, someone had a session expiration on his Master, and the cluster was running only one Master.
> Maybe in this case the Master should first try to re-acquire its znode?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
