[ https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254159#comment-13254159 ]
Jonathan Hsieh commented on HBASE-5781: --------------------------------------- I looked into this briefly. The version I've used on production systems doesn't have this finally/close portion in it. HBASE-3777 (on trunk/0.92/0.94) and HBASE-4508 (backport of HBASE-3777) on 0.90 added this. I think if you remove the extra try-finally-close the -fix will work again, (but may leak resources). Can you give the modification a try? > Zookeeper session got closed while trying to assign the region to RS using > hbck -fix > ------------------------------------------------------------------------------------ > > Key: HBASE-5781 > URL: https://issues.apache.org/jira/browse/HBASE-5781 > Project: HBase > Issue Type: Bug > Components: hbck > Affects Versions: 0.94.0 > Reporter: Kristam Subba Swathi > Assignee: Jonathan Hsieh > > After running the hbck in the cluster ,it is found that one region is not > assigned > So the hbck -fix is used to fix this > But the assignment didnt happen since the zookeeper session is closed > Please find the attached trace for more details > ----------------------------------------- > Trying to fix unassigned region... > 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, > waiting for it to become assigned: {NAME => > 'ufdr,002300,1333379123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => > '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,} > 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: > Closed zookeeper sessionid=0x236738a2630000a > 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a2630000a closed > ERROR: Region { meta => > ufdr,010444,1333379123857.01594219211d0035b9586f98954462e1., hdfs => > hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed > => } not deployed on any region server. > Trying to fix unassigned region... > 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down > 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a2630000a Unable > to set watcher on znode (/hbase) > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase > at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626) > at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) > at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325) > at > org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109) > at > org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92) > at > org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235) > at > org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351) > at > org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114) > at > org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356) > at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375) > at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894) > 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: > hconnection-0x236738a2630000a Received unexpected KeeperException, > re-throwing exception > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /hbase > at org.apache.zookeeper.KeeperException.create(KeeperException.java:127) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626) > at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) > at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325) > at > org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109) > at > org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92) > at > org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235) > at > org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351) > at > org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114) > at > org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356) > at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375) > at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894) > 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: > This client just lost it's session with ZooKeeper, trying to reconnect. > 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: > Trying to reconnect to zookeeper > 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Initiating client connection, > connectString=10.18.40.21:2181,10.18.40.25:2181,10.18.40.93:2181 > sessionTimeout=60000 watcher=hconnection > 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: Opening socket connection to > server /10.18.40.93:2181 > 12/04/03 11:02:58 INFO zookeeper.RecoverableZooKeeper: The identifier of this > process is 18333@HOST-10-18-40-93 > 12/04/03 11:02:58 WARN client.ZooKeeperSaslClient: SecurityException: > java.lang.SecurityException: Unable to locate a login configuration occurred > when trying to find JAAS configuration. > 12/04/03 11:02:58 INFO client.ZooKeeperSaslClient: Client will not > SASL-authenticate because the default JAAS configuration section 'Client' > could not be found. If you are not using SASL, you may ignore this. On the > other hand, if you expected SASL to work, please fix your JAAS configuration. > 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: Socket connection established to > HOST-10-18-40-93/10.18.40.93:2181, initiating session > 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: Session establishment complete > on server HOST-10-18-40-93/10.18.40.93:2181, sessionid = 0x3367392d5140018, > negotiated timeout = 40000 > 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: > Reconnected successfully. This disconnect could have been caused by a network > partition or a long-running GC pause, either way it's recommended that you > verify your environment. > Exception in thread "main" org.apache.hadoop.hbase.MasterNotRunningException > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:686) > at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211) > at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325) > at > org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109) > at > org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92) > at > org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235) > at > org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351) > at > org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114) > at > org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356) > at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375) > at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894) > Please find the attached file for more details.. > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira