Han Xiao created HBASE-10237:
--------------------------------

             Summary: Master restart, then followed by the MetaRegionServer crashed will result that the .meta. table won't online forever
                 Key: HBASE-10237
                 URL: https://issues.apache.org/jira/browse/HBASE-10237
             Project: HBase
          Issue Type: Bug
   Affects Versions: 0.94.12
            Reporter: Han Xiao

The following logs record this sequence:

// *-ROOT- and .META. were initially allocated to node209*
2013-12-23 10:45:34,130 INFO [MASTER_OPEN_REGION-node201.vipcloud,60000,1386903739776-3] handler.OpenedRegionHandler (OpenedRegionHandler.java:debugLog(145)) - Handling OPENED event for .META.,,1.1028785192 from node209.vipcloud,60020,1387272038024; deleting unassigned node
2013-12-23 14:53:36,268 INFO [MASTER_OPEN_REGION-node201.vipcloud,60000,1386903739776-4] handler.OpenedRegionHandler (OpenedRegionHandler.java:debugLog(145)) - Handling OPENED event for -ROOT-,,0.70236052 from node209.vipcloud,60020,1387272038024; deleting unassigned node

// *the master restarts*
2013-12-23 16:30:19 CST Starting master on node201.vipcloud

// *node209 registers with the new master*
2013-12-23 16:30:33,698 INFO [master-node201.vipcloud,60000,1387787422616] master.ServerManager (ServerManager.java:recordNewServer(280)) - Registering server=node209.vipcloud,60020,1387272038024

// *node209 goes away*
2013-12-23 16:30:37,106 INFO [main-EventThread] zookeeper.RegionServerTracker (RegionServerTracker.java:nodeDeleted(93)) - RegionServer ephemeral node deleted, processing expiration [node209.vipcloud,60020,1387272038024]

// *expiration of node209 is delayed because the master is still initializing*
2013-12-23 16:30:37,107 INFO [main-EventThread] master.ServerManager (ServerManager.java:expireServer(384)) - Master doesn't enable ServerShutdownHandler during initialization, delay expiring server node209.vipcloud,60020,1387272038024

// *-ROOT- is assigned to node209 based on the data in the zk node*
2013-12-23 16:30:42,120 INFO [master-node201.vipcloud,60000,1387787422616] master.HMaster (HMaster.java:assignRoot(756)) - -ROOT- assigned=0 , rit=false, location=node209.vipcloud,60020,1387272038024

// *the problem occurs when .META. is assigned to node209: validation passes first, but the check that the server is available then fails, so the in-memory regions structure is not updated for .META.*
s1:2013-12-23 16:30:42,322 INFO [master-node201.vipcloud,60000,1387787422616] master.AssignmentManager (AssignmentManager.java:regionOnline(1264)) - The server is not in online servers, ServerName=node209.vipcloud,60020,1387272038024, region=1028785192
s2:2013-12-23 16:30:42,323 INFO [master-node201.vipcloud,60000,1387787422616] master.HMaster (HMaster.java:assignMeta(814)) - .META. assigned=0 , rit=false, location=node209.vipcloud,60020,1387272038024

// *the node209 shutdown is handled: only -ROOT- is reassigned, not .META., because .META. was NOT updated in memory*
2013-12-23 16:31:35,978 INFO [MASTER_META_SERVER_OPERATIONS-node201.vipcloud,60000,1387787422616-0] handler.MetaServerShutdownHandler (MetaServerShutdownHandler.java:process(78)) - Server node209.vipcloud,60020,1387272038024 was carrying ROOT. Trying to assign.

// *verification of .META. fails (the location stored in -ROOT- is still node209), but .META. is never reassigned, so it never comes online*
2013-12-23 16:31:40,048 INFO [MASTER_SERVER_OPERATIONS-node201.vipcloud,60000,1387787422616-0] catalog.CatalogTracker (CatalogTracker.java:verifyRegionLocation(582)) - Failed verification of .META.,,1 at address=node209.vipcloud,60020,1387272038024; org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: node209.vipcloud/192.168.30.132:60020
2013-12-23 16:35:46,764 INFO [MASTER_SERVER_OPERATIONS-node201.vipcloud,60000,1387787422616-0] catalog.CatalogTracker (CatalogTracker.java:verifyRegionLocation(582)) - Failed verification of .META.,,1 at address=node209.vipcloud,60020,1387272038024; org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: node209.vipcloud/192.168.30.132:60020
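
The ordering problem described in the log comments above can be illustrated with a small toy model. This is only a sketch under assumptions: the class MetaAssignmentSketch and its methods (registerServer, serverWentAway, regionOnline, handleShutdown, simulate) are hypothetical names made up for illustration and are not the HBase source. It models a single idea from the report: if the server is already missing from the online set when the region is marked online, the in-memory map is not updated, so the later shutdown handling finds nothing to reassign.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the reported sequence. Hypothetical names; not HBase source code. */
public class MetaAssignmentSketch {
    // Servers the (toy) master currently considers online.
    private final Set<String> onlineServers = new HashSet<>();
    // In-memory region -> server map, a stand-in for the master's regions structure.
    private final Map<String, String> regions = new HashMap<>();

    void registerServer(String server) {
        onlineServers.add(server);
    }

    void serverWentAway(String server) {
        onlineServers.remove(server);
    }

    /** Records the region only if the server is online; returns whether it was recorded. */
    boolean regionOnline(String region, String server) {
        if (!onlineServers.contains(server)) {
            // Corresponds to the "server is not in online servers" message in the log:
            // the in-memory map is left untouched.
            return false;
        }
        regions.put(region, server);
        return true;
    }

    /** Returns the regions that the in-memory map says the dead server was carrying. */
    List<String> handleShutdown(String server) {
        List<String> toReassign = new ArrayList<>();
        for (Map.Entry<String, String> entry : regions.entrySet()) {
            if (entry.getValue().equals(server)) {
                toReassign.add(entry.getKey());
            }
        }
        regions.keySet().removeAll(toReassign);
        return toReassign;
    }

    /** Runs the sequence and returns what the shutdown handling would reassign. */
    static List<String> simulate(boolean serverLostBeforeAssign) {
        MetaAssignmentSketch master = new MetaAssignmentSketch();
        String server = "node209";
        master.registerServer(server);
        if (serverLostBeforeAssign) {
            master.serverWentAway(server); // server disappears during initialization
        }
        master.regionOnline(".META.", server);
        master.serverWentAway(server);     // no-op if the server is already gone
        return master.handleShutdown(server);
    }

    public static void main(String[] args) {
        System.out.println("server lost before assign: " + simulate(true));
        System.out.println("server lost after assign:  " + simulate(false));
    }
}
```

In this toy model, simulate(true) returns an empty list (nothing is reassigned, matching the reported symptom), while simulate(false) returns a list containing .META. The model says nothing about how -ROOT- is tracked or about what the actual fix should be.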