Tak-Lon (Stephen) Wu created HBASE-24833:
--------------------------------------------

             Summary: Bootstrap should not delete the META table directory if 
it's not partial
                 Key: HBASE-24833
                 URL: https://issues.apache.org/jira/browse/HBASE-24833
             Project: HBase
          Issue Type: Umbrella
            Reporter: Tak-Lon (Stephen) Wu


This issue was discussed in 
[PR#2113|https://github.com/apache/hbase/pull/2113] as part of HBASE-24286, and 
it is a dependency that must be resolved before we can solve HBASE-24286.


The behavior was introduced in 
[HBASE-24471|https://github.com/apache/hbase/commit/4d5efec76718032a1e55024fd5133409e4be3cb8#diff-21659161b1393e6632730dcbea205fd8R70-R89], 
which added the notion of a partial meta. There, `partial` was defined as: 
InitMetaProcedure did not succeed and INIT_META_ASSIGN_META was not completed. 

{{ private static void writeFsLayout(Path rootDir, Configuration conf) throws IOException { 
   LOG.info("BOOTSTRAP: creating hbase:meta region"); 
   FileSystem fs = rootDir.getFileSystem(conf); 
   Path tableDir = CommonFSUtils.getTableDir(rootDir, TableName.META_TABLE_NAME); 
   if (fs.exists(tableDir) && !fs.delete(tableDir, true)) { 
     LOG.warn("Can not delete partial created meta table, continue..."); 
   } }}

However, consider the cloud use case where HFiles are stored on S3, WALs are 
stored on HDFS, and ZK data is stored within the cluster. Here, ZK data and 
WALs cannot be retained (the HDFS was associated with the cloud instances and 
was terminated together with them) when the cluster is recreated on top of the 
flushed HFiles, so the existing meta is always considered partial and is 
deleted in `INIT_META_WRITE_FS_LAYOUT` during bootstrap. As a result, the 
recreated cluster starts with an empty meta table. Either the cluster hangs 
during master initialization (branch-2) because the table states of the 
namespace table cannot be assigned, or it starts as a fresh cluster without 
any regions assigned or tables opened (HBCK may be needed to rebuild the 
meta).

Potential solution suggested by Anoop:

bq. In case of HM start and the bootstrap we create the ClusterID and write to 
FS and then to zk and then create the META table FS layout. So in a cluster 
recreate, we will see clusterID is there in FS and also the META FS layout but 
no clusterID in zk. Ya seems we can use this as indication for cluster recreate 
over existing data. In HM start, this is some thing we need to check at 1st 
itself and track. If this mode is true, later when (if) we do 
INIT_META_WRITE_FS_LAYOUT , we should not delete the META dir. As part of the 
Bootstrap when we write that proc to MasterProcWal, we can include this mode 
(boolean) info also. This is a protobuf message anyways. So even if this HM got 
killed and restarted (at a point where the clusterId was written to zk but the 
Meta FS layout part was not reached) we can use the info added as part of the 
bootstrap wal entry and make sure NOT to delete the meta dir.
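
The check suggested above can be sketched roughly as follows. This is only an 
illustration of the decision logic, not HBase code: the class 
MetaBootstrapSketch, the StartupState holder and both method names are 
hypothetical, and the real change would read the cluster ID from the file 
system and ZK and persist the flag in the InitMetaProcedure state message.

```java
import java.util.Optional;

/** Hypothetical sketch of the suggested check; none of these names exist in HBase. */
final class MetaBootstrapSketch {

  /** What the master observes at startup, before any bootstrap step has run. */
  static final class StartupState {
    final Optional<String> clusterIdOnFs; // cluster ID found on the file system, if any
    final Optional<String> clusterIdInZk; // cluster ID found in ZK, if any
    final boolean metaDirExists;          // whether the hbase:meta table directory exists

    StartupState(Optional<String> clusterIdOnFs, Optional<String> clusterIdInZk,
        boolean metaDirExists) {
      this.clusterIdOnFs = clusterIdOnFs;
      this.clusterIdInZk = clusterIdInZk;
      this.metaDirExists = metaDirExists;
    }
  }

  /**
   * Cluster recreate over existing data: the cluster ID and the META layout are
   * on the file system, but ZK has no cluster ID. This must be evaluated first
   * thing at master start, before the cluster ID is written to ZK.
   */
  static boolean isClusterRecreate(StartupState s) {
    return s.clusterIdOnFs.isPresent() && !s.clusterIdInZk.isPresent() && s.metaDirExists;
  }

  /**
   * Decision for the INIT_META_WRITE_FS_LAYOUT step. The recreate flag is assumed
   * to come from the persisted procedure state, so a master that is killed and
   * restarted after the cluster ID reached ZK still sees the original value.
   */
  static boolean shouldDeleteMetaDir(boolean metaDirExists, boolean persistedRecreateFlag) {
    return metaDirExists && !persistedRecreateFlag;
  }

  public static void main(String[] args) {
    StartupState recreated =
        new StartupState(Optional.of("example-cluster-id"), Optional.empty(), true);
    boolean recreateFlag = isClusterRecreate(recreated);
    System.out.println("recreate mode: " + recreateFlag);
    System.out.println("delete meta dir: " + shouldDeleteMetaDir(true, recreateFlag));
  }
}
```

The flag is computed once and passed along explicitly because, after a master 
restart, the cluster ID may already be in ZK and the startup check alone would 
no longer detect the recreate case.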



In this JIRA, we're going to fix the `partial` definition for the case where 
the cluster ID is found stored with the HFiles but the ZK data was deleted or 
is fresh when the cluster is created. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
