[ https://issues.apache.org/jira/browse/HBASE-24833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tak-Lon (Stephen) Wu updated HBASE-24833:
-----------------------------------------
    Affects Version/s: 2.3.3
                       2.3.1
                       3.0.0-alpha-1
                       2.3.0

> Bootstrap should not delete the META table directory if it's not partial
> ------------------------------------------------------------------------
>
>                 Key: HBASE-24833
>                 URL: https://issues.apache.org/jira/browse/HBASE-24833
>             Project: HBase
>          Issue Type: Umbrella
>    Affects Versions: 3.0.0-alpha-1, 2.3.0, 2.3.1, 2.3.3
>            Reporter: Tak-Lon (Stephen) Wu
>            Priority: Major
>
> This issue was discussed in
> [PR#2113|https://github.com/apache/hbase/pull/2113] as part of HBASE-24286,
> and it is a dependency that must be resolved before we solve HBASE-24286.
> The behavior was introduced in
> [HBASE-24471|https://github.com/apache/hbase/commit/4d5efec76718032a1e55024fd5133409e4be3cb8#diff-21659161b1393e6632730dcbea205fd8R70-R89],
> which added the notion of a partial meta: meta is `partial` when
> InitMetaProcedure did not succeed and INIT_META_ASSIGN_META was not completed.
> {code:java}
> private static void writeFsLayout(Path rootDir, Configuration conf) throws IOException {
>   LOG.info("BOOTSTRAP: creating hbase:meta region");
>   FileSystem fs = rootDir.getFileSystem(conf);
>   Path tableDir = CommonFSUtils.getTableDir(rootDir, TableName.META_TABLE_NAME);
>   if (fs.exists(tableDir) && !fs.delete(tableDir, true)) {
>     LOG.warn("Can not delete partial created meta table, continue...");
>   }
> {code}
> However, in the cloud use case where HFiles are stored on S3, WALs are stored
> on HDFS, and ZK data is stored within the cluster, this partial-meta handling
> becomes a blocker when a cluster is recreated on existing HFiles. Here, ZK data
> and WALs cannot be retained (HDFS was associated with the cloud instance and
> was terminated together with it) when the cluster is recreated on the flushed
> HFiles, so the existing meta is always considered partial and is deleted in
> `INIT_META_WRITE_FS_LAYOUT` during bootstrap.
> As a result, the recreated cluster starts with an empty meta table: either
> the cluster hangs during master initialization (branch-2) because the table
> states of the namespace table cannot be assigned, or it starts as a fresh
> cluster without any regions assigned or tables opened (HBCK may be needed to
> rebuild the meta).
> Potential solution suggested by Anoop:
> {quote}In case of HM start and the bootstrap we create the ClusterID and
> write to FS and then to zk and then create the META table FS layout. So in a
> cluster recreate, we will see clusterID is there in FS and also the META FS
> layout but no clusterID in zk. Ya seems we can use this as indication for
> cluster recreate over existing data. In HM start, this is some thing we need
> to check at 1st itself and track. If this mode is true, later when (if) we do
> INIT_META_WRITE_FS_LAYOUT , we should not delete the META dir. As part of the
> Bootstrap when we write that proc to MasterProcWal, we can include this mode
> (boolean) info also. This is a protobuf message anyways. So even if this HM
> got killed and restarted (at a point where the clusterId was written to zk
> but the Meta FS layout part was not reached) we can use the info added as
> part of the bootstrap wal entry and make sure NOT to delete the meta dir.
> {quote}
> In this JIRA, we're going to fix the `partial` definition for the case where
> the cluster ID is found stored with the HFiles but the ZK data has been
> deleted or is fresh at cluster creation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
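
The check Anoop describes in the quoted suggestion can be sketched roughly as follows. This is only an illustration under stated assumptions, not the HBase implementation: the class and method names (`MetaBootstrapSketch`, `isClusterRecreate`, `writeFsLayout`) are hypothetical, `java.nio.file` stands in for the Hadoop `FileSystem`, and the ZooKeeper state is modeled as a plain boolean. In the actual proposal the mode flag would be persisted with the bootstrap procedure in the MasterProcWal; here it is simply passed as a parameter.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; none of these names exist in HBase.
final class MetaBootstrapSketch {

  /**
   * Assumed heuristic from the suggestion: the cluster ID and the meta layout
   * are present on the file system, but ZooKeeper has no cluster ID.
   */
  static boolean isClusterRecreate(boolean clusterIdInFs, boolean metaDirInFs,
      boolean clusterIdInZk) {
    return clusterIdInFs && metaDirInFs && !clusterIdInZk;
  }

  /**
   * Stand-in for the INIT_META_WRITE_FS_LAYOUT step. Returns true when a fresh
   * meta directory was written, false when the existing one was kept.
   */
  static boolean writeFsLayout(Path metaDir, boolean recreateMode) throws IOException {
    if (Files.exists(metaDir)) {
      if (recreateMode) {
        // Recreate over existing data: the meta directory is not partial, keep it.
        return false;
      }
      // Otherwise treat it as a partially created meta and remove it.
      deleteRecursively(metaDir);
    }
    Files.createDirectories(metaDir);
    return true;
  }

  /** Deletes a directory tree, children before parents. */
  static void deleteRecursively(Path dir) throws IOException {
    List<Path> ordered;
    try (Stream<Path> paths = Files.walk(dir)) {
      ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : ordered) {
      Files.delete(p);
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("hbase-sketch");
    Path metaDir = root.resolve("meta");
    Files.createDirectories(metaDir);
    Files.createFile(metaDir.resolve("existing-hfile"));

    // Cluster ID on FS, meta layout on FS, nothing in ZK: recreate mode.
    boolean recreate = isClusterRecreate(true, Files.exists(metaDir), false);
    boolean rewritten = writeFsLayout(metaDir, recreate);
    System.out.println("recreate=" + recreate + ", metaRewritten=" + rewritten);
  }
}
```

The flag is computed once at master start and carried into the layout step, so a restart between writing the cluster ID to ZK and writing the meta layout would still need the persisted flag to avoid deleting the directory; that persistence is outside this sketch.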