taklwu commented on pull request #2113: URL: https://github.com/apache/hbase/pull/2113#issuecomment-662730825
While waiting for the unit test run, I want to bring up two extra topics, which I may follow up on in new JIRA(s).

1. We reverted a [change](https://github.com/apache/hbase/commit/4d5efec76718032a1e55024fd5133409e4be3cb8#diff-21659161b1393e6632730dcbea205fd8R74-R75) in [HBASE-24471](https://github.com/apache/hbase/commit/4d5efec76718032a1e55024fd5133409e4be3cb8) that always deletes the existing meta table if we're restarting on a fresh cluster with no WALs and no ZK data. I wonder whether @Apache9 added this meta table removal for a special requirement on branch-2.3+; it is the major behavior change between branch-2.2 (which didn't delete meta if it existed) and branch-2.3+. Should we add a feature flag to enable this meta directory removal? IMO, migration from a cluster with an existing meta table and other tables may fail and need HBCK to repair region states (pending completion of the unit test suite to prove our change is safe).

2. With or without this PR, I found a potential issue where the master cannot initialize, which could be a bug in a dynamic-hostname environment. If we keep only ZK data and have no WALs, the recorded location of the meta table still has the old hostname, and the master hangs waiting for the meta region to come online on that old host. However, it cannot come online, because InitMetaProcedure cannot be submitted: the meta region is considered `OPEN`, so submission is blocked by the condition [`if (rs != null && rs.isOffline()) {`](https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1051-L1060). Normally, if WALs exist, the missing server is expired and the meta region comes online after the SCP handles that dead server. Is this behavior expected? Do you think we should support this corner case?

```
### for case 2.
2020-07-22 13:16:05,802 INFO [master/localhost:0:becomeActiveMaster] master.HMaster(1020): hbase:meta {1588230740 state=OPEN, ts=1595448965762, server=localhost,54945,1595448957980}
...
2020-07-22 15:04:33,802 WARN [master/localhost:0:becomeActiveMaster] master.HMaster(1230): hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1595455438210, server=localhost,62506,1595455430742}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
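
To make case 2 easier to discuss, here is a minimal sketch of the startup decision as described in the comment. It is not HBase code: `MetaStartupSketch`, its `State` and `Outcome` enums, `decide`, and the server names are all hypothetical, and the "meta is online when its recorded server is live" branch is an assumption. Only the first guard mirrors the quoted `rs.isOffline()` condition (the `rs == null` case is left out).

```java
import java.util.Set;

public class MetaStartupSketch {
    // Hypothetical, simplified view of the recorded hbase:meta region state.
    public enum State { OFFLINE, OPEN }

    // Hypothetical outcomes of master startup, named after the comment's wording.
    public enum Outcome { SUBMIT_INIT_META, META_ONLINE, WAIT_FOR_SCP, STUCK_HOLDING_PATTERN }

    /**
     * @param state       state recorded for hbase:meta (for example, read from ZK)
     * @param metaServer  server name recorded as the meta location
     * @param liveServers servers currently registered with the master
     * @param walsExist   whether WALs for the recorded server survive, so an SCP can run
     */
    public static Outcome decide(State state, String metaServer,
                                 Set<String> liveServers, boolean walsExist) {
        // Mirrors the quoted guard: InitMetaProcedure is only submitted when meta is OFFLINE.
        if (state == State.OFFLINE) {
            return Outcome.SUBMIT_INIT_META;
        }
        // Assumption: an OPEN meta on a live server needs no further action.
        if (liveServers.contains(metaServer)) {
            return Outcome.META_ONLINE;
        }
        // With WALs, the missing server is expired and the SCP brings meta back online.
        if (walsExist) {
            return Outcome.WAIT_FOR_SCP;
        }
        // Case 2: OPEN on an old hostname, no WALs, so nothing ever reassigns meta.
        return Outcome.STUCK_HOLDING_PATTERN;
    }

    public static void main(String[] args) {
        // Hypothetical server names standing in for the old and new hostnames.
        Set<String> live = Set.of("new-host,16020,2");
        System.out.println(decide(State.OPEN, "old-host,16020,1", live, false));
        System.out.println(decide(State.OPEN, "old-host,16020,1", live, true));
        System.out.println(decide(State.OFFLINE, "old-host,16020,1", live, false));
    }
}
```

If the corner case were to be supported, the last branch is where some recovery step would have to be added; what that step should be is the open question above.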