taklwu commented on pull request #2113:
URL: https://github.com/apache/hbase/pull/2113#issuecomment-662730825


   While waiting for the unit test runs, I want to bring up two extra topics 
that may be followed up in new JIRA(s). 
   
   
   1. We reverted a 
[change](https://github.com/apache/hbase/commit/4d5efec76718032a1e55024fd5133409e4be3cb8#diff-21659161b1393e6632730dcbea205fd8R74-R75)
 in 
[HBASE-24471](https://github.com/apache/hbase/commit/4d5efec76718032a1e55024fd5133409e4be3cb8)
 that always deletes the existing meta table if we're restarting on a fresh cluster 
with no WALs and no ZK data. I wonder whether @Apache9 added this meta table 
removal for a special requirement on branch-2.3+; it is the major 
behavior change between branch-2.2 (which didn't delete meta if it existed) and 
branch-2.3+. Should we add a feature flag to enable this meta directory 
removal? IMO, migration from a cluster with an existing meta table and other 
tables may fail and need HBCK to repair region states (pending completion of the 
unit test suite to prove our change is safe). 
   
   
   2. With or without this PR, I found a potential issue where the master cannot 
initialize, which could be a bug in a dynamic hostname environment. If we keep only 
the ZK data and have no WALs, the meta table location still has the old hostname, and 
the master hangs waiting for the meta region to come online on that old host. However, 
it cannot come online because InitMetaProcedure cannot be submitted: the meta region 
is considered `OPEN`, so it is blocked by the condition [`if (rs != null && 
rs.isOffline()) 
{`](https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java#L1051-L1060).
 Normally, if WALs exist, the missing server will be expired and the meta region 
will come online after the SCP handles that dead server. Is this behavior 
expected? Do you think we should support this corner case?
   
   ```
   ### for case 2.
   2020-07-22 13:16:05,802 INFO  [master/localhost:0:becomeActiveMaster] 
master.HMaster(1020): hbase:meta {1588230740 state=OPEN, ts=1595448965762, 
server=localhost,54945,1595448957980}
   ...
   2020-07-22 15:04:33,802 WARN  [master/localhost:0:becomeActiveMaster] 
master.HMaster(1230): hbase:meta,,1.1588230740 is NOT online; state={1588230740 
state=OPEN, ts=1595455438210, server=localhost,62506,1595455430742}; 
ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern 
until region onlined.
   ```
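
To make case 2 easier to discuss, here is a minimal sketch of the decision as I understand it. It is a simplified model, not HBase code: the class `MetaStartupSketch`, the enums, the `decide` method, and the server names are all hypothetical, and the branch for a meta region that is `OPEN` on a live server is my assumption.

```java
import java.util.Set;

/** Hypothetical, simplified model of the startup decision in case 2; not HBase code. */
public class MetaStartupSketch {
    /** Meta region state as recorded in ZK (only the two states discussed here). */
    public enum MetaState { OFFLINE, OPEN }

    /** What the master ends up doing in this simplified model. */
    public enum Action { SUBMIT_INIT_META, META_ALREADY_SERVED, WAIT_FOR_SCP, HOLDING_PATTERN }

    public static Action decide(MetaState state, String metaServer,
                                Set<String> liveServers, boolean walsExist) {
        // Mirrors the quoted check: InitMetaProcedure is only submitted when meta is offline.
        if (state == MetaState.OFFLINE) {
            return Action.SUBMIT_INIT_META;
        }
        // Assumption: meta is OPEN on a server that is actually alive, so nothing to do.
        if (liveServers.contains(metaServer)) {
            return Action.META_ALREADY_SERVED;
        }
        // Meta is OPEN on a missing server. With WALs, the server is expired and an SCP
        // brings meta back online; without WALs, nothing does and the master keeps waiting.
        return walsExist ? Action.WAIT_FOR_SCP : Action.HOLDING_PATTERN;
    }

    public static void main(String[] args) {
        Set<String> live = Set.of("new-host,16020,2"); // hypothetical server name
        // ZK data kept, no WALs, old hostname: the stuck case from the log above.
        System.out.println(decide(MetaState.OPEN, "old-host,16020,1", live, false));
        // Same state, but WALs exist.
        System.out.println(decide(MetaState.OPEN, "old-host,16020,1", live, true));
    }
}
```

In this model the stuck case is the one where the recorded state is `OPEN`, the recorded server is not live, and there are no WALs to trigger an SCP.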


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

