taklwu edited a comment on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673253206


   First of all, thanks again, Duo.
   
   > I think for the scenario here, we just need to write the cluster id and 
other things to zookeeper? Just make sure that the current code in HBase will 
not consider us as a fresh new cluster. We do not need to rebuild meta?
   
   So, let me confirm your suggestion: if we add one more field to the ZNode, 
e.g. a boolean `completedMetaBoostrap`, and we find both `clusterId` and 
`completedMetaBoostrap` in ZK, then we will not delete the meta directory?
   
   As a follow-up: if ZK ZNode data is used to determine whether this is a 
fresh new cluster, can we skip deleting the meta directory when `clusterId` and 
`completedMetaBoostrap` were never set but we find a meta directory? This is 
the cloud use case, where we don't have ZK to make the decision, so we don't 
know whether the meta is partial. IMO, we should just leave the meta directory 
in place, and if anything bad happens, the operator can still run HBCK. (If we 
go the other way and always delete the meta, we lose the possibility that the 
cluster can heal itself, and we still cannot confirm whether the meta was 
partial, can we?)
   
    > For the InitMetaProcedure, the assumption is that, if we found that the 
meta table directory is there, then it means the procedure itself has crashed 
before finishing the creation of meta table, i.e, the meta table is 'partial'. 
So it is safe to just remove it and create again. I think this is a very common 
trick in distributed system for handling failures?
   
   Do you mean that being `idempotent` is the `trick`? `InitMetaProcedure` may 
be idempotent and can bring `hbase:meta` online (as an empty table), but I 
don't think the cluster/HM as a whole is automatically idempotent. And yes, 
the content of the original meta can be rebuilt with the help of HBCK. But if 
HM continues the flow with some existing data, e.g. the namespace table (sorry, 
on branch-2 we still have the namespace table), and HM restarts with an empty 
meta, then based on the experiment I did, the cluster hangs and HM cannot be 
initialized.
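   
   To illustrate the concern with a toy example: the sketch below is plain Java 
on a local temp directory, not HBase code, and `InitMetaSketch`, `initMeta` and 
the `region-info` file are names made up for illustration. It assumes the init 
step simply removes whatever directory it finds and creates an empty one. 
Running it twice gives the same end state, but only by throwing away what was 
there.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for a "delete whatever is there and create again" step.
// Not HBase code: the directory layout and all names are illustrative only.
final class InitMetaSketch {

  // Deletes a directory tree, children before parents.
  static void deleteRecursively(Path dir) throws IOException {
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }

  // Idempotent init: remove any existing meta directory, then create an empty one.
  static void initMeta(Path metaDir) throws IOException {
    if (Files.exists(metaDir)) {
      deleteRecursively(metaDir);
    }
    Files.createDirectories(metaDir);
  }

  // Number of entries directly under metaDir.
  static long entryCount(Path metaDir) throws IOException {
    try (Stream<Path> entries = Files.list(metaDir)) {
      return entries.count();
    }
  }

  // Returns {entries before the second init, entries after the second init}.
  static long[] demo() {
    try {
      Path root = Files.createTempDirectory("meta-sketch");
      Path meta = root.resolve("meta");
      initMeta(meta);                                 // first bootstrap
      Files.createFile(meta.resolve("region-info"));  // pretend meta now holds data
      long before = entryCount(meta);
      initMeta(meta);                                 // restart runs init again
      long after = entryCount(meta);
      deleteRecursively(root);                        // clean up the temp directory
      return new long[] {before, after};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

   Here `InitMetaSketch.demo()` returns `{1, 0}`: the second init succeeds, but 
the entry written after the first init is gone, so anything else that still 
refers to it is left pointing at an empty meta.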
   
   If we step back and think about the definition of a `partial` meta, it would 
be great if the meta table itself could tell whether it is partial, because it 
is still a table in HBase and HFiles are immutable. E.g., can we tell whether a 
user table is partial by looking at its data? I may be wrong, but it seems we 
cannot tell from the HFiles themselves, and we need ZK and the WAL to define it.
   
   So, again, IMO the data content of a table is sensitive ([updated] sorry if 
you guys think the data in the meta table is not sensitive). I'm proposing not 
to delete the meta directory if possible (deleting it is also like running hbck 
to delete and rebuild).
   
   Based on our discussion here, IMO two proposals have been mentioned for 
defining `partial meta`.
   
   1. add a boolean in the WAL, as proc-level data
   2. write a boolean in the ZNode to tell whether the bootstrap has completed
   *. whichever of 1) or 2) above we choose, there is an additional condition: 
if we don't find any WAL or ZK data about this flag, we should not delete the 
meta table.
   
   It seems 2) + *) would be the simplest solution. What do you guys think?
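   
   To make 2) + *) concrete, here is a rough sketch of the decision as a pure 
function. It is not HBase code: `MetaBootstrapDecision`, `shouldDeleteMetaDir` 
and its parameters are hypothetical names. It also rests on one assumption 
that has not been agreed in this thread: a `clusterId` in ZK without the 
completion flag is read as "bootstrap started but did not finish", i.e. a 
partial meta.

```java
// Hypothetical model of proposal 2) plus the extra condition *); not HBase code.
final class MetaBootstrapDecision {

  /**
   * @param metaDirExists          whether a meta directory was found on the filesystem
   * @param clusterIdInZk          whether ZK holds a clusterId
   * @param bootstrapCompletedInZk whether ZK holds the proposed completion flag set to true
   * @return true only when the meta directory should be deleted and recreated
   */
  static boolean shouldDeleteMetaDir(boolean metaDirExists,
                                     boolean clusterIdInZk,
                                     boolean bootstrapCompletedInZk) {
    if (!metaDirExists) {
      return false; // nothing to delete
    }
    if (bootstrapCompletedInZk) {
      return false; // bootstrap is recorded as finished, so keep the meta
    }
    if (!clusterIdInZk) {
      return false; // condition *): ZK knows nothing, so do not delete
    }
    // Assumption: clusterId present but no completion flag means the bootstrap
    // crashed midway, so the meta directory is treated as partial.
    return true;
  }
}
```

   With this rule, the cloud case (meta directory present, nothing in ZK) keeps 
the meta directory, and the only path that deletes it is the one where ZK 
positively indicates an unfinished bootstrap.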
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

