anoopsjohn commented on pull request #2237: URL: https://github.com/apache/hbase/pull/2237#issuecomment-673347350
Thanks Duo. I was about to write up the flow for why we do InitMeta. In the cluster recreate case that @taklwu describes, the WAL data is not there, and neither is the zk data. So, as you said, anyone who wants to do this has to do 2 things:

1. Before the initial cluster is dropped, back up the WAL fs (HDFS backed by managed disk in the cloud) and also the zk data (the meta server location).
2. When the cluster is recreated from the existing data, first recover the WAL fs data onto the new HDFS cluster, then have some tool (hbck or something similar) recreate the zk node and put the old location value in it.

The WAL fs data backup and restore makes sense. But IMHO the zk data part looks like another hack. Till 2.1.6 this was not needed: even if the meta location was not there in zk, init meta would kick in, create the meta region from the existing data, and assign it to some RS. But the meta cleanup now causes the entire existing data to be deleted. I think that is not good.

We are adding more things to the META table these days. The NS info itself is in another CF in the META table. There is also a discussion around adding the committed HFiles data into META (this is not concluded, but it looks like we keep increasing the responsibility of the META table). So dropping all this information is not good.

My thinking was to add such cluster recreate as a 1st class feature in HBase itself. It could be used by anyone, anywhere: as long as the data is persisted, we can drop the cluster and recreate it later. There are 2 blockers for that, and the biggest one is this META dir delete as part of InitMetaProc. I agree with the intent of adding that cleanup. But if we also have to support this, how can we avoid the delete? Recreating the entire META using some hbck options should be the very last option to think of, IMO. If we can solve it in other ways, why not?