anoopsjohn commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673347350


   Thanks Duo. I was about to describe the flow and why we run InitMeta. In the 
cluster-recreate case that @taklwu describes, the WAL data is gone, and so is 
the zk data. So, as you said, anyone who wants to do this has to do 2 things:
   1. Before the initial cluster is dropped, the WAL fs (HDFS backed by managed 
disk in the cloud) needs to be backed up, along with the zk data (the meta 
server location).
   2. When the cluster is recreated from the existing data, the WAL fs data 
first has to be restored onto the new HDFS cluster. Then some tool (hbck or 
something similar) needs to recreate the zk node and set it to the old 
location value.
   The WAL fs data backup and restore makes sense. But IMHO the zk data part 
looks like another hack. Until 2.1.6 this was not needed: even if the meta 
location was not in zk, InitMeta would kick in, create the meta region from 
the existing data, and assign it to some RS. But the meta cleanup now causes 
all of the existing data to be deleted. I think that is not good. We are 
adding more things to the META table these days. The NS info itself is in 
another CF in the META table. There is also a separate discussion about adding 
the committed HFiles data to META (this is not concluded, but it looks like we 
keep increasing the responsibility of the META table). So dropping all this 
information is not good.
   So my thinking was to add this kind of cluster recreate as a first-class 
feature in HBase itself. It could be used by anyone, anywhere. As long as the 
data is persisted, we can drop the cluster and recreate it later. There are 2 
blockers for that, and the biggest one is this META dir delete as part of 
InitMetaProc. I agree with the intent of adding that cleanup. But if we also 
have to support this, how can we avoid the delete? Recreating the entire META 
using some hbck options should be the very last option to think of, IMO. If we 
can solve it in other ways, why not?
    



