[ 
https://issues.apache.org/jira/browse/HBASE-21035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611629#comment-16611629
 ] 

stack commented on HBASE-21035:
-------------------------------

.002 is a hack on top of [~allan163]'s patch. I used it doing fixup on a 
cluster here, one where I had to remove the procedure WALs because there were 
too many to process and, when done, a bunch of state needed repair. It 
includes some miscellaneous changes, but the main thing is the assign of meta 
and of namespace so master startup can continue (without these, master exits 
because it can't scan meta, and later the namespace table).

I've also been playing around with a FixMetaProcedure that forces meta online. 
The tricky part is finding all the meta WALs and then doing the inline split. 
I'm trying to see whether I should remove the meta files when done, and trying 
to figure out how to fence off meta if it is already open.

Will be back later.

> Meta Table should be able to online even if all procedures are lost
> -------------------------------------------------------------------
>
>                 Key: HBASE-21035
>                 URL: https://issues.apache.org/jira/browse/HBASE-21035
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.1.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>         Attachments: HBASE-21035.branch-2.0.001.patch, 
> HBASE-21035.branch-2.1.001.patch
>
>
> After HBASE-20708, we changed the way we init after master starts. It will 
> only check WAL dirs and compare them to the ZooKeeper RS nodes to decide 
> which servers need to expire. For servers whose dir ends with 'SPLITTING', 
> we assure that there will be an SCP for it.
> But if the server with the meta region crashed before the master restarts, 
> and if all the procedure WALs are lost (due to a bug, or deleted manually, 
> whatever), the newly restarted master will be stuck when initing, since no 
> one will bring the meta region online.
> Although it is an anomalous case, I think that no matter what happens, we 
> need to online the meta region. Otherwise we are sitting ducks; nothing can 
> be done.
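
For illustration only, the startup check described in the issue above could be 
sketched roughly as follows. This is not HBase code: the class and method 
names are hypothetical, plain string sets stand in for the WAL directory 
listing and the ZooKeeper RS nodes, and the "-splitting" suffix is an 
assumption about how a 'SPLITTING' dir is named.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the post-HBASE-20708 startup check; not HBase code.
final class StartupServerCheck {
    // Assumed suffix marking a WAL dir whose logs are being split.
    static final String SPLITTING_SUFFIX = "-splitting";

    // Servers that own a plain WAL dir but have no live ZooKeeper RS node:
    // these are the candidates to expire.
    static Set<String> serversToExpire(Collection<String> walDirNames,
                                       Set<String> liveZkServers) {
        Set<String> dead = new TreeSet<>();
        for (String dir : walDirNames) {
            if (dir.endsWith(SPLITTING_SUFFIX)) {
                continue; // reported by splittingServers instead
            }
            if (!liveZkServers.contains(dir)) {
                dead.add(dir);
            }
        }
        return dead;
    }

    // Servers whose WAL dir is marked as splitting: an SCP is expected to
    // exist for each of them already.
    static Set<String> splittingServers(Collection<String> walDirNames) {
        Set<String> splitting = new TreeSet<>();
        for (String dir : walDirNames) {
            if (dir.endsWith(SPLITTING_SUFFIX)) {
                splitting.add(dir.substring(0, dir.length() - SPLITTING_SUFFIX.length()));
            }
        }
        return splitting;
    }
}
```

If the procedure WALs are gone, the SCP expected for a splitting server is 
gone too, which seems to be how the server that carried meta can end up with 
nothing scheduled to bring meta back online.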



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
