[jira] Commented: (HBASE-2485) Persist Master in-memory state so on restart or failover, new instance can pick up where the old left off

stack (JIRA) Fri, 07 May 2010 12:15:12 -0700

    [ https://issues.apache.org/jira/browse/HBASE-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865273#action_12865273 ]


stack commented on HBASE-2485:
------------------------------

Doc is great.  Here are a few comments.

+ I think you should start your proposal w/ some high-level intents: e.g. only 
messages from Master to RS over RPC are of import, are "commands"; messages 
from RS to Master are just informational (load, split).  Or: the intent is to 
move regions-in-transition out of the Master into zk so that they weather a 
master restart.
+ Startup could be tricky.  Here we are hoisting all regions in .META. up into 
the unassigned set in zk.  I was wondering about the case where the copy from 
.META. to zk/UNASSIGNED is only partially done, say because the master crashes.  
What happens?  Maybe it'll be OK?  If the meta startcode does not match that 
of a running regionserver, then the region has not yet been assigned, so add it 
to zk/UNASSIGNED.
+ In Close Region RS Flow, did we agree closing is of no use?  There is 
nothing the master can really do if closing is taking forever?
+ Up in zk, unfortunately, znodes will have to be named using the region's 
encoded name.  That will make it a little tough to follow region flow.  Perhaps 
the fix is to make the encoded name of a region more prevalent in logs.
+ We said opening was nice to have rather than necessary?
+ I wonder if you need a new message from Master to RS where you can ask the RS 
what regions it has deployed?  It'd be best if we didn't need it.  We shouldn't 
need it, I suppose.
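
A rough sketch of the startup rule from the second point above, in Python 
purely for illustration.  The function name, the argument shapes, and the idea 
of passing in the znodes already present under zk/UNASSIGNED are all 
assumptions made for this example, not anything taken from the doc or from 
the HBase code:

```python
def regions_to_unassign(meta_rows, live_startcodes, already_unassigned):
    """Return encoded region names that still need a znode under UNASSIGNED.

    meta_rows: dict of encoded region name -> startcode recorded in .META.
        (None if no server is recorded).  Hypothetical shape.
    live_startcodes: set of startcodes of regionservers currently running.
    already_unassigned: set of encoded names already present in zk, e.g.
        left behind by a master that crashed part-way through the copy.
    """
    pending = []
    for name, startcode in sorted(meta_rows.items()):
        if startcode in live_startcodes:
            continue  # startcode matches a running regionserver: assigned
        if name in already_unassigned:
            continue  # already copied before the crash: do not create twice
        pending.append(name)
    return pending
```

Because names already in zk are skipped, a new master that reruns this after 
a crash part-way through the copy would only create the znodes that are still 
missing, which is one way the partial-copy case could turn out to be OK.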

> Persist Master in-memory state so on restart or failover, new instance can 
> pick up where the old left off
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2485
>                 URL: https://issues.apache.org/jira/browse/HBASE-2485
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Karthik Ranganathan
>             Fix For: 0.20.5
>
>         Attachments: HBase-State-Transitions.docx
>
>
> Today there was some good stuff up on IRC on how transitions won't always 
> make it across Master failovers in multi-master deploy because transitions 
> are kept in in-memory structure up in the Master and so on master crash, the 
> new master will be missing state on startup (Todd was main promulgator of 
> this observation and of the opinion that while master rewrite is scheduled 
> for 0.21, some part needs to be done for 0.20.5).  A few suggestions were 
> made: transitions should be file-backed somehow, etc.  Let this issue be 
> about the subset we want to do for 0.20.5.
> Of the in-memory state queues, there is at least the master tasks queue -- 
> process region opens, closes, regionserver crashes, etc. -- where tasks must 
> be done in order and IIRC, tasks are fairly idempotent (at least in the 
> server crash case, it's multi-step and we'll put the crash event back on the 
> queue if we cannot do all steps in the one go).  Perhaps this queue could be 
> done using the new queue facility in zk 3.3.0 (I haven't looked to check if 
> possible, just suggesting).  Another suggestion was a file to which we'd 
> append queue items, requeueing, and marking the file with task complete, etc. 
>  On Master restart or fail-over, we'd replay the queue log.
> There is also the Map of regions-in-transition.  Yesterday we learned that 
> there is a bug where server shutdown processing does not iterate the Map of 
> regions-in-transition.  This Map may hold regions that are in "opening" or 
> "opened" state but haven't yet had the fact added to .META. by master.  
> Meantime the hosting server can crash.  Regions that were opening will stay 
> in the regions-in-transition and those in opened-but-not-yet-added-to-meta 
> will go ahead and add a crashed server to .META. (Currently 
> regions-in-transition does not record server the region opening/open is 
> happening on so it doesn't have enough info to be processed as part of server 
> shutdown).
> Regions-in-transition also needs to be persistent.  On startup, 
> regions-in-transition can get kinda hectic on a big cluster.  Ordering is not 
> so important here I believe.  A directory in zk might work (For 1M regions in 
> a big cluster, that'd be about 2M creates and 2M deletes during startup -- 
> that's too much?).  Or we could write a WAL-like log again of region 
> transitions (We'd have to develop a little vocabulary) that got reread by a 
> new master.
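
For the WAL-like option, replay on a new master might look roughly like the 
sketch below (Python, for illustration only).  The three-word vocabulary 
(OPENING, OPENED, DONE), the line format, and both function names are made up 
here.  Each record carries the server name because, as noted above, 
regions-in-transition currently does not record it and so cannot be processed 
as part of server shutdown:

```python
def replay_transitions(log_lines):
    """Rebuild regions-in-transition from an append-only transition log.

    Each record is "<EVENT> <encoded_region_name> <server>".  The event
    names are a hypothetical vocabulary, not an existing HBase format.
    """
    in_transition = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip records that do not have exactly three fields
        event, region, server = parts
        if event == "DONE":
            # Transition finished (e.g. recorded in .META.), so forget it.
            in_transition.pop(region, None)
        elif event in ("OPENING", "OPENED"):
            # Later records for the same region overwrite earlier ones.
            in_transition[region] = (event, server)
    return in_transition


def regions_on_crashed_server(in_transition, crashed_server):
    """List regions whose in-flight transition was on the crashed server."""
    return sorted(
        region
        for region, (_, server) in in_transition.items()
        if server == crashed_server
    )
```

With the server kept in each record, shutdown processing could iterate the 
rebuilt map and pick out the regions that were opening or opened on the 
crashed server, rather than leaving them stuck in transition.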

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
