[
https://issues.apache.org/jira/browse/HBASE-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12866402#action_12866402
]
Karthik Ranganathan commented on HBASE-2485:
--------------------------------------------
Hey Stack,
Excellent feedback, thanks!
1. Will add a modified doc soon, absolutely agree with your comment:
"intent is moving the intransitions out of Master to zk so intransitions
weathers a master restart."
2. wrt Master restarts: jgray and I were discussing, it will be a scheme
similar to zk/UNASSIGNED, but in a different location. And a RS will be handed
a bunch of regions using one zk node update, and will have to ack the bulk open
in one zk node update. Will fill in once the details are clearer, but it will
not follow the exact same scheme.
3. Yes, closing is of no use.
4. Agreed
5. Yes, opening is a nice to have. I am taking the following approach: let the
RS report opening progress, but master will ignore them for the first cut.
6. No, I don't think that would be needed in the current scheme. The RS would
just update the state of the region to "OPENED" and master can infer from there.
We have already started coding some parts, will update once there is more
progress...
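
To make points 2 and 6 concrete, here is a rough sketch of the bulk hand-off as described so far. It is only an illustration, not the code being written: a plain dict stands in for ZooKeeper, and the `/hbase/bulk-assign` path, the JSON payload and all function names are assumptions made up for this example.

```python
import json

# In-memory stand-in for ZooKeeper: path -> bytes. Not a real ZK client.
znodes = {}

# Hypothetical location; the comment only says "a different location"
# from zk/UNASSIGNED.
BULK_ROOT = "/hbase/bulk-assign"


def master_bulk_assign(server, regions):
    """Master hands a batch of regions to one RS with a single node write."""
    payload = {"state": "ASSIGNED", "regions": sorted(regions)}
    znodes[f"{BULK_ROOT}/{server}"] = json.dumps(payload).encode()


def rs_ack_bulk_open(server):
    """RS acknowledges the whole batch with a single node update."""
    path = f"{BULK_ROOT}/{server}"
    payload = json.loads(znodes[path])
    payload["state"] = "OPENED"
    znodes[path] = json.dumps(payload).encode()


def master_recover():
    """A restarted master rebuilds its view purely from the stored nodes."""
    pending, opened = {}, {}
    for path, data in znodes.items():
        if not path.startswith(BULK_ROOT + "/"):
            continue
        server = path.rsplit("/", 1)[1]
        payload = json.loads(data)
        target = opened if payload["state"] == "OPENED" else pending
        target[server] = payload["regions"]
    return pending, opened
```

The point of the sketch is that the master keeps nothing in memory that it cannot re-derive: a new master calling `master_recover()` sees which batches are still outstanding and which were already acked as "OPENED".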
> Persist Master in-memory state so on restart or failover, new instance can
> pick up where the old left off
> ---------------------------------------------------------------------------------------------------------
>
> Key: HBASE-2485
> URL: https://issues.apache.org/jira/browse/HBASE-2485
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: Karthik Ranganathan
> Fix For: 0.20.5
>
> Attachments: HBase-State-Transitions.docx
>
>
> Today there was some good stuff up on IRC on how transitions won't always
> make it across Master failovers in a multi-master deployment because transitions
> are kept in an in-memory structure up in the Master and so on master crash, the
> new master will be missing state on startup (Todd was the main promulgator of
> this observation and of the opinion that while master rewrite is scheduled
> for 0.21, some part needs to be done for 0.20.5). A few suggestions were
> made: transitions should be file-backed somehow, etc. Let this issue be
> about the subset we want to do for 0.20.5.
> Of the in-memory state queues, there is at least the master tasks queue --
> process region opens, closes, regionserver crashes, etc. -- where tasks must
> be done in order and IIRC, tasks are fairly idempotent (at least in the
> server crash case, it's multi-step and we'll put the crash event back on the
> queue if we cannot do all steps in the one go). Perhaps this queue could be
> done using the new queue facility in zk 3.3.0 (I haven't looked to check if
> possible, just suggesting). Another suggestion was a file to which we'd
> append queue items, requeueing, and marking the file with task complete, etc.
> On Master restart or fail-over, we'd replay the queue log.
> There is also the Map of regions-in-transition. Yesterday we learned that
> there is a bug where server shutdown processing does not iterate the Map of
> regions-in-transition. This Map may hold regions that are in "opening" or
> "opened" state but haven't yet had the fact added to .META. by master.
> Meantime the hosting server can crash. Regions that were opening will stay
> in the regions-in-transition and those in opened-but-not-yet-added-to-meta
> will go ahead and add a crashed server to .META. (Currently
> regions-in-transition does not record the server on which the region
> opening/open is happening, so it doesn't have enough info to be processed as
> part of server shutdown).
> Regions-in-transition also needs to be persistent. On startup,
> regions-in-transition can get kinda hectic on a big cluster. Ordering is not
> so important here I believe. A directory in zk might work (For 1M regions in
> a big cluster, that'd be about 2M creates and 2M deletes during startup --
> is that too much?). Or we could write a WAL-like log again of region
> transitions (We'd have to develop a little vocabulary) that got reread by a
> new master.
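
The WAL-like option in the description could look roughly like the sketch below. It is a minimal illustration under assumptions, not a proposed design: the three-word vocabulary (`OPENING`, `OPENED`, `IN_META`), the one-JSON-object-per-line format and every function name are invented here. Each record carries the hosting server, which is the piece the description says regions-in-transition is currently missing for server shutdown processing.

```python
import json
import os
import tempfile

# Hypothetical vocabulary; the issue only says one would have to be developed.
OPENING, OPENED, IN_META = "OPENING", "OPENED", "IN_META"


def append_transition(log_path, region, state, server):
    """Append one transition record as a JSON line and force it to disk."""
    record = {"region": region, "state": state, "server": server}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())


def replay(log_path):
    """Rebuild regions-in-transition: last record wins, IN_META retires a region."""
    in_transition = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            rec = json.loads(line)
            if rec["state"] == IN_META:
                in_transition.pop(rec["region"], None)
            else:
                in_transition[rec["region"]] = (rec["state"], rec["server"])
    return in_transition


def regions_on(in_transition, server):
    """Regions that shutdown processing must handle when `server` crashes."""
    return sorted(r for r, (_, s) in in_transition.items() if s == server)


# Example: r1 reaches .META.; r2 and r3 are still in flight on rs2
# when the master restarts and rereads the log.
log_path = os.path.join(tempfile.mkdtemp(), "transitions.log")
for rec in [("r1", OPENING, "rs1"), ("r1", OPENED, "rs1"),
            ("r2", OPENING, "rs2"), ("r1", IN_META, "rs1"),
            ("r3", OPENED, "rs2")]:
    append_transition(log_path, *rec)
recovered = replay(log_path)
```

Because only the last record per region matters, replay does not depend on ordering across regions, which matches the note above that ordering is not so important for this structure.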