Re: Server Crash Recovery (WAS --> Re: ANN: The third hbase 0.94.0 release candidate is available for download)

Stack Wed, 16 May 2012 12:38:25 -0700

On Wed, May 16, 2012 at 12:25 AM, Mikael Sitruk <mikael.sit...@gmail.com> wrote:
> Hey St.Ack
>
> Thanks for clarifications,
>
> For 4. replay of log: (Please correct if i'm wrong)
> So the RS will:
> a. split the log via HLogSplitter, write concurrently log content to other
> log files under each region,


Yes.

The next bit is not right.  All of a server's logs have to finish
splitting before any of its regions will be assigned.  So, interject
into your narrative....

a'. When all regionservers have finished splitting the crashed
server's logs, the master will assign out the regions.
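
A rough sketch of that ordering, in Python rather than HBase's Java.
Every name here (split_log, recover_server) is made up for
illustration and is not HBase API, and the real splitting is
distributed across regionservers rather than run in one loop:

```python
from collections import defaultdict


def split_log(log_edits):
    """Group one log's (region, edit) pairs by region, keeping their order."""
    per_region = defaultdict(list)
    for region, edit in log_edits:
        per_region[region].append(edit)
    return per_region


def recover_server(logs, regions):
    """Split every log of the crashed server, then hand out its regions.

    logs maps a (hypothetical) log name to its list of (region, edit) pairs.
    Returns the recovered edits each region will find waiting when it opens.
    """
    recovered = defaultdict(list)
    pending = set(logs)
    for name, log_edits in logs.items():
        for region, edits in split_log(log_edits).items():
            recovered[region].extend(edits)
        pending.discard(name)
    # Assignment is gated on the whole set of logs, not on any single one.
    assert not pending, "regions must not be assigned while logs remain unsplit"
    return {region: recovered.get(region, []) for region in regions}
```

The point is only the gate before the return: no region is handed out
until the last log has been split.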

Then each regionserver that receives one of these regions will
notice on open, before onlining, that the region has edits to
replay from a log split.  It will then....


> b. replay those smaller logs into its own memstore and own logs (is it done
> when the region becomes on-line?)

We'll replay the edits into the memstore.  We'll then force a flush on
the region to create a new hfile of the recovered edits. Only then do
we clean away the edits file so that on next open, the replay does not
happen again (If we crash before the hfile is successfully flushed,
the edits will be in place for the next open elsewhere).
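
The order of those three steps is what makes a crash in the middle
safe.  A toy model, assuming a plain dict stands in for the
filesystem; open_region and the key names are placeholders for
illustration, not HBase code:

```python
def open_region(fs, crash_before_flush=False):
    """Open a region against fs, a dict standing in for durable storage."""
    memstore = []  # in-memory only; lost if the server dies
    edits = fs.get("recovered.edits")
    if edits is not None:
        # 1. Replay the split edits into the memstore.
        memstore.extend(edits)
        if crash_before_flush:
            raise RuntimeError("simulated crash before flush")
        # 2. Force a flush so the edits land in a new hfile.
        fs.setdefault("hfiles", []).append(list(memstore))
        memstore.clear()
        # 3. Only now remove the edits file, so the next open skips replay.
        del fs["recovered.edits"]
    return "online"
```

If the simulated crash fires, fs still holds the edits and a later
open_region(fs) replays them again; after a clean open the key is
gone and a further open does no replay.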

> During this replay the RS may be subject to log flushing, and to compaction
> (flush will create more store file that will reach the min/max compaction),
> and other regular background task.
>

That's right.  The distributed splitting is a background task that should
not adversely affect the foreground RS tasks.

> During this time (log replay) the RS also accepts other client requests (on
> its own regions, not the ones it got assigned) or they are blocked?
> In case the RS is handling other requests too, is there any priority for
> WAL replay?
>

RS should be working 'as usual' while the splitting is going on.

No priority as yet for WAL splitting.

St.Ack
