Yifeng,

HLog splitting is used when a region server shuts down, or to recover log
edits after a region server goes down uncleanly.

Prior to 0.92, if many region servers went down uncleanly (maybe hdfs went
away under hbase), only a single process would do all of the log splitting
needed to recover.  If you have a lot of nodes, each with an hlog to split,
this can take a long time (100 nodes = 100x the time). If you have many
nodes going down uncleanly, you could manually use the HLog --split
mechanism you describe to split all the logs in parallel on many machines
before restarting hbase.
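
For example (just a rough sketch -- the namenode host, region server names,
and timestamps below are placeholders, not real paths from your cluster),
you could kick off one split per machine, each against a different
.logs/<server> directory:

  # on splitter machine 1:
  hbase org.apache.hadoop.hbase.regionserver.wal.HLog --split \
    hdfs://namenode.example.org:8020/hbase/.logs/rs1.example.org,60020,1283516293161/

  # on splitter machine 2:
  hbase org.apache.hadoop.hbase.regionserver.wal.HLog --split \
    hdfs://namenode.example.org:8020/hbase/.logs/rs2.example.org,60020,1283516293162/

  # ...and so on, one .logs/<server> directory per splitter, until every
  # dead region server's directory has been split.

Once the splits finish, you restart hbase and the regions can replay their
recovered edits as they open, without the master having to split anything
itself.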

As of 0.92, this hlog splitting is automatically distributed -- the feature
is called distributed log splitting.  It farms the work out to the nodes
and should significantly speed up recovery in these scenarios.
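
If I remember right it is on by default and controlled by a property along
these lines in hbase-site.xml -- the name here is from memory, so please
double-check it against the hbase-default.xml that ships with your release:

  <!-- from memory; verify against your release's hbase-default.xml -->
  <property>
    <name>hbase.master.distributed.log.splitting</name>
    <value>true</value>
  </property>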

Jon.

On Fri, Dec 23, 2011 at 6:59 PM, Yifeng Jiang <uprushwo...@gmail.com> wrote:

> Hi,
>
> As mentioned in the HBase Book, we can force a manual hlog splitting by:
> hbase org.apache.hadoop.hbase.regionserver.wal.HLog --split hdfs://
> example.org:8020/hbase/.logs/example.org,60020,1283516293161/
>
> What's the use case of this manual splitting?
> If an RS crashes, the hlog splitting will be triggered by the master
> automatically.
> However, forcing a hlog split shuts down the RS if the RS is online.
>
> Thanks,
> Yifeng




-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// j...@cloudera.com
