Yifeng,

HLog splitting is used when shutting down a region server, or to recover log edits after a region server goes down uncleanly.

Prior to 0.92, if many region servers went down uncleanly (say, HDFS went away under HBase), a single process did all of the log splitting during recovery. If you have a lot of nodes, each with an HLog to split, this can take a long time (100 nodes = 100x the time). In that situation you could manually use the HLog --split mechanism you describe to split all the logs in parallel on many machines before restarting HBase.
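Something along these lines is the general idea (just a sketch -- the server names below are made up, the paths follow the .logs layout from your example, and each machine would take a different subset of the server directories under /hbase/.logs):

  # on machine 1: split the logs of one subset of the dead region servers
  hbase org.apache.hadoop.hbase.regionserver.wal.HLog --split \
    hdfs://example.org:8020/hbase/.logs/serverA.example.org,60020,1283516293161/

  # on machine 2: split a different subset, and so on across the machines
  hbase org.apache.hadoop.hbase.regionserver.wal.HLog --split \
    hdfs://example.org:8020/hbase/.logs/serverB.example.org,60020,1283516293161/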
As of 0.92, HLog splitting is distributed automatically -- the feature is called distributed log splitting. It farms the splitting work out to the nodes and should significantly speed up recovery in these scenarios.

Jon.

On Fri, Dec 23, 2011 at 6:59 PM, Yifeng Jiang <uprushwo...@gmail.com> wrote:
> Hi,
>
> As mentioned in HBase Book, we can force a manual hlog splitting by:
> hbase org.apache.hadoop.hbase.regionserver.wal.HLog --split hdfs://
> example.org:8020/hbase/.logs/example.org,60020,1283516293161/
>
> What's the use case of this manual splitting?
> If RS is crashed, the hlog splitting will be triggered by master
> automatically.
> While, force a hlog splitting shuts down the RS if the RS is online.
>
> Thanks,
> Yifeng

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// j...@cloudera.com