[ https://issues.apache.org/jira/browse/HBASE-23286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995492#comment-16995492 ]
Guanghao Zhang commented on HBASE-23286:
----------------------------------------

Tested the patch on a cluster with 5 regionservers. With a memstore size of 10G, one regionserver was killed.

|| ||Split WAL Size||Split WAL Number||Split WAL Cost||Assign Region Number||Assign Region Cost||ServerCrashProcedure Cost||
||hbase.wal.split.to.hfile=false||10.5G||22||41739ms||39||min 2.565 sec, max 40.146 sec, avg 36.358 sec||1 mins, 22.06 sec||
||hbase.wal.split.to.hfile=true||10.0G||21||42382ms||40||min 0.814 sec, max 2.696 sec, avg 1.470 sec||45.3490 sec||

Because the 4 other regionservers can take split WAL tasks, the WAL split cost less time than in the previous test.

> Improve MTTR: Split WAL to HFile
> --------------------------------
>
>                 Key: HBASE-23286
>                 URL: https://issues.apache.org/jira/browse/HBASE-23286
>             Project: HBase
>          Issue Type: Improvement
>          Components: MTTR
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>
> After HBASE-20724, the compaction event marker is no longer used during
> failover. So our new proposal is to split the WAL to HFiles to improve MTTR.
> It has 3 steps:
> # Read the WAL and write HFiles to the recovered.hfiles directory of each
> column family of the region.
> # Open the region.
> # Bulkload the recovered.hfiles for every column family.
> The design doc is attached as a Google doc. Any suggestions are welcome.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)