[ 
https://issues.apache.org/jira/browse/HBASE-23286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012272#comment-17012272
 ] 

Michael Stack edited comment on HBASE-23286 at 1/9/20 11:57 PM:
----------------------------------------------------------------

[~zghao] does this work for you?

I enabled hbase.wal.split.to.hfile by setting it to true. I killed a few 
servers. The SCP logging shows this for the split log steps...
{code}
2020-01-09 22:17:55,346 DEBUG org.apache.hadoop.hbase.master.MasterWalManager: 
Renamed region directory: 
hdfs://nameservice1/hbase/genie/WALs/h5,16020,1578604825302-splitting
2020-01-09 22:17:55,347 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
dead splitlog workers [h5,16020,1578604825302]
2020-01-09 22:17:55,351 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
hdfs://nameservice1/hbase/genie/WALs/h5,16020,1578604825302-splitting dir is 
empty, no logs to split.
2020-01-09 22:17:55,355 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
Finished splitting (more than or equal to) 0 (0 bytes) in 0 log files in 
[hdfs://nameservice1/hbase/genie/WALs/h5,16020,1578604825302-splitting] in 0ms
2020-01-09 22:17:55,356 DEBUG 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Done splitting 
WALs pid=123301, state=RUNNABLE:SERVER_CRASH_SPLIT_LOGS, locked=true; 
ServerCrashProcedure server=h5,16020,1578604825302, splitWal=true, meta=false
{code}
The dir had 50-odd WALs in it, but after the above runs they are all gone. The 
above also runs too quickly. No instances of recovered.edits in my fs.
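
For anyone reproducing this: enabling the flag would look something like the following, assuming it is set the usual way in hbase-site.xml on the servers. Only the property name and value come from the comment above; the file location and whether a restart is needed are assumptions that depend on the deployment.

```xml
<!-- Sketch only: the property named above, set via site config (assumed) -->
<property>
  <name>hbase.wal.split.to.hfile</name>
  <value>true</value>
</property>
```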

Let me look at patch...

Hmm... The patch changes the RS side of the splitter and Region open. Master 
logging should be the same as before? It says zero. Undoing this for now...


> Improve MTTR: Split WAL to HFile
> --------------------------------
>
>                 Key: HBASE-23286
>                 URL: https://issues.apache.org/jira/browse/HBASE-23286
>             Project: HBase
>          Issue Type: Improvement
>          Components: MTTR
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>
> After HBASE-20724, the compaction event marker is no longer used during 
> failover. So our new proposal is to split WAL to HFile to improve MTTR. It 
> has 3 steps:
>  # Read WAL and write HFile to region’s column family’s recovered.hfiles 
> directory.
>  # Open region.
>  # Bulkload the recovered.hfiles for every column family.
> The design doc is attached as a Google doc. Any suggestions are welcome.
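
The three steps in the description above can be modeled with a toy, in-memory sketch. This is not HBase code and does not reflect the patch: the function names, the tuple layout of a WAL entry, and the dict-based "region" are all hypothetical, chosen only to illustrate the flow of writing per-column-family recovered files, opening the region, and then bulk loading.

```python
from collections import defaultdict

# Hypothetical WAL entry layout: (sequence_id, region, family, row, value).

def split_wal_to_hfiles(wal_entries):
    """Step 1: group edits by (region, family) and sort each group by
    (row, sequence_id), standing in for one recovered file per family."""
    recovered = defaultdict(list)
    for seq, region, family, row, value in wal_entries:
        recovered[(region, family)].append((row, seq, value))
    return {key: sorted(cells) for key, cells in recovered.items()}

def open_region(region):
    """Step 2: open the region; here just an empty store per family."""
    return {"name": region, "stores": defaultdict(dict)}

def bulkload(region_state, recovered):
    """Step 3: load each recovered file into the matching family's store.
    Cells are sorted by sequence id within a row, so the newest edit wins."""
    for (region, family), cells in recovered.items():
        if region != region_state["name"]:
            continue
        for row, _seq, value in cells:
            region_state["stores"][family][row] = value
    return region_state

wal = [
    (1, "r1", "cf1", "row2", "b"),
    (2, "r1", "cf1", "row1", "a-old"),
    (3, "r1", "cf2", "row1", "c"),
    (4, "r2", "cf1", "row9", "z"),
    (5, "r1", "cf1", "row1", "a-new"),
]
recovered = split_wal_to_hfiles(wal)
r1 = bulkload(open_region("r1"), recovered)
```

In this model the sorting happens once at split time, so opening the region does not need to replay individual edits.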



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
