[ 
https://issues.apache.org/jira/browse/HBASE-23634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163438#comment-17163438
 ] 

Anoop Sam John commented on HBASE-23634:
----------------------------------------

bq. Then one issue is how the system can know whether it is a partially written 
file or a real corruption?
This applies when we split WALs to HFiles and the split of some WAL file fails 
partway through and is reattempted. In the end there may be some duplicate 
HFiles, and some HFiles may be incomplete. When we read a file back 
(verification while loading it into the CF), we cannot tell whether it is a 
partial HFile left by a failed attempt or a file that was really corrupted 
after it was written to the FS. So on a WAL file split failure, a cleanup 
before the next attempt is important. One problem is that there may be N WAL 
files, and all of their splits create the HFiles for a region:cf under the 
same dir, region/cf/recovered.edits. If we want this cleanup, the files need 
to be generated in a way that identifies which WAL file's split produced 
them. Say the HFiles were placed under a region/cf/recovered.edits/<split 
wal file name> dir; we could then clean that dir up before the next attempt. 
Thoughts?
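
To make the proposal concrete, here is a rough sketch of the per-WAL 
directory idea. It is not HBase code: the class and method names 
(PerWalRecoveredDir, outputDirFor, prepareAttempt) are hypothetical, and it 
uses local java.nio paths in place of Hadoop's FileSystem API just to show 
the cleanup-before-retry step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration only; names and layout handling are assumptions. */
final class PerWalRecoveredDir {

  /** Resolves region/cf/recovered.edits/<split wal file name>. */
  static Path outputDirFor(Path regionCfDir, String walFileName) {
    return regionCfDir.resolve("recovered.edits").resolve(walFileName);
  }

  /**
   * Removes anything a previous split attempt of this WAL left behind,
   * then returns a fresh, empty output dir for the new attempt.
   * Output dirs belonging to other WAL files are not touched.
   */
  static Path prepareAttempt(Path regionCfDir, String walFileName) throws IOException {
    Path dir = outputDirFor(regionCfDir, walFileName);
    if (Files.exists(dir)) {
      List<Path> toDelete;
      try (Stream<Path> walk = Files.walk(dir)) {
        // Reverse order puts children before their parent directories.
        toDelete = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      }
      for (Path p : toDelete) {
        Files.delete(p);
      }
    }
    return Files.createDirectories(dir);
  }
}
```

With this layout, any file found under a WAL's dir after a successful split 
was written by that split's last attempt, so a file that fails verification 
there is more likely a real corruption than a leftover partial HFile.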

> Enable "Split WAL to HFile" by default
> --------------------------------------
>
>                 Key: HBASE-23634
>                 URL: https://issues.apache.org/jira/browse/HBASE-23634
>             Project: HBase
>          Issue Type: Task
>    Affects Versions: 3.0.0-alpha-1, 2.3.0
>            Reporter: Guanghao Zhang
>            Priority: Blocker
>             Fix For: 3.0.0-alpha-1
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
