[ 
https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615458#comment-13615458
 ] 

Ted Yu commented on HBASE-7878:
-------------------------------

Interesting idea.

In SplitLogManager#splitLogDistributed(), we can issue a lease recovery request 
right after obtaining the list of log files:
{code}
    FileStatus[] logfiles = getFileList(logDirs, filter);
{code}
One intricacy I can think of: the lease recovery request would be issued from 
the master, so a SplitLogWorker may encounter the following exception while 
that recovery is still in progress:
{code}
Caused by: org.apache.hadoop.hdfs.protocol.RecoveryInProgressException: Failed to close file /user/jenkins/hbase/TestHLog/hlogdir/hlog.1364388129638. Lease recovery is in progress. Try again later.
{code}
Note: RecoveryInProgressException isn't in Hadoop 1.0.
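
To make the retry concern concrete, here is a rough sketch (not the patch) of 
treating both a false return from recoverLease and an "in progress" exception 
as "not closed yet, try again". Everything in it is an assumption made for 
illustration: LeaseRecoverer is a hypothetical stand-in for the HDFS client 
call, RecoveryPendingException is a hypothetical stand-in for 
RecoveryInProgressException so the sketch compiles without Hadoop on the 
classpath, and the attempt count and pause are arbitrary.

```java
import java.io.IOException;

final class LeaseRecoverySketch {

    /** Hypothetical stand-in for HDFS's RecoveryInProgressException. */
    static class RecoveryPendingException extends IOException {
        RecoveryPendingException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the one file system call this sketch needs. */
    interface LeaseRecoverer {
        // Assumed contract: true once the file is closed,
        // false while lease recovery is still pending.
        boolean recoverLease(String path) throws IOException;
    }

    /**
     * Polls recoverLease until it reports the file closed or attempts run out.
     * Returns true only when the file was confirmed closed.
     */
    static boolean recoverUntilClosed(LeaseRecoverer fs, String path,
                                      int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (fs.recoverLease(path)) {
                    return true; // file is closed, safe to read
                }
            } catch (RecoveryPendingException e) {
                // Someone else (e.g. the master) already started recovery;
                // treat it like a false return and retry.
            }
            if (attempt < maxAttempts) {
                Thread.sleep(pauseMillis);
            }
        }
        return false; // caller must not assume the file is closed
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fake recoverer: pending exception, then false, then true.
        LeaseRecoverer fake = path -> {
            calls[0]++;
            if (calls[0] == 1) {
                throw new RecoveryPendingException("Lease recovery is in progress.");
            }
            return calls[0] >= 3;
        };
        boolean closed = recoverUntilClosed(fake, "/hypothetical/hlog", 5, 10L);
        System.out.println("closed=" + closed + " after " + calls[0] + " calls");
    }
}
```

The point of returning a boolean rather than swallowing the outcome is that 
the caller can refuse to split a log whose closure was never confirmed.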
                
> recoverFileLease does not check return value of recoverLease
> ------------------------------------------------------------
>
>                 Key: HBASE-7878
>                 URL: https://issues.apache.org/jira/browse/HBASE-7878
>             Project: HBase
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.95.0, 0.94.6
>            Reporter: Eric Newton
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.95.0, 0.98.0
>
>         Attachments: 7878.94, 7878-94.addendum, 7878-94.addendum2, 
> 7878-addendum.txt, 7878-trunk.addendum, 7878-trunk.addendum2, 
> 7878-trunk-v10.txt, 7878-trunk-v11-test.txt, 7878-trunk-v12.txt, 
> 7878-trunk-v13.txt, 7878-trunk-v14.txt, 7878-trunk-v15.patch, 
> 7878-trunk-v16.txt, 7878-trunk-v2.txt, 7878-trunk-v3.txt, 7878-trunk-v4.txt, 
> 7878-trunk-v5.txt, 7878-trunk-v6.txt, 7878-trunk-v7.txt, 7878-trunk-v8.txt, 
> 7878-trunk-v9.txt, 7878-trunk-v9.txt
>
>
> I think this is a problem, so I'm opening a ticket so an HBase person takes a 
> look.
> Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease 
> recovery for Accumulo after HBase's lease recovery.  During testing, we 
> experienced data loss.  I found it is necessary to wait until recoverLease 
> returns true to know that the file has been truly closed.  In FSHDFSUtils, 
> the return result of recoverLease is not checked. In the unit tests created 
> to check lease recovery in HBASE-2645, the return result of recoverLease is 
> always checked.
> I think FSHDFSUtils should be modified to check the return result, and wait 
> until it returns true.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
