[ 
https://issues.apache.org/jira/browse/HBASE-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13615341#comment-13615341
 ] 

Nicolas Liochon commented on HBASE-7878:
----------------------------------------

bq. I understand that in case false is returned from recoverLease, we would 
wait longer. 
Yes. We were not waiting before, so we were wrong but fast. It seems that the 
recovery takes 5s.

bq. One remedy I can think of is to bundle lease recovery for several files 
together so that the extra wait can be amortized. 
It seems to be a good idea. I will continue to investigate (there are other 
issues as well), but this one seems to be a clear quick win: even if the 
recovery is immediate, we now wait 1s (the first call returns false, so we 
wait 1s and then retry). 
By the way, why is the lease recovery done by the regionserver rather than the 
master? 
Moreover, I would expect only a few WAL files to be open on the HDFS side (the 
one for .meta., the current one, and maybe the previous one if we have just 
rolled). Maybe we should call lease recovery on these first?
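
To make the bundling idea concrete, here is a minimal sketch, not the actual 
patch. It asks for lease recovery on every file first, then pauses once per 
round and retries only the files that returned false, so the 1s wait is paid 
per round rather than per file. The names BatchedLeaseRecovery, LeaseRecoverer 
and recoverAll are hypothetical. LeaseRecoverer is a stand-in for 
DistributedFileSystem.recoverLease(Path), which returns true once the file is 
closed; the real call also throws IOException, which is left out here to keep 
the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchedLeaseRecovery {

    /** Hypothetical stand-in for DistributedFileSystem.recoverLease(Path). */
    interface LeaseRecoverer {
        /** Returns true once the file is closed, false if recovery is still in progress. */
        boolean recoverLease(String path);
    }

    /**
     * Requests lease recovery for all paths, then retries the ones that are
     * still open, sleeping once per round instead of once per file.
     *
     * @return the paths still not recovered after maxAttempts rounds
     */
    static List<String> recoverAll(List<String> paths, LeaseRecoverer fs,
                                   long pauseMs, int maxAttempts) {
        List<String> pending = new ArrayList<>(paths);
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            List<String> stillOpen = new ArrayList<>();
            for (String path : pending) {
                // The return value is the whole point: false means "not closed yet".
                if (!fs.recoverLease(path)) {
                    stillOpen.add(path);
                }
            }
            pending = stillOpen;
            if (pending.isEmpty() || attempt == maxAttempts) {
                break;
            }
            try {
                // One shared pause covers every file that is still recovering.
                Thread.sleep(pauseMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return pending;
    }
}
```

The same loop could take the files in priority order (for instance the WALs 
most likely to be open), since it processes the list in the order given.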
                
> recoverFileLease does not check return value of recoverLease
> ------------------------------------------------------------
>
>                 Key: HBASE-7878
>                 URL: https://issues.apache.org/jira/browse/HBASE-7878
>             Project: HBase
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.95.0, 0.94.6
>            Reporter: Eric Newton
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.95.0, 0.98.0
>
>         Attachments: 7878.94, 7878-94.addendum, 7878-94.addendum2, 
> 7878-addendum.txt, 7878-trunk.addendum, 7878-trunk.addendum2, 
> 7878-trunk-v10.txt, 7878-trunk-v11-test.txt, 7878-trunk-v12.txt, 
> 7878-trunk-v13.txt, 7878-trunk-v14.txt, 7878-trunk-v15.patch, 
> 7878-trunk-v16.txt, 7878-trunk-v2.txt, 7878-trunk-v3.txt, 7878-trunk-v4.txt, 
> 7878-trunk-v5.txt, 7878-trunk-v6.txt, 7878-trunk-v7.txt, 7878-trunk-v8.txt, 
> 7878-trunk-v9.txt, 7878-trunk-v9.txt
>
>
> I think this is a problem, so I'm opening a ticket so an HBase person takes a 
> look.
> Apache Accumulo has moved its write-ahead log to HDFS. I modeled the lease 
> recovery for Accumulo after HBase's lease recovery.  During testing, we 
> experienced data loss.  I found it is necessary to wait until recoverLease 
> returns true to know that the file has been truly closed.  In FSHDFSUtils, 
> the return result of recoverLease is not checked. In the unit tests created 
> to check lease recovery in HBASE-2645, the return result of recoverLease is 
> always checked.
> I think FSHDFSUtils should be modified to check the return result, and wait 
> until it returns true.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
