[ 
https://issues.apache.org/jira/browse/HBASE-13200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357973#comment-14357973
 ] 

He Liangliang commented on HBASE-13200:
---------------------------------------

Yes, we encountered this issue in production and then made this fix.
The fix aims to avoid the worst case, although reaching the 3rd retry 
means something else might be wrong.

> Improper configuration can lead to endless lease recovery during failover
> -------------------------------------------------------------------------
>
>                 Key: HBASE-13200
>                 URL: https://issues.apache.org/jira/browse/HBASE-13200
>             Project: HBase
>          Issue Type: Bug
>          Components: MTTR
>            Reporter: He Liangliang
>            Assignee: He Liangliang
>         Attachments: HBASE-13200.patch
>
>
> When a node (DN+RS) has a machine- or OS-level failure, another RS will try 
> to recover the lease on its log file. From the second attempt on, it retries 
> every hbase.lease.recovery.dfs.timeout (default 61s). When HDFS is not 
> properly configured (e.g. the socket connection timeout) and patch HDFS-4721 
> is not applied, lease recovery can take longer than 
> hbase.lease.recovery.dfs.timeout. Each retry then preempts the recovery 
> already in progress, so the retries continue until the final timeout.
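
The timing problem described above can be sketched with a small simulation. 
This is a simplified model, not HBase code: the class, method, and parameter 
names are hypothetical, and the 4s first pause, 90s recovery time, and 900s 
final timeout are illustrative values. The model assumes every retry preempts 
the in-flight recovery, which then starts over. Lengthening the wait on each 
retry is shown only as one possible mitigation; it is an assumption and not 
necessarily what HBASE-13200.patch does.

```java
class LeaseRecoveryRetrySketch {
    /**
     * Returns the 1-based attempt on which lease recovery completes, or -1 if
     * the final timeout is reached first.
     *
     * Model assumption: each attempt restarts recovery from scratch, so
     * recovery only completes if it fits inside a single wait window.
     */
    static int attemptsUntilRecovered(long recoveryMs, long firstPauseMs,
                                      long retryBaseMs, boolean escalate,
                                      long finalTimeoutMs) {
        long elapsed = 0;
        for (int attempt = 1; ; attempt++) {
            long wait;
            if (attempt == 1) {
                wait = firstPauseMs;                  // short pause after the first call
            } else if (escalate) {
                wait = retryBaseMs * (attempt - 1);   // hypothetical: wait longer each retry
            } else {
                wait = retryBaseMs;                   // fixed interval between retries
            }
            // Recovery finishes only if it fits in this window and in the overall budget.
            if (recoveryMs <= wait && elapsed + recoveryMs <= finalTimeoutMs) {
                return attempt;
            }
            elapsed += wait;                          // the next attempt preempts this one
            if (elapsed >= finalTimeoutMs) {
                return -1;                            // gave up at the final timeout
            }
        }
    }

    public static void main(String[] args) {
        long recoveryMs = 90_000;   // illustrative: recovery slower than the 61s interval
        System.out.println("fixed interval: "
            + attemptsUntilRecovered(recoveryMs, 4_000, 61_000, false, 900_000));
        System.out.println("escalating interval: "
            + attemptsUntilRecovered(recoveryMs, 4_000, 61_000, true, 900_000));
    }
}
```

With a fixed 61s interval, a recovery that needs 90s never fits in one window 
in this model, so the loop runs until the final timeout. With the escalating 
wait, the third attempt has a 122s window and the recovery completes.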



