[ 
https://issues.apache.org/jira/browse/HBASE-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15553284#comment-15553284
 ] 

Jonathan Hsieh edited comment on HBASE-16775 at 10/6/16 9:49 PM:
-----------------------------------------------------------------

Based on what you are saying, it would be better to always inject 
deterministic failures when in testing mode so that we always exercise the 
failure recovery paths of the manifests.

This probably could be done differently than it is in the current code. 
Realistically, it might be better to inject a fault / failure in the middle of 
a file copy instead of once per map task, so that we exercise the "sameFile" 
optimization in the case of a failure or a retry. [1]

[1] 
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#L262
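
To make the mid-copy idea concrete, here is a minimal sketch of a deterministic 
fault injector. It is not taken from ExportSnapshot: the class name 
MidCopyFaultInjector, its fields, and the 4-byte buffer are all hypothetical, 
and it assumes the caller can supply the attempt number (in a real mapper this 
would presumably come from the task attempt ID). The first N attempts fail 
after a fixed number of bytes have been written, so a retry always sees a 
partially copied file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical test-only helper; not part of ExportSnapshot. */
final class MidCopyFaultInjector {
    private final int failingAttempts;  // attempts 0..failingAttempts-1 fail
    private final long failAfterBytes;  // bytes written before the injected fault

    MidCopyFaultInjector(int failingAttempts, long failAfterBytes) {
        if (failingAttempts < 0 || failAfterBytes < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        this.failingAttempts = failingAttempts;
        this.failAfterBytes = failAfterBytes;
    }

    /**
     * Copies in to out and returns the number of bytes copied. When
     * attempt < failingAttempts, exactly failAfterBytes bytes are written
     * and then an IOException is thrown, provided the input is longer
     * than failAfterBytes.
     */
    long copy(InputStream in, OutputStream out, int attempt) throws IOException {
        boolean shouldFail = attempt < failingAttempts;
        byte[] buf = new byte[4];  // deliberately tiny so the fault lands mid-file
        long copied = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            if (shouldFail && copied + n > failAfterBytes) {
                // Write only up to the fault point, leaving a partial file behind.
                out.write(buf, 0, (int) (failAfterBytes - copied));
                throw new IOException("injected fault on attempt " + attempt
                    + " after " + failAfterBytes + " bytes");
            }
            out.write(buf, 0, n);
            copied += n;
        }
        return copied;
    }
}
```

Because the failure depends only on the attempt number and a byte offset, the 
test would no longer rely on chance to reach the recovery path, and the retry 
would run against a partially written destination.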




was (Author: jmhsieh):
Based on what you are saying, it would be better to always inject 
deterministic failures when in testing mode so that we always exercise the 
failure recovery paths of the manifests.

This probably could be done differently than it is in the current code -- a 
subclass of ExportSnapshot that fails every nth attempt.  Realistically it 
might be interesting to have a fault / failure mid file copy instead of per map 
so that we exercise the "sameFile" optimization in the case of a failure or a 
retry. [1]

[1] 
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#L262



> Flakey test with TestExportSnapshot#testExportRetry and 
> TestMobExportSnapshot#testExportRetry 
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16775
>                 URL: https://issues.apache.org/jira/browse/HBASE-16775
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: huaxiang sun
>            Assignee: huaxiang sun
>
> The root cause is that conf.setInt("mapreduce.map.maxattempts", 10) is not 
> taken by the mapper job, so the retry is actually 0. Debugging to see why 
> this is the case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
