[ https://issues.apache.org/jira/browse/TEZ-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14357592#comment-14357592 ]

Hitesh Shah commented on TEZ-1909:
----------------------------------

Minor comments: 

 - Data can be partially written, so there might be an error when reading the 
last record. Maybe we should add a simulated test for this: write invalid data 
to the end of an intermediate summary file and dag file, and check whether the 
code handles it correctly?
 - skipAllOtherEvents should probably be a flag that applies across all files 
for a given dag. At the moment, it is considered only within a single dag file 
and then reset. 
 - The log line "LOG.info("isSpeculationEnabled:" + isSpeculationEnabled);" was 
removed - not sure why. 
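
To make the first point concrete, here is a rough sketch of the kind of simulated test I have in mind. It does not use RecoveryParser or the real recovery file format: the length-prefixed record layout and the names TruncatedTailSketch, build and readAll are made up for illustration only. The idea is just to append junk after the last good record and check that the reader keeps everything before it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustration only: a hypothetical record format (4-byte length + UTF-8 payload),
// not the format Tez actually writes to its recovery files.
public class TruncatedTailSketch {

    // Writes the given payloads as complete records, then appends raw tail bytes
    // to simulate a partially written or corrupt last record.
    static byte[] build(byte[] tail, String... payloads) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String payload : payloads) {
                byte[] b = payload.getBytes(StandardCharsets.UTF_8);
                out.writeInt(b.length);
                out.write(b);
            }
            out.write(tail);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads complete records and stops quietly at a damaged tail.
    static List<String> readAll(byte[] data) {
        List<String> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int len = in.readInt(); // EOFException if fewer than 4 bytes remain
                // available() is exact for an in-memory stream; a real file
                // stream would need a different bounds check.
                if (len < 0 || len > in.available()) {
                    break; // length prefix points past the end: treat as corrupt tail
                }
                byte[] buf = new byte[len];
                in.readFully(buf);
                records.add(new String(buf, StandardCharsets.UTF_8));
            }
        } catch (EOFException e) {
            // Partial length prefix at the tail: keep what was read so far.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Two good records followed by three stray bytes (an incomplete length prefix).
        byte[] data = build(new byte[] {0, 0, 0}, "event-1", "event-2");
        System.out.println("recovered records: " + readAll(data).size());
    }
}
```

A real test would do the same against an actual summary file and dag recovery file from an intermediate attempt, and assert that recovery still proceeds with the events before the bad tail.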

{code}
    for (int attemptNum = 1; attemptNum <= 3; ++attemptNum) {
      List<HistoryEvent> historyEvents = new ArrayList<HistoryEvent>();
      for (int i = 1; i <= attemptNum; ++i) {
        Path currentAttemptRecoveryDataDir =
            TezCommonUtils.getAttemptRecoveryPath(recoveryDataDir, i);
        Path recoveryFilePath = new Path(currentAttemptRecoveryDataDir,
            appId.toString().replace("application", "dag") + "_1"
                + TezConstants.DAG_RECOVERY_RECOVER_FILE_SUFFIX);
        historyEvents.addAll(RecoveryParser.parseDAGRecoveryFile(
            fs.open(recoveryFilePath)));
      }
{code}

The above code needs a bit of cleanup - not sure why we need 2 loops for the 3 
attempts' recovery data. 
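
One possible shape for that cleanup, as a sketch only: the snippet above is cut off, so I am assuming the outer loop just needs the events of attempts 1..attemptNum at each step. Under that assumption a single loop with a running list parses each attempt's file once. The class CumulativeAttemptsSketch, the parseAttempt helper and the in-memory map of "files" are stand-ins invented for this example, not Tez code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: strings stand in for HistoryEvent, and a map stands in
// for the per-attempt recovery files on the file system.
public class CumulativeAttemptsSketch {

    // Hypothetical stand-in for parsing one attempt's recovery file.
    static List<String> parseAttempt(Map<Integer, List<String>> files, int attempt) {
        return files.getOrDefault(attempt, Collections.<String>emptyList());
    }

    // For each attempt N in 1..maxAttempt, returns the events of attempts 1..N,
    // parsing every attempt's file exactly once.
    static List<List<String>> cumulativeEvents(Map<Integer, List<String>> files, int maxAttempt) {
        List<List<String>> perAttempt = new ArrayList<>();
        List<String> soFar = new ArrayList<>();
        for (int attempt = 1; attempt <= maxAttempt; ++attempt) {
            soFar.addAll(parseAttempt(files, attempt));
            perAttempt.add(new ArrayList<>(soFar)); // snapshot for this attempt
        }
        return perAttempt;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> files = new HashMap<>();
        files.put(1, Arrays.asList("dag-submitted", "vertex-started"));
        files.put(2, Arrays.asList("task-finished"));
        files.put(3, Arrays.asList("dag-finished"));
        for (List<String> events : cumulativeEvents(files, 3)) {
            System.out.println(events.size() + " events: " + events);
        }
    }
}
```

If the outer loop body actually needs something else per attempt, the nested form may be justified, but then a short comment in the test explaining why would help.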



> Remove need to copy over all events from attempt 1 to attempt 2 dir
> -------------------------------------------------------------------
>
>                 Key: TEZ-1909
>                 URL: https://issues.apache.org/jira/browse/TEZ-1909
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Hitesh Shah
>            Assignee: Jeff Zhang
>         Attachments: TEZ-1909-1.patch
>
>
> Use of file versions should prevent the need for copying over data into a 
> second attempt dir. Care needs to be taken to handle "last corrupt record" 
> handling. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
