[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559422#comment-13559422
 ] 

Siddharth Seth commented on MAPREDUCE-4838:
-------------------------------------------

Thanks for working on this, Zhijie.
bq. There's no JobInProgress (actually nearly empty) and TaskInProgress, where 
the locality and the avataar attributes are set and logged. Instead, "avataar" 
is now set in TaskImpl#addAndScheduleAttempt by judging whether there are other 
active task attempts, while "locality" is set in 
TaskAttemptImpl#ContainerAssignedTransition#transition by judging whether the 
assigned container's host is within the local host/rack list of the task 
attempt.
Right. I think you've got these, as well as the other changes, mapped to the 
correct places for trunk.
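
For illustration, the locality judgment described in the quote above could be 
sketched roughly as follows. This is not the code from the patch: the enum is a 
stand-in, RACK_LOCAL is assumed (only NODE_LOCAL and OFF_SWITCH are named in 
this comment), and LocalitySketch#classify and its parameters are hypothetical 
names. The caller is assumed to have resolved the container's host and rack 
once and to pass them in.

```java
import java.util.Set;

// Hypothetical stand-in for the Locality enum discussed in this thread.
// RACK_LOCAL is an assumption; the comment names NODE_LOCAL and OFF_SWITCH.
enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

final class LocalitySketch {
  private LocalitySketch() {}

  // containerHost and containerRack are assumed to be resolved once by the
  // caller, so no name resolution happens inside this method.
  static Locality classify(String containerHost, String containerRack,
      Set<String> localHosts, Set<String> localRacks) {
    Locality locality = Locality.OFF_SWITCH; // default when nothing matches
    if (containerHost != null && localHosts.contains(containerHost)) {
      locality = Locality.NODE_LOCAL;
    } else if (containerRack != null && localRacks.contains(containerRack)) {
      locality = Locality.RACK_LOCAL;
    }
    return locality;
  }
}
```

The host check comes first so that a container on a preferred host is never 
reported as merely rack-local.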

Comments on the patch
- Some lines exceed the 80 column limit. (Coding style guidelines at 
http://wiki.apache.org/hadoop/CodeReviewChecklist)
- Job specific configuration strings like "mapreduce.workflow.id" etc should be 
in MRJobConfig
- TaskAttemptImpl - Default locality set to NODE_LOCAL. Should be OFF_SWITCH to 
match branch-1.
- In TaskAttemptImpl, avoid resolving the host names multiple times.
- In TaskImpl, SPECULATIVE should be set only when the 
RedundantScheduleTransition is taken, not for retries caused by FAILED / KILLED 
attempts. The same likely applies to the branch-1 patch.
- In the history events (JobSubmittedEvent etc) - add a null check on the new 
strings. (The Utf8 constructor does not work with nulls.)
- The toString implementations in the history events, as well as in the Rumen 
events, are not really needed. If implemented, they should include additional 
fields.
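
To make two of the points above concrete, here is a rough sketch, not taken 
from the patch. It assumes an Avataar enum with a VIRGIN value alongside 
SPECULATIVE (VIRGIN is not named in this comment), and it uses hypothetical 
names throughout: ScheduleReason, ReviewPointsSketch, avataarFor and 
nullToEmpty. Substituting an empty string for a null is only one possible 
guard; skipping the field when the value is null would be another.

```java
// Hypothetical stand-in for the Avataar enum discussed in this thread.
// VIRGIN is an assumption; the comment only names SPECULATIVE.
enum Avataar { VIRGIN, SPECULATIVE }

// Why a new attempt is being scheduled (illustrative, not Hadoop's types).
enum ScheduleReason { INITIAL, RETRY_AFTER_FAILED_OR_KILLED, REDUNDANT }

final class ReviewPointsSketch {
  private ReviewPointsSketch() {}

  // Only a redundant (speculative) schedule yields SPECULATIVE. The first
  // attempt and retries after FAILED / KILLED attempts stay VIRGIN.
  static Avataar avataarFor(ScheduleReason reason) {
    return reason == ScheduleReason.REDUNDANT
        ? Avataar.SPECULATIVE
        : Avataar.VIRGIN;
  }

  // Guard to apply before handing a possibly-null string to a constructor
  // that rejects nulls.
  static String nullToEmpty(String value) {
    return value == null ? "" : value;
  }
}
```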

The additional information may be useful to expose - via the UI at least - in 
which case it'll need to be exposed via TaskAttemptReport. This can be done in 
a follow-up jira. For now, the Locality and Avataar enums could be moved to 
mrv2 - the hadoop-mapreduce-client-common module (ref TaskType).
                
> Add extra info to JH files
> --------------------------
>
>                 Key: MAPREDUCE-4838
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Arun C Murthy
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838.patch
>
>
> It will be useful to add more task-info to JH for analytics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
