[
https://issues.apache.org/jira/browse/MAPREDUCE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559422#comment-13559422
]
Siddharth Seth commented on MAPREDUCE-4838:
-------------------------------------------
Thanks for working on this, Zhijie.
bq. There's no JobInProgress (actually nearly empty) and TaskInProgress, where
the locality and the avataar attributes are set and logged. Instead, "avataar"
is now set in TaskImpl#addAndScheduleAttempt by judging whether there are other
active task attempts, while "locality" is set in
TaskAttemptImpl#ContainerAssignedTransition#transition by judging whether the
assigned container's host is within the local host/rack list of the task
attempt.
Right. I think you've got these, as well as the other changes, mapped to the
correct places for trunk.
Comments on the patch:
- Some lines exceed the 80 column limit. (Coding style guidelines at
http://wiki.apache.org/hadoop/CodeReviewChecklist)
- Job-specific configuration strings like "mapreduce.workflow.id" etc. should
be defined in MRJobConfig.
- In TaskAttemptImpl, the default locality is set to NODE_LOCAL. It should be
OFF_SWITCH to match branch-1.
- In TaskAttemptImpl, avoid resolving the host names multiple times.
- In TaskImpl, SPECULATIVE should be set only when the
RedundantScheduleTransition is taken, not for retries caused by FAILED / KILLED
attempts. The same likely applies to the branch-1 patch.
- In the history events (JobSubmittedEvent etc.), add a null check on the new
strings. (The Utf8 constructor does not work with nulls.)
- The toString implementations in the history events, as well as in the Rumen
events, are not really needed. If implemented, they should include additional
fields.
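
To illustrate the two TaskAttemptImpl points above (default to OFF_SWITCH,
and resolve the container's host only once), here is a minimal, standalone
sketch. It is not the patch code: the class name, the classify method, and the
rackResolver parameter are hypothetical stand-ins, and the enum simply mirrors
the three locality values named in this thread.

```java
import java.util.Set;
import java.util.function.Function;

public class LocalitySketch {
    /** Mirrors the locality levels discussed above (assumed names). */
    public enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    /**
     * Classifies an assigned container against a task attempt's
     * data-local hosts and racks. rackResolver is a hypothetical
     * stand-in for whatever maps a host name to its rack.
     */
    public static Locality classify(String containerHost,
            Function<String, String> rackResolver,
            Set<String> localHosts, Set<String> localRacks) {
        // Start from OFF_SWITCH so an unmatched container is never
        // reported as node-local by default.
        Locality locality = Locality.OFF_SWITCH;
        if (localHosts.contains(containerHost)) {
            locality = Locality.NODE_LOCAL;
        } else {
            // Resolve the rack once and reuse the result.
            String containerRack = rackResolver.apply(containerHost);
            if (localRacks.contains(containerRack)) {
                locality = Locality.RACK_LOCAL;
            }
        }
        return locality;
    }
}
```

In the real code the host and rack sets would presumably be computed once per
task attempt and cached, rather than rebuilt on each container assignment.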
The additional information may be useful to expose, via the UI at least, in
which case it'll need to be exposed via TaskAttemptReport. This can be done in
a follow-up JIRA. For now, the Locality and Avataar enums could be moved to
mrv2 - the hadoop-mapreduce-client-common module (ref TaskType).
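
As a rough illustration of the SPECULATIVE rule from the review list, the
sketch below marks an attempt as speculative only when it was launched by a
redundant schedule. The ScheduleReason enum and the avataarFor method are
hypothetical and exist only for this example; the Avataar values are assumed
to be VIRGIN and SPECULATIVE, as in branch-1.

```java
public class AvataarSketch {
    /** Assumed to match the branch-1 Avataar values. */
    public enum Avataar { VIRGIN, SPECULATIVE }

    /** Hypothetical: why a new task attempt is being scheduled. */
    public enum ScheduleReason {
        INITIAL,
        RETRY_AFTER_FAILED_OR_KILLED,
        REDUNDANT_SCHEDULE
    }

    /**
     * Only an attempt started alongside a still-running one
     * (a redundant schedule) counts as speculative; a retry after
     * a FAILED or KILLED attempt does not.
     */
    public static Avataar avataarFor(ScheduleReason reason) {
        return reason == ScheduleReason.REDUNDANT_SCHEDULE
                ? Avataar.SPECULATIVE
                : Avataar.VIRGIN;
    }
}
```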
> Add extra info to JH files
> --------------------------
>
> Key: MAPREDUCE-4838
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4838
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Arun C Murthy
> Assignee: Zhijie Shen
> Attachments: MAPREDUCE-4838_1.patch, MAPREDUCE-4838.patch
>
>
> It will be useful to add more task-info to JH for analytics.