[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798635#action_12798635
 ] 

Hemanth Yamijala commented on MAPREDUCE-1316:
---------------------------------------------

Some minor comments on the main patch:

- I would recommend we log removal of a task attempt mapping in removeTaskEntry 
only if something is actually removed. Otherwise there will be duplicate log 
lines, which could bloat the logs.
- I would suggest introducing a package-private API like 
JobInProgress.getAllTasks() that returns all the tasks (setup, cleanup, maps 
and reduces). This would cut duplicate code in removeJobTasks and would also be 
useful if task types change in the future.
- Rather than enumerate the task types in the javadoc of getTaskType, I would 
suggest we expand on the details of what is returned without actually 
enumerating all the task types. Basically, it is clear this API should return 
the task type of a task, depending on the nature of the task rather than on the 
task attempt id. This way we wouldn't need to worry too much about keeping the 
comment in sync.
- "Get task-attempt-ids for all the tasks." seems incorrect for 
getAllTaskIDs. It should be "Get all {@link TaskAttemptID}s for a given 
{@link TaskInProgress}". In fact, as you suggested, the API could also be 
renamed to getAllTaskAttemptIDs to be more accurate.
- I am a little worried that we are returning the data structure 'tasks' 
directly to the caller of getAllTaskIDs, since it is a modifiable collection. 
Should we return a copy, or maybe an array of these objects, as we do for 
getTaskStatuses()?
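
To make the first two points concrete, here is a rough sketch of what I have in mind. It is not taken from the patch: SketchJob and SketchTracker are hypothetical, simplified stand-ins (tasks and attempt ids are plain strings, and a list stands in for the logger), so only the shape of removeTaskEntry and getAllTasks is meant to carry over.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a job; not the real JobInProgress. */
class SketchJob {
    final List<String> setups = new ArrayList<>();
    final List<String> cleanups = new ArrayList<>();
    final List<String> maps = new ArrayList<>();
    final List<String> reduces = new ArrayList<>();

    /** Package-private: the single place that knows which task kinds exist. */
    List<String> getAllTasks() {
        List<String> all = new ArrayList<>();
        all.addAll(setups);
        all.addAll(cleanups);
        all.addAll(maps);
        all.addAll(reduces);
        return all;
    }
}

/** Hypothetical stand-in for the tracker-side attempt-to-task mapping. */
class SketchTracker {
    final Map<String, String> attemptToTask = new HashMap<>();
    final List<String> log = new ArrayList<>(); // stands in for a real logger

    /** Removes the mapping and logs only when an entry was actually removed. */
    boolean removeTaskEntry(String attemptId) {
        String removed = attemptToTask.remove(attemptId);
        if (removed != null) {
            log.add("Removed mapping for " + attemptId);
            return true;
        }
        return false; // nothing was mapped, so nothing is logged
    }
}
```

With something like getAllTasks() in place, removeJobTasks could loop over one collection instead of repeating the same loop per task kind, and a repeated removeTaskEntry call for the same attempt would stay silent.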
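
On the last point, the two options I am thinking of look roughly like this. Again this is only a sketch under assumptions: SketchTip is a hypothetical stand-in for TaskInProgress, attempt ids are plain strings, and I am assuming 'tasks' is a sorted set, which may not match the actual field.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a task-in-progress; not the real TaskInProgress. */
class SketchTip {
    // Assumed to be a sorted set of attempt ids; the real field may differ.
    private final Set<String> tasks = new TreeSet<>();

    void addAttempt(String attemptId) {
        tasks.add(attemptId);
    }

    /** Option 1: return a snapshot array, so callers cannot touch internal state. */
    String[] getAllTaskAttemptIDs() {
        return tasks.toArray(new String[0]);
    }

    /** Option 2: return a read-only view; it tracks later changes but rejects writes. */
    Set<String> getAllTaskAttemptIDsView() {
        return Collections.unmodifiableSet(tasks);
    }
}
```

The array is a point-in-time copy and costs an allocation per call; the unmodifiable view is cheaper but still reflects later additions, so a caller iterating over it would need the same locking as the owner.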

> JobTracker holds stale references to retired jobs via unreported tasks 
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1316
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1316
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: mapreduce-1316-v1.11.patch, mapreduce-1316-v1.7.patch
>
>
> JobTracker fails to remove _unreported_ tasks' mapping from _taskToTIPMap_ if 
> the job finishes and retires. _Unreported tasks_ refers to tasks that were 
> scheduled but the tasktracker did not report back with the task status. In 
> such cases a stale reference is held to TaskInProgress (and thus 
> JobInProgress) long after the job is gone leading to memory leak.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
