[ https://issues.apache.org/jira/browse/MAPREDUCE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798635#action_12798635 ]

Hemanth Yamijala commented on MAPREDUCE-1316:
---------------------------------------------

Some minor comments on the main patch:

- I would recommend we log removal of a task attempt mapping in removeTaskEntry only if something is removed. Otherwise, I see there will be duplicate log lines and this might be heavy on the logs.
- I would suggest introducing a package private API like JobInProgress.getAllTasks() which returns all the tasks - like setup, cleanup, maps and reduces. This helps cut duplicate code in removeJobTasks and would also be useful if task types change in future.
- Rather than enumerate the task types in the javadoc of getTaskType, I would suggest we expand on the details of what is returned without actually enumerating all the task types. Basically, it is clear this API should return the task type of a task, depending on the nature of the task rather than on the task attempt id. This way we wouldn't need to worry too much about keeping the comment in sync.
- "Get task-attempt-ids for all the tasks." - Seems incorrect for getAllTaskIDs. It should be "Get all {@link TaskAttemptID}s for a given {@link TaskInProgress}". In fact, maybe as you suggested the API can also be changed to getAllTaskAttemptIDs to be more correct.
- I am a little worried we are returning the data structure 'tasks' directly to the caller of getAllTaskIDs. Primarily I am worried that this is a modifiable collection. Should we make it a copy or maybe return an array of these objects like we do for getTaskStatuses()?
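
To make the first, second and last points concrete, here is a rough sketch of what I have in mind. It uses simplified stand-in classes (Tip, Job, Tracker) and plain String ids rather than the real TaskInProgress, JobInProgress, JobTracker and TaskAttemptID, and the field layout is assumed for illustration only, so please treat it as a sketch and not as a proposed patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReviewSketch {

    // Stand-in for TaskInProgress (hypothetical, heavily simplified).
    static class Tip {
        // attempt id -> tracker name; assumed shape of the 'tasks' structure
        private final Map<String, String> tasks = new TreeMap<>();

        void addAttempt(String attemptId, String tracker) {
            tasks.put(attemptId, tracker);
        }

        // Get all attempt ids for this task. Returns a fresh array, so the
        // caller cannot modify the internal 'tasks' map.
        String[] getAllTaskAttemptIDs() {
            return tasks.keySet().toArray(new String[0]);
        }
    }

    // Stand-in for JobInProgress (hypothetical).
    static class Job {
        final List<Tip> setup = new ArrayList<>();
        final List<Tip> cleanup = new ArrayList<>();
        final List<Tip> maps = new ArrayList<>();
        final List<Tip> reduces = new ArrayList<>();

        // Single place that knows about every kind of task the job has.
        List<Tip> getAllTasks() {
            List<Tip> all = new ArrayList<>();
            all.addAll(setup);
            all.addAll(cleanup);
            all.addAll(maps);
            all.addAll(reduces);
            return Collections.unmodifiableList(all);
        }
    }

    // Stand-in for the JobTracker bookkeeping (hypothetical).
    static class Tracker {
        final Map<String, Tip> taskToTIPMap = new HashMap<>();
        final List<String> log = new ArrayList<>(); // stands in for LOG.info

        // Log only when a mapping was actually removed.
        boolean removeTaskEntry(String attemptId) {
            Tip removed = taskToTIPMap.remove(attemptId);
            if (removed == null) {
                return false; // nothing to remove, nothing to log
            }
            log.add("Removing task '" + attemptId + "'");
            return true;
        }

        // No per-type loops here; getAllTasks() hides the task types.
        void removeJobTasks(Job job) {
            for (Tip tip : job.getAllTasks()) {
                for (String attemptId : tip.getAllTaskAttemptIDs()) {
                    removeTaskEntry(attemptId);
                }
            }
        }
    }

    public static void main(String[] args) {
        Tip map0 = new Tip();
        map0.addAttempt("attempt_m_000000_0", "tracker_a");
        Job job = new Job();
        job.maps.add(map0);

        Tracker jt = new Tracker();
        jt.taskToTIPMap.put("attempt_m_000000_0", map0);
        jt.removeJobTasks(job);
        jt.removeJobTasks(job); // second pass removes nothing and logs nothing
        System.out.println("log lines: " + jt.log.size());
    }
}
```

With this shape, removeJobTasks does not need to change if a new task type is added, and calling it twice for the same job does not produce a second set of log lines.
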

> JobTracker holds stale references to retired jobs via unreported tasks
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1316
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1316
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: mapreduce-1316-v1.11.patch, mapreduce-1316-v1.7.patch
>
>
> JobTracker fails to remove _unreported_ tasks' mapping from _taskToTIPMap_ if
> the job finishes and retires. _Unreported tasks_ refers to tasks that were
> scheduled but the tasktracker did not report back with the task status. In
> such cases a stale reference is held to TaskInProgress (and thus
> JobInProgress) long after the job is gone leading to memory leak.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.