[
https://issues.apache.org/jira/browse/HADOOP-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632508#action_12632508
]
Amar Kamat commented on HADOOP-4209:
------------------------------------
Owen,
The timestamp was introduced for a different purpose. Consider a case where the
JT runs for the first time and schedules an attempt from a TIP, tip_0. Let the
attempt be attempt_0_0. Let us assume that the jobtracker crashes while
attempt_0_0 is still running. Upon restart, the JT might not know about the
attempt, since it was not logged to the history, and so will go ahead and
schedule another attempt from tip_0. This new attempt will have the same id as
the earlier attempt (before restart), and hence the side-effect files will
clash. Now consider a case where attempt_0_0 is done but not logged. In such a
case the same tracker can ask for an attempt from tip_0 and might get
attempt_0_0 again. Now not only the side-effect files but also the local
directories will clash. To avoid such issues we have kept the attempt id unique
across restarts. The timestamp is redundant only if the jobtracker never
restarts.
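
To make the clash concrete, here is a minimal sketch of the scenario described
above. It is not the actual JobTracker code: the Scheduler class, the id format
and the start-time values are all hypothetical, chosen only to show that a
per-TIP counter which is lost on restart hands out the same attempt id again
unless something restart-specific, such as the JT start time, is part of the id.

```java
import java.util.HashMap;
import java.util.Map;

public class AttemptIdSketch {

    /** Hypothetical stand-in for a JobTracker that hands out attempt ids. */
    public static class Scheduler {
        private final long startTime;            // JT start time in milliseconds
        private final boolean includeStartTime;  // whether ids carry the start time
        // Next attempt number per TIP. This state is lost when the JT restarts
        // and the running attempt was never logged to the history.
        private final Map<Integer, Integer> nextAttempt = new HashMap<>();

        public Scheduler(long startTime, boolean includeStartTime) {
            this.startTime = startTime;
            this.includeStartTime = includeStartTime;
        }

        /** Returns an illustrative attempt id for the given TIP index. */
        public String schedule(int tip) {
            int n = nextAttempt.merge(tip, 1, Integer::sum) - 1;
            String base = "attempt_" + tip + "_" + n;
            return includeStartTime ? base + "_" + startTime : base;
        }
    }

    public static void main(String[] args) {
        // Without the start time: the restarted JT reissues the same id.
        String before = new Scheduler(1000L, false).schedule(0);
        String after = new Scheduler(2000L, false).schedule(0);
        System.out.println(before + " vs " + after + " clash=" + before.equals(after));

        // With the start time: ids stay unique across the restart.
        String beforeTs = new Scheduler(1000L, true).schedule(0);
        String afterTs = new Scheduler(2000L, true).schedule(0);
        System.out.println(beforeTs + " vs " + afterTs + " clash=" + beforeTs.equals(afterTs));
    }
}
```

Each new Scheduler instance models a JT restart with empty in-memory state. The
same effect could be had by putting the restart-specific value in the job id
rather than in every attempt id, which is what the issue proposes.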
> The TaskAttemptID should not have the JobTracker start time
> -----------------------------------------------------------
>
> Key: HADOOP-4209
> URL: https://issues.apache.org/jira/browse/HADOOP-4209
> Project: Hadoop Core
> Issue Type: Bug
> Reporter: Owen O'Malley
> Priority: Blocker
> Fix For: 0.19.0
>
>
> The TaskAttemptID now includes a redundant copy of the JobTracker's start
> time in milliseconds. We should instead change the JobID to have the longer
> unique string.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.