[ https://issues.apache.org/jira/browse/YARN-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034746#comment-14034746 ]
Tsuyoshi OZAWA commented on YARN-2052:
--------------------------------------

{quote}
We should make it a long in the same release as the epoch number addition so that we don't have to worry about that.
{quote}
+1 to doing this in the same release. We plan to make that improvement in a separate JIRA. That is fine, but I think it is important that we decide on the behavior when the overflow happens. We have two options: abort the RM for now, or start apps from a clean state after the restart. Since we plan to make the id a long right after this JIRA, we can take the abort approach for simplicity and to prevent unexpected behavior.

[~bikassaha], [~jianhe], what do you think about this?

> ContainerId creation after work preserving restart is broken
> ------------------------------------------------------------
>
>                 Key: YARN-2052
>                 URL: https://issues.apache.org/jira/browse/YARN-2052
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2052.1.patch, YARN-2052.2.patch, YARN-2052.3.patch
>
>
> Container ids are made unique by using the app identifier and appending a
> monotonically increasing sequence number to it. Since container creation is a
> high-churn activity, the RM does not store the sequence number per app. So
> after a restart it does not know what the new sequence number should be for new
> allocations.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
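
For illustration only, the id scheme in the issue description and the "abort on overflow" option from the comment could be sketched roughly as follows. This is a minimal sketch, not YARN's actual code: the class `ContainerIdSketch`, the method `nextContainerId`, and the `appId_seq` string format are hypothetical names made up for this example, and it assumes an in-memory int counter per app that is not persisted.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-app container id generator; not YARN's real API. */
class ContainerIdSketch {
    private final String appId;
    // In-memory only: this counter is lost when the process restarts.
    private final AtomicInteger lastSeq;

    ContainerIdSketch(String appId, int lastSeq) {
        this.appId = appId;
        this.lastSeq = new AtomicInteger(lastSeq);
    }

    /**
     * Returns the app identifier with the next sequence number appended.
     * Math.incrementExact throws ArithmeticException on int overflow, which
     * stands in for the "abort" option; the counter is left unchanged then.
     */
    String nextContainerId() {
        int seq = lastSeq.updateAndGet(s -> Math.incrementExact(s));
        return appId + "_" + seq;
    }
}
```

A generator recreated after a restart with the counter reset to 0 hands out the same ids again, which is the collision the issue describes. A generator whose counter is already at `Integer.MAX_VALUE` throws instead of silently wrapping around.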