[ https://issues.apache.org/jira/browse/YARN-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490484#comment-14490484 ]
Jian He commented on YARN-3387:
-------------------------------

[~sandflee], thanks for the patch!

Is the newly added shareStateWithCurrentAttempt the same as transferStateFromPreviousAttempt? We can just use the latter and maybe rename it to transferStateFromAttempt.
Could you add a test case too? TestWorkPreservingRMRestart has some example tests.

> container complete message couldn't pass to am if am restarted and rm changed
> -----------------------------------------------------------------------------
>
>                 Key: YARN-3387
>                 URL: https://issues.apache.org/jira/browse/YARN-3387
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: sandflee
>            Priority: Critical
>              Labels: patch
>         Attachments: YARN-3387.001.patch
>
>
> Suppose work-preserving AM restart and RM HA are enabled.
> The container-complete message is passed to appAttempt.justFinishedContainers in
> the RM. In the normal situation, all attempts of one app share the same
> justFinishedContainers, but when the RM changes, every attempt has its own
> justFinishedContainers, so in the situation below the container-complete message
> cannot be passed to the AM:
> 1. AM restarts
> 2. RM changes
> 3. a container launched by the first AM completes
> The container-complete message will be passed to appAttempt1, not appAttempt2, but
> the AM pulls finished containers from appAttempt2 (currentAppAttempt).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
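
The scenario in the issue description can be illustrated with a toy model. This is a minimal Java sketch, not ResourceManager code: the Attempt class and all of its methods are hypothetical stand-ins, and only the field name justFinishedContainers is taken from the issue text. It assumes that "sharing" means two attempts hold a reference to the same list, and that after an RM change each recovered attempt starts with its own empty list.

```java
import java.util.ArrayList;
import java.util.List;

public class JustFinishedContainersSketch {

    /** Toy stand-in for an application attempt (hypothetical, not the real RM class). */
    static final class Attempt {
        // Not final, so a later attempt can adopt an earlier attempt's list.
        List<String> justFinishedContainers = new ArrayList<>();

        /** Hypothetical analogue of transferring state: reuse the previous attempt's list. */
        void transferStateFrom(Attempt previous) {
            this.justFinishedContainers = previous.justFinishedContainers;
        }

        /** Records a completed container on this attempt. */
        void containerFinished(String containerId) {
            justFinishedContainers.add(containerId);
        }

        /** Returns and clears the accumulated completions, as an AM pull would. */
        List<String> pullJustFinishedContainers() {
            List<String> pulled = new ArrayList<>(justFinishedContainers);
            justFinishedContainers.clear();
            return pulled;
        }
    }

    /** Normal case: the second attempt shares the first attempt's list. */
    static List<String> pullWithSharedList() {
        Attempt attempt1 = new Attempt();
        Attempt attempt2 = new Attempt();
        attempt2.transferStateFrom(attempt1);      // step 1: AM restarts
        attempt1.containerFinished("container_1"); // step 3: completion goes to attempt 1
        return attempt2.pullJustFinishedContainers();
    }

    /** After an RM change: each attempt has its own list, nothing is shared. */
    static List<String> pullAfterRmChange() {
        Attempt attempt1 = new Attempt();          // step 2: attempts rebuilt separately
        Attempt attempt2 = new Attempt();
        attempt1.containerFinished("container_1"); // step 3: completion goes to attempt 1
        return attempt2.pullJustFinishedContainers();
    }

    public static void main(String[] args) {
        System.out.println("shared list:    " + pullWithSharedList());
        System.out.println("separate lists: " + pullAfterRmChange());
    }
}
```

In the shared case the pull on the current attempt sees the completion recorded on the first attempt. In the separate case it sees nothing, which corresponds to the lost message described in the issue.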