[ https://issues.apache.org/jira/browse/YARN-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834326#comment-15834326 ]
Naganarasimha G R commented on YARN-6114:
-----------------------------------------

Alternatively, we can just remove containers that are allocated (obtained from the scheduler) but also present in the just-finished containers list (RMAppAttemptImpl) before sending to the AM in the AM heartbeat response.

> RM will report unacquired containers to AM if they are KILLED before being ACQUIRED
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-6114
>                 URL: https://issues.apache.org/jira/browse/YARN-6114
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>
> RM will report unacquired containers to AM if they are KILLED (due to the node
> becoming unhealthy or reconnecting, etc.) before being ACQUIRED.
> This is because we report all containers to RMAppAttempt, whether or not they
> have been acquired, and they are then added to the list of just-finished
> containers.
> These containers are reported on the next AM heartbeat even though they have
> not been acquired by the AM, which is unnecessary.
> In the case of the Spark AM, this leads to the AM re-requesting these
> containers from the RM.
> To fix this, we can add a flag to the event that originates from the scheduler
> to indicate that the container was not acquired. RMAppAttemptImpl can then
> check this flag and skip adding the container to the list of just-finished
> containers.
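
As a rough illustration of the flag-based fix proposed in the issue description, the Java sketch below models the idea in isolation. It is not the actual RM code: ContainerFinishedEvent, AppAttempt, acquiredByAM and pullJustFinishedContainers are hypothetical stand-ins for the scheduler event and the RMAppAttemptImpl bookkeeping, and container IDs are plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;

class JustFinishedSketch {

    /** Hypothetical stand-in for the "container finished" event sent by the scheduler. */
    static final class ContainerFinishedEvent {
        final String containerId;
        /** The proposed flag: false if the container was killed before the AM acquired it. */
        final boolean acquiredByAM;

        ContainerFinishedEvent(String containerId, boolean acquiredByAM) {
            this.containerId = containerId;
            this.acquiredByAM = acquiredByAM;
        }
    }

    /** Hypothetical stand-in for the just-finished bookkeeping in RMAppAttemptImpl. */
    static final class AppAttempt {
        private final List<String> justFinishedContainers = new ArrayList<>();

        /** Records a finished container only if the AM had actually acquired it. */
        void handle(ContainerFinishedEvent event) {
            if (!event.acquiredByAM) {
                // The AM never saw this container, so there is nothing to report.
                return;
            }
            justFinishedContainers.add(event.containerId);
        }

        /** Returns what would be reported on the next AM heartbeat and clears the list. */
        List<String> pullJustFinishedContainers() {
            List<String> report = new ArrayList<>(justFinishedContainers);
            justFinishedContainers.clear();
            return report;
        }
    }

    public static void main(String[] args) {
        AppAttempt attempt = new AppAttempt();
        attempt.handle(new ContainerFinishedEvent("container_1", true));
        attempt.handle(new ContainerFinishedEvent("container_2", false)); // killed before ACQUIRED
        System.out.println(attempt.pullJustFinishedContainers());
    }
}
```

In this sketch only container_1 would reach the AM. The alternative from the comment (filtering in the heartbeat response path instead of at event time) would move the check elsewhere but aims at the same result.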