[ https://issues.apache.org/jira/browse/YARN-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105191#comment-15105191 ]
Rohith Sharma K S commented on YARN-4502:
-----------------------------------------

branch-2/branch-2.8 was broken by YARN-4265; I uploaded an addendum patch and committed it. My guess is that in your case the compilation was done on trunk first, which published the 3.0.0-SNAPSHOT jars to the local repository, and that the later compilation on branch-2/branch-2.8 then fetched those jars from the local repository.

> Fix two AM containers get allocated when AM restart
> ---------------------------------------------------
>
> Key: YARN-4502
> URL: https://issues.apache.org/jira/browse/YARN-4502
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Yesha Vora
> Assignee: Vinod Kumar Vavilapalli
> Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-4502-20160114.txt, YARN-4502-20160212.txt
>
>
> Scenario :
> * set yarn.resourcemanager.am.max-attempts = 2
> * start dshell application
> {code}
> yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar
> hadoop-yarn-applications-distributedshell-*.jar
> -attempt_failures_validity_interval 60000 -shell_command "sleep 150"
> -num_containers 16
> {code}
> * Kill AM pid
> * Print container list for 2nd attempt
> {code}
> yarn container -list appattempt_1450825622869_0001_000002
> INFO impl.TimelineClientImpl: Timeline service address:
> http://xxx:port/ws/v1/timeline/
> INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10:<port>
> Total number of containers :2
> Container-Id Start Time Finish Time
> State Host Node Http Address
> LOG-URL
> container_e12_1450825622869_0001_02_000002 Tue Dec 22 23:07:35 +0000 2015
> N/A RUNNING xxx:25454 http://xxx:8042
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_000002/hrt_qa
> container_e12_1450825622869_0001_02_000001 Tue Dec 22 23:07:34 +0000 2015
> N/A RUNNING xxx:25454 http://xxx:8042
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_000001/hrt_qa
> {code}
> * look for new AM pid
> Here, the 2nd AM container was supposed to be started on
> container_e12_1450825622869_0001_02_000001. But the AM was not launched on
> container_e12_1450825622869_0001_02_000001. It was in ACQUIRED state.
> On the other hand, container_e12_1450825622869_0001_02_000002 got the AM running.
> Expected behavior: RM should not start 2 containers for starting AM

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
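
The container listing quoted in the scenario can be checked mechanically. The following is a minimal Python sketch and not part of the JIRA issue or of YARN itself. The names `containers_by_attempt` and `SAMPLE` are hypothetical, and it assumes container IDs follow the layout `container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<sequence>`, as the IDs in the report do. It only groups the unique container IDs found in `yarn container -list` output by attempt number, so it is easier to see which containers belong to the restarted attempt.

```python
import re

# Assumed ID layout:
# container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<sequence>
CONTAINER_ID = re.compile(r"container_(?:e\d+_)?(\d+)_(\d+)_(\d+)_(\d+)")


def containers_by_attempt(listing: str) -> dict:
    """Group unique container IDs in `yarn container -list` output by attempt number."""
    grouped = {}
    for match in CONTAINER_ID.finditer(listing):
        attempt = int(match.group(3))  # third numeric field is the attempt id
        # A set removes the duplicates that come from the LOG-URL column.
        grouped.setdefault(attempt, set()).add(match.group(0))
    return grouped


# Container rows copied from the listing in the report (hypothetical variable name).
SAMPLE = """\
container_e12_1450825622869_0001_02_000002 Tue Dec 22 23:07:35 +0000 2015
N/A RUNNING xxx:25454 http://xxx:8042
http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_000002/hrt_qa
container_e12_1450825622869_0001_02_000001 Tue Dec 22 23:07:34 +0000 2015
N/A RUNNING xxx:25454 http://xxx:8042
http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_000001/hrt_qa
"""

if __name__ == "__main__":
    for attempt, ids in sorted(containers_by_attempt(SAMPLE).items()):
        print(f"attempt {attempt}: {sorted(ids)}")
```

In the report, the container with sequence number `000001` of attempt 2 is the one the AM was supposed to start on, so a listing in which a different container of that attempt hosts the AM matches the behaviour described. The sketch cannot tell which container actually runs the AM; that still has to be confirmed by looking for the new AM pid.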