Yesha Vora created YARN-4502:
--------------------------------

             Summary: Sometimes Two AM containers get launched
                 Key: YARN-4502
                 URL: https://issues.apache.org/jira/browse/YARN-4502
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Yesha Vora
            Priority: Critical

Scenario:
* Set yarn.resourcemanager.am.max-attempts = 2
* Start a dshell application
{code}
yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar hadoop-yarn-applications-distributedshell-*.jar -attempt_failures_validity_interval 60000 -shell_command "sleep 150" -num_containers 16
{code}
* Kill the AM pid
* Print the container list for the 2nd attempt
{code}
yarn container -list appattempt_1450825622869_0001_000002
INFO impl.TimelineClientImpl: Timeline service address: http://xxx:port/ws/v1/timeline/
INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10:<port>
Total number of containers :2
Container-Id    Start Time    Finish Time    State    Host    Node Http Address    LOG-URL
container_e12_1450825622869_0001_02_000002    Tue Dec 22 23:07:35 +0000 2015    N/A    RUNNING    xxx:25454    http://xxx:8042    http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_000002/hrt_qa
container_e12_1450825622869_0001_02_000001    Tue Dec 22 23:07:34 +0000 2015    N/A    RUNNING    xxx:25454    http://xxx:8042    http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_000001/hrt_qa
{code}
* Look for the new AM pid

Here, the 2nd AM container was supposed to be started on container_e12_1450825622869_0001_02_000001, but the AM was not launched on that container. It was in ACQUIRED state. Instead, container_e12_1450825622869_0001_02_000002 got the AM running.

Expected behavior: the RM should not start 2 containers to launch the AM.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)