[jira] [Updated] (YARN-4502) Fix two AM containers get allocated when AM restart

2016-02-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4502:
--
Target Version/s: 2.7.3, 2.6.5  (was: 2.6.5)

> Fix two AM containers get allocated when AM restart
> ---
>
> Key: YARN-4502
> URL: https://issues.apache.org/jira/browse/YARN-4502
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-4502-20160114.txt, YARN-4502-20160212.txt
>
>
> Scenario : 
> * set yarn.resourcemanager.am.max-attempts = 2
> * start dshell application
> {code}
>  yarn  org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> hadoop-yarn-applications-distributedshell-*.jar 
> -attempt_failures_validity_interval 6 -shell_command "sleep 150" 
> -num_containers 16
> {code}
> * Kill AM pid
> * Print container list for 2nd attempt
> {code}
> yarn container -list appattempt_1450825622869_0001_02
> INFO impl.TimelineClientImpl: Timeline service address: 
> http://xxx:port/ws/v1/timeline/
> INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10:
> Total number of containers :2
> Container-Id Start Time Finish Time   
> StateHost   Node Http Address 
>LOG-URL
> container_e12_1450825622869_0001_02_02 Tue Dec 22 23:07:35 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_02/hrt_qa
> container_e12_1450825622869_0001_02_01 Tue Dec 22 23:07:34 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_01/hrt_qa
> {code}
> * look for new AM pid 
> Here, 2nd AM container was suppose to be started on  
> container_e12_1450825622869_0001_02_01. But AM was not launched on 
> container_e12_1450825622869_0001_02_01. It was in AQUIRED state. 
> On other hand, container_e12_1450825622869_0001_02_02 got the AM running. 
> Expected behavior: RM should not start 2 containers for starting AM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4502) Fix two AM containers get allocated when AM restart

2016-02-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4502:
-
Target Version/s: 2.6.5  (was: 2.6.4)

> Fix two AM containers get allocated when AM restart
> ---
>
> Key: YARN-4502
> URL: https://issues.apache.org/jira/browse/YARN-4502
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-4502-20160114.txt, YARN-4502-20160212.txt
>
>
> Scenario : 
> * set yarn.resourcemanager.am.max-attempts = 2
> * start dshell application
> {code}
>  yarn  org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> hadoop-yarn-applications-distributedshell-*.jar 
> -attempt_failures_validity_interval 6 -shell_command "sleep 150" 
> -num_containers 16
> {code}
> * Kill AM pid
> * Print container list for 2nd attempt
> {code}
> yarn container -list appattempt_1450825622869_0001_02
> INFO impl.TimelineClientImpl: Timeline service address: 
> http://xxx:port/ws/v1/timeline/
> INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10:
> Total number of containers :2
> Container-Id Start Time Finish Time   
> StateHost   Node Http Address 
>LOG-URL
> container_e12_1450825622869_0001_02_02 Tue Dec 22 23:07:35 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_02/hrt_qa
> container_e12_1450825622869_0001_02_01 Tue Dec 22 23:07:34 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_01/hrt_qa
> {code}
> * look for new AM pid 
> Here, 2nd AM container was suppose to be started on  
> container_e12_1450825622869_0001_02_01. But AM was not launched on 
> container_e12_1450825622869_0001_02_01. It was in AQUIRED state. 
> On other hand, container_e12_1450825622869_0001_02_02 got the AM running. 
> Expected behavior: RM should not start 2 containers for starting AM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4502) Fix two AM containers get allocated when AM restart

2016-02-02 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-4502:

Summary: Fix two AM containers get allocated when AM restart  (was: 
gjfbndbfcjenrgccriejuvcnktllcc)

> Fix two AM containers get allocated when AM restart
> ---
>
> Key: YARN-4502
> URL: https://issues.apache.org/jira/browse/YARN-4502
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-4502-20160114.txt, YARN-4502-20160212.txt
>
>
> Scenario : 
> * set yarn.resourcemanager.am.max-attempts = 2
> * start dshell application
> {code}
>  yarn  org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> hadoop-yarn-applications-distributedshell-*.jar 
> -attempt_failures_validity_interval 6 -shell_command "sleep 150" 
> -num_containers 16
> {code}
> * Kill AM pid
> * Print container list for 2nd attempt
> {code}
> yarn container -list appattempt_1450825622869_0001_02
> INFO impl.TimelineClientImpl: Timeline service address: 
> http://xxx:port/ws/v1/timeline/
> INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10:
> Total number of containers :2
> Container-Id Start Time Finish Time   
> StateHost   Node Http Address 
>LOG-URL
> container_e12_1450825622869_0001_02_02 Tue Dec 22 23:07:35 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_02/hrt_qa
> container_e12_1450825622869_0001_02_01 Tue Dec 22 23:07:34 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_01/hrt_qa
> {code}
> * look for new AM pid 
> Here, 2nd AM container was suppose to be started on  
> container_e12_1450825622869_0001_02_01. But AM was not launched on 
> container_e12_1450825622869_0001_02_01. It was in AQUIRED state. 
> On other hand, container_e12_1450825622869_0001_02_02 got the AM running. 
> Expected behavior: RM should not start 2 containers for starting AM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4502) Fix two AM containers get allocated when AM restart

2016-01-18 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4502:
-
Fix Version/s: (was: 2.9.0)
   2.8.0

> Fix two AM containers get allocated when AM restart
> ---
>
> Key: YARN-4502
> URL: https://issues.apache.org/jira/browse/YARN-4502
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-4502-20160114.txt, YARN-4502-20160212.txt
>
>
> Scenario : 
> * set yarn.resourcemanager.am.max-attempts = 2
> * start dshell application
> {code}
>  yarn  org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> hadoop-yarn-applications-distributedshell-*.jar 
> -attempt_failures_validity_interval 6 -shell_command "sleep 150" 
> -num_containers 16
> {code}
> * Kill AM pid
> * Print container list for 2nd attempt
> {code}
> yarn container -list appattempt_1450825622869_0001_02
> INFO impl.TimelineClientImpl: Timeline service address: 
> http://xxx:port/ws/v1/timeline/
> INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10:
> Total number of containers :2
> Container-Id Start Time Finish Time   
> StateHost   Node Http Address 
>LOG-URL
> container_e12_1450825622869_0001_02_02 Tue Dec 22 23:07:35 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_02/hrt_qa
> container_e12_1450825622869_0001_02_01 Tue Dec 22 23:07:34 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_01/hrt_qa
> {code}
> * look for new AM pid 
> Here, 2nd AM container was suppose to be started on  
> container_e12_1450825622869_0001_02_01. But AM was not launched on 
> container_e12_1450825622869_0001_02_01. It was in AQUIRED state. 
> On other hand, container_e12_1450825622869_0001_02_02 got the AM running. 
> Expected behavior: RM should not start 2 containers for starting AM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4502) Fix two AM containers get allocated when AM restart

2016-01-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4502:
-
Summary: Fix two AM containers get allocated when AM restart  (was: 
Sometimes Two AM containers get launched)

> Fix two AM containers get allocated when AM restart
> ---
>
> Key: YARN-4502
> URL: https://issues.apache.org/jira/browse/YARN-4502
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yesha Vora
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
> Attachments: YARN-4502-20160114.txt, YARN-4502-20160212.txt
>
>
> Scenario : 
> * set yarn.resourcemanager.am.max-attempts = 2
> * start dshell application
> {code}
>  yarn  org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> hadoop-yarn-applications-distributedshell-*.jar 
> -attempt_failures_validity_interval 6 -shell_command "sleep 150" 
> -num_containers 16
> {code}
> * Kill AM pid
> * Print container list for 2nd attempt
> {code}
> yarn container -list appattempt_1450825622869_0001_02
> INFO impl.TimelineClientImpl: Timeline service address: 
> http://xxx:port/ws/v1/timeline/
> INFO client.RMProxy: Connecting to ResourceManager at xxx/10.10.10.10:
> Total number of containers :2
> Container-Id Start Time Finish Time   
> StateHost   Node Http Address 
>LOG-URL
> container_e12_1450825622869_0001_02_02 Tue Dec 22 23:07:35 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_02/hrt_qa
> container_e12_1450825622869_0001_02_01 Tue Dec 22 23:07:34 + 2015 
>   N/A RUNNINGxxx:25454   http://xxx:8042 
> http://xxx:8042/node/containerlogs/container_e12_1450825622869_0001_02_01/hrt_qa
> {code}
> * look for new AM pid 
> Here, 2nd AM container was suppose to be started on  
> container_e12_1450825622869_0001_02_01. But AM was not launched on 
> container_e12_1450825622869_0001_02_01. It was in AQUIRED state. 
> On other hand, container_e12_1450825622869_0001_02_02 got the AM running. 
> Expected behavior: RM should not start 2 containers for starting AM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)