[jira] [Updated] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-06-01 Thread Suma Shivaprasad (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated YARN-8372:
---
Attachment: YARN-8372.3.patch

> ApplicationAttemptNotFoundException should be handled correctly by 
> Distributed Shell App Master
> ---
>
> Key: YARN-8372
> URL: https://issues.apache.org/jira/browse/YARN-8372
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Reporter: Charan Hebri
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-8372.1.patch, YARN-8372.2.patch, YARN-8372.3.patch
>
>
> {noformat}
> try {
>   response = client.allocate(progress);
> } catch (ApplicationAttemptNotFoundException e) {
>   handler.onShutdownRequest();
>   LOG.info("Shutdown requested. Stopping callback.");
>   return;{noformat}
> is a code snippet from AMRMClientAsyncImpl. The corresponding 
> onShutdownRequest implementation in the Distributed Shell App Master is:
> {noformat}
> @Override
> public void onShutdownRequest() {
>   done = true;
> }{noformat}
> As a result, whenever an application attempt fails due to an NM restart 
> (the NM where the DS AM is running), an ApplicationAttemptNotFoundException 
> is thrown, and all containers for that attempt, including the ones running 
> on other NMs, are killed by the AM and marked as COMPLETE. The subsequent 
> attempt then spawns new containers as if it were the first attempt. This 
> behavior differs from that of a MapReduce application, where the containers 
> are not killed.
> cc [~rohithsharma]
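
The flow described above can be illustrated with a small, self-contained sketch. It does not use the Hadoop classes and is not the content of any attached patch: AttemptNotFoundException, CallbackHandler, SketchAppMaster, finishIfDone and the keepContainers flag are all hypothetical stand-ins, introduced only to show how an unconditional done = true leads to the containers being stopped, and how a guard in onShutdownRequest might avoid that.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownRequestSketch {

  /** Hypothetical stand-in for ApplicationAttemptNotFoundException. */
  static class AttemptNotFoundException extends RuntimeException {}

  /** Hypothetical stand-in for the AM callback; only the shutdown hook is modeled. */
  interface CallbackHandler {
    void onShutdownRequest();
  }

  /** Hypothetical AM that tracks the containers it launched and a done flag. */
  static class SketchAppMaster implements CallbackHandler {
    // Assumption for illustration only: an option saying containers should outlive the attempt.
    final boolean keepContainers;
    final List<String> runningContainers = new ArrayList<>();
    volatile boolean done = false;

    SketchAppMaster(boolean keepContainers) {
      this.keepContainers = keepContainers;
    }

    @Override
    public void onShutdownRequest() {
      if (keepContainers) {
        // Guarded variant: ignore the request so containers survive for the next attempt.
        return;
      }
      // Behavior quoted in the issue: always mark the AM as done.
      done = true;
    }

    /** Models the AM finish step: once done, every launched container is stopped. */
    void finishIfDone() {
      if (done) {
        runningContainers.clear();
      }
    }
  }

  /** Mirrors the shape of the quoted heartbeat snippet: a failed allocate triggers the hook. */
  static void heartbeat(CallbackHandler handler, boolean attemptKnown) {
    try {
      if (!attemptKnown) {
        throw new AttemptNotFoundException();
      }
    } catch (AttemptNotFoundException e) {
      handler.onShutdownRequest();
      return;
    }
  }

  public static void main(String[] args) {
    for (boolean keep : new boolean[] {false, true}) {
      SketchAppMaster am = new SketchAppMaster(keep);
      am.runningContainers.add("container_on_other_nm");
      heartbeat(am, false);
      am.finishIfDone();
      System.out.println("keepContainers=" + keep
          + " -> containers still running: " + am.runningContainers.size());
    }
  }
}
```

With the flag off, the sketch reproduces the reported symptom: a container on an unrelated node is stopped after the heartbeat fails. With the flag on, the container is left alone. Whether a real fix should key off such a flag is a design question for the issue, not something this sketch settles.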



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-05-31 Thread Suma Shivaprasad (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated YARN-8372:
---
Attachment: YARN-8372.2.patch

> ApplicationAttemptNotFoundException should be handled correctly by 
> Distributed Shell App Master
> ---
>
> Key: YARN-8372
> URL: https://issues.apache.org/jira/browse/YARN-8372
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Reporter: Charan Hebri
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-8372.1.patch, YARN-8372.2.patch
>
>






[jira] [Updated] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-05-30 Thread Suma Shivaprasad (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated YARN-8372:
---
Attachment: YARN-8372.1.patch

> ApplicationAttemptNotFoundException should be handled correctly by 
> Distributed Shell App Master
> ---
>
> Key: YARN-8372
> URL: https://issues.apache.org/jira/browse/YARN-8372
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Reporter: Charan Hebri
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-8372.1.patch
>
>






[jira] [Updated] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-05-28 Thread Charan Hebri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charan Hebri updated YARN-8372:
---
Description: 
{noformat}
try {
  response = client.allocate(progress);
} catch (ApplicationAttemptNotFoundException e) {
  handler.onShutdownRequest();
  LOG.info("Shutdown requested. Stopping callback.");
  return;{noformat}
is a code snippet from AMRMClientAsyncImpl. The corresponding onShutdownRequest 
implementation in the Distributed Shell App Master is:
{noformat}
@Override
public void onShutdownRequest() {
  done = true;
}{noformat}
As a result, whenever an application attempt fails due to an NM restart (the 
NM where the DS AM is running), an ApplicationAttemptNotFoundException is 
thrown, and all containers for that attempt, including the ones running on 
other NMs, are killed by the AM and marked as COMPLETE. The subsequent attempt 
then spawns new containers as if it were the first attempt. This behavior 
differs from that of a MapReduce application, where the containers are not 
killed.
cc [~rohithsharma]

  was:
{noformat}
try {
  response = client.allocate(progress);
} catch (ApplicationAttemptNotFoundException e) {
  handler.onShutdownRequest();
  LOG.info("Shutdown requested. Stopping callback.");
  return;{noformat}
is a code snippet from AMRMClientAsyncImpl. The corresponding onShutdownRequest 
call for the Distributed Shell App master is,
{noformat}
@Override
public void onShutdownRequest() {
  done = true;
}{noformat}
Due to this, the current behavior is that whenever an application attempt fails 
due to a NM restart (where the DS AM is running), an 


> ApplicationAttemptNotFoundException should be handled correctly by 
> Distributed Shell App Master
> ---
>
> Key: YARN-8372
> URL: https://issues.apache.org/jira/browse/YARN-8372
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Reporter: Charan Hebri
>Priority: Major
>






[jira] [Updated] (YARN-8372) ApplicationAttemptNotFoundException should be handled correctly by Distributed Shell App Master

2018-05-28 Thread Charan Hebri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charan Hebri updated YARN-8372:
---
Description: 
{noformat}
try {
  response = client.allocate(progress);
} catch (ApplicationAttemptNotFoundException e) {
  handler.onShutdownRequest();
  LOG.info("Shutdown requested. Stopping callback.");
  return;{noformat}
is a code snippet from AMRMClientAsyncImpl. The corresponding onShutdownRequest 
call for the Distributed Shell App master is,
{noformat}
@Override
public void onShutdownRequest() {
  done = true;
}{noformat}
Due to this, the current behavior is that whenever an application attempt fails 
due to a NM restart (where the DS AM is running), an 

  was:
{noformat}
try {
  response = client.allocate(progress);
} catch (ApplicationAttemptNotFoundException e) {
  handler.onShutdownRequest();
  LOG.info("Shutdown requested. Stopping callback.");
  return;{noformat}


> ApplicationAttemptNotFoundException should be handled correctly by 
> Distributed Shell App Master
> ---
>
> Key: YARN-8372
> URL: https://issues.apache.org/jira/browse/YARN-8372
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Reporter: Charan Hebri
>Priority: Major
>


