[jira] [Created] (YARN-10825) Yarn Service containers not getting killed after NM shutdown

2021-06-17 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10825:
---

 Summary: Yarn Service containers not getting killed after NM 
shutdown
 Key: YARN-10825
 URL: https://issues.apache.org/jira/browse/YARN-10825
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.1.1
Reporter: Sushanta Sen


When yarn.nodemanager.recovery.supervised is enabled and the NM is shut down, new containers are launched after the RM sends the node-lost event to the AM, but the existing containers on the lost node are not killed. The issue occurs only for YARN Service applications; normal jobs behave correctly.
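
For reference, a minimal yarn-site.xml sketch of the NM recovery settings involved (values are illustrative; it is assumed here that supervised recovery is used together with yarn.nodemanager.recovery.enabled, which the report does not state explicitly):
{noformat}
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.recovery.supervised</name>
  <value>true</value>
</property>
{noformat}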






[jira] [Updated] (YARN-10684) YARN: Opportunistic Container :: Distributed YARN Job has Failed when tried adding flag -promote_opportunistic_after_start

2021-03-22 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10684:

Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM's yarn-site.xml:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs' yarn-site.xml:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

 Test Steps:

Job Command: yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC -*promote_opportunistic_after_start*

Actual Result: Distributed Shell Yarn Job Failed almost all times with below 
Diagnostics message

*[ Failed Reason : Application Failure: desired = 10, completed = 10, allocated 
= 10, failed = 2, diagnostics = [2021-02-10 00:00:27.640]Container Killed to 
make room for Guaranteed Container.]*

Expected Result: The DS job should succeed with the argument "promote_opportunistic_after_start".

  was:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM yarn-site.xml ::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]yarn-site.xml ::: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : :

Job Command :: yarn org.apache.hadoop.yarn.applications.distributedshell.Client 
-jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 10 -container_type 
OPPORTUNISTIC -*promote_opportunistic_after_start*

Actual Result: Distributed Shell Yarn Job Failed almost all times with below 
Diagnostics message

*[ Failed Reason : Application Failure: desired = 10, completed = 10, allocated 
= 10, failed = 2, diagnostics = [2021-02-10 00:00:27.640]Container Killed to 
make room for Guaranteed Container.]*

Expected Result: DS job should be successful with argument 
"promote_opportunistic_after_start" * ** *


> YARN: Opportunistic Container :: Distributed YARN Job has Failed when tried 
> adding flag -promote_opportunistic_after_start 
> ---
>
> Key: YARN-10684
> URL: https://issues.apache.org/jira/browse/YARN-10684
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-scheduling
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
>  # Set the below parameters  in RM yarn-site.xml ::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]yarn-site.xml ::: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : :
> Job Command :: Job Command : : yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC -*promote_opportunistic_after_start*
> Actual Result: Distributed Shell Yarn Job Failed almost all times with below 
> Diagnostics message
> *[ Failed Reason : Application Failure: desired = 10, completed = 10, 
> allocated = 10, failed = 2, diagnostics = [2021-02-10 00:00:27.640]Container 
> Killed to make room for Guaranteed Container.]*
> Expected Result: DS job should be successful with argument 
> "promote_opportunistic_after_start"  **  ** 






[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for

2021-03-22 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10670:

Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM's yarn-site.xml:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs' yarn-site.xml:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

 Test Steps:

Job Command: yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar -shell_command sleep -shell_args 20 -num_containers 20 -container_type OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.

  was:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM yarn-site.xml ::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]yarn-site.xml ::: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-*.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.


> YARN: Opportunistic Container : : In distributed shell job if containers are 
> killed then application is failed. But in this case as containers are killed 
> to make room for guaranteed containers which is not correct to fail an 
> application
> 
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
>  # Set the below parameters  in RM yarn-site.xml ::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]yarn-site.xml ::: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : : yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC
> Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics 
> message
> {noformat}
> Attempt recovered after RM restartApplication Failure: desired = 20, 
> completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
> 22:11:48.440]Container De-queued to meet NM queuing limits.
> [2021-02-09 22:11:48.441]Container terminated before launch.
> {noformat}
>  Expected Result: Distributed Shell Yarn Job should not fail.






[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for

2021-03-22 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10670:

Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM's yarn-site.xml:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs' yarn-site.xml:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

 Test Steps:

Job Command: yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-*.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.

  was:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM yarn-site.xml ::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]yarn-site.xml ::: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.


> YARN: Opportunistic Container : : In distributed shell job if containers are 
> killed then application is failed. But in this case as containers are killed 
> to make room for guaranteed containers which is not correct to fail an 
> application
> 
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
>  # Set the below parameters  in RM yarn-site.xml ::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]yarn-site.xml ::: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : : yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-*.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC
> Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics 
> message
> {noformat}
> Attempt recovered after RM restartApplication Failure: desired = 20, 
> completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
> 22:11:48.440]Container De-queued to meet NM queuing limits.
> [2021-02-09 22:11:48.441]Container terminated before launch.
> {noformat}
>  Expected Result: Distributed Shell Yarn Job should not fail.






[jira] [Updated] (YARN-10684) YARN: Opportunistic Container :: Distributed YARN Job has Failed when tried adding flag -promote_opportunistic_after_start

2021-03-09 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10684:

Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM's yarn-site.xml:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs' yarn-site.xml:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

 Test Steps:

Job Command: yarn org.apache.hadoop.yarn.applications.distributedshell.Client 
-jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 10 -container_type 
OPPORTUNISTIC -*promote_opportunistic_after_start*

Actual Result: Distributed Shell Yarn Job Failed almost all times with below 
Diagnostics message

*[ Failed Reason : Application Failure: desired = 10, completed = 10, allocated 
= 10, failed = 2, diagnostics = [2021-02-10 00:00:27.640]Container Killed to 
make room for Guaranteed Container.]*

Expected Result: The DS job should succeed with the argument "promote_opportunistic_after_start".

  was:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM yarn-site.xml ::
yarn.resourcemanager.opportunistic-container-allocation.enabled
true

 # Set this in NM[s]yarn-site.xml ::: 
yarn.nodemanager.opportunistic-containers-max-queue-length
30


 
Test Steps:

Job Command : :

Job Command :: yarn org.apache.hadoop.yarn.applications.distributedshell.Client 
-jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 10 -container_type 
OPPORTUNISTIC -*promote_opportunistic_after_start*

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message

*[ Failed Reason : Application Failure: desired = 10, completed = 10, allocated 
= 10, failed = 2, diagnostics = [2021-02-10 00:00:27.640]Container Killed to 
make room for Guaranteed Container.]*

Expected Result: DS job should be successful with argument 
"promote_opportunistic_after_start" ** **


> YARN: Opportunistic Container :: Distributed YARN Job has Failed when tried 
> adding flag -promote_opportunistic_after_start 
> ---
>
> Key: YARN-10684
> URL: https://issues.apache.org/jira/browse/YARN-10684
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-scheduling
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
>  # Set the below parameters  in RM yarn-site.xml ::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]yarn-site.xml ::: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : :
> Job Command :: yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  -shell_command sleep -shell_args 20 -num_containers 10 -container_type 
> OPPORTUNISTIC -*promote_opportunistic_after_start*
> Actual Result: Distributed Shell Yarn Job Failed almost all times with below 
> Diagnostics message
> *[ Failed Reason : Application Failure: desired = 10, completed = 10, 
> allocated = 10, failed = 2, diagnostics = [2021-02-10 00:00:27.640]Container 
> Killed to make room for Guaranteed Container.]*
> Expected Result: DS job should be successful with argument 
> "promote_opportunistic_after_start" * ** *






[jira] [Created] (YARN-10684) YARN: Opportunistic Container :: Distributed YARN Job has Failed when tried adding flag -promote_opportunistic_after_start

2021-03-09 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10684:
---

 Summary: YARN: Opportunistic Container :: Distributed YARN Job has 
Failed when tried adding flag -promote_opportunistic_after_start 
 Key: YARN-10684
 URL: https://issues.apache.org/jira/browse/YARN-10684
 Project: Hadoop YARN
  Issue Type: Bug
  Components: distributed-scheduling
Affects Versions: 3.1.1
Reporter: Sushanta Sen


Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM's yarn-site.xml:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs' yarn-site.xml:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

Test Steps:

Job Command: yarn org.apache.hadoop.yarn.applications.distributedshell.Client 
-jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 10 -container_type 
OPPORTUNISTIC -*promote_opportunistic_after_start*

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message

*[ Failed Reason : Application Failure: desired = 10, completed = 10, allocated 
= 10, failed = 2, diagnostics = [2021-02-10 00:00:27.640]Container Killed to 
make room for Guaranteed Container.]*

Expected Result: The DS job should succeed with the argument "promote_opportunistic_after_start".






[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for

2021-03-04 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10670:

Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM's yarn-site.xml:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs' yarn-site.xml:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

 Test Steps:

Job Command: yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.

  was:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.


> YARN: Opportunistic Container : : In distributed shell job if containers are 
> killed then application is failed. But in this case as containers are killed 
> to make room for guaranteed containers which is not correct to fail an 
> application
> 
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
>  # Set the below parameters  in RM yarn-site.xml ::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]yarn-site.xml ::: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : : yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC
> Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics 
> message
> {noformat}
> Attempt recovered after RM restartApplication Failure: desired = 20, 
> completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
> 22:11:48.440]Container De-queued to meet NM queuing limits.
> [2021-02-09 22:11:48.441]Container terminated before launch.
> {noformat}
>  Expected Result: Distributed Shell Yarn Job should not fail.






[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for

2021-03-04 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10670:

Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

 Test Steps:

Job Command: yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.

  was:
Preconditions:
 # Secure Hadoop 3.1.1 c3 Nodes cluster is installed
 # Set the below parameters  in RM::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.


> YARN: Opportunistic Container : : In distributed shell job if containers are 
> killed then application is failed. But in this case as containers are killed 
> to make room for guaranteed containers which is not correct to fail an 
> application
> 
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
>  # Set the below parameters  in RM::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : : yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC
> Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics 
> message
> {noformat}
> Attempt recovered after RM restartApplication Failure: desired = 20, 
> completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
> 22:11:48.440]Container De-queued to meet NM queuing limits.
> [2021-02-09 22:11:48.441]Container terminated before launch.
> {noformat}
>  Expected Result: Distributed Shell Yarn Job should not fail.






[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for

2021-03-04 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10670:

Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

 Test Steps:

Job Command: yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.

  was:
Preconditions:
 # Secure Hadoop 3.1.1 c3 Nodes cluster is installed
 # Set the below parameters  in RM::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
Test Steps:


Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message

{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}

 


> YARN: Opportunistic Container : : In distributed shell job if containers are 
> killed then application is failed. But in this case as containers are killed 
> to make room for guaranteed containers which is not correct to fail an 
> application
> 
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 c3 Nodes cluster is installed
>  # Set the below parameters  in RM::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : : yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC
> Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics 
> message
> {noformat}
> Attempt recovered after RM restartApplication Failure: desired = 20, 
> completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
> 22:11:48.440]Container De-queued to meet NM queuing limits.
> [2021-02-09 22:11:48.441]Container terminated before launch.
> {noformat}
>  Expected Result: Distributed Shell Yarn Job should not fail.






[jira] [Created] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for

2021-03-04 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10670:
---

 Summary: YARN: Opportunistic Container : : In distributed shell 
job if containers are killed then application is failed. But in this case as 
containers are killed to make room for guaranteed containers which is not 
correct to fail an application
 Key: YARN-10670
 URL: https://issues.apache.org/jira/browse/YARN-10670
 Project: Hadoop YARN
  Issue Type: Bug
  Components: distributed-shell
Affects Versions: 3.1.1
Reporter: Sushanta Sen


Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameter in the RM:
 yarn.resourcemanager.opportunistic-container-allocation.enabled = true
 # Set this in the NMs:
 yarn.nodemanager.opportunistic-containers-max-queue-length = 30

Test Steps:


Job Command: yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message

{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}

 






[jira] [Updated] (YARN-10669) Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN on RM switch and TS restart

2021-03-03 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10669:

Affects Version/s: 3.1.1

> Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN on RM switch and TS 
> restart
> --
>
> Key: YARN-10669
> URL: https://issues.apache.org/jira/browse/YARN-10669
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.1.1
> Environment: 3 Nodes Hadoop Secure cluster with 3.1.1 version
>Reporter: Sushanta Sen
>Priority: Major
>
> Using delegation token rather than the keytab of the user when submitting job 
> to yarn. 
> And this config yarn.timeline-service.enabled = true.
> So addTimelineDelegationToken will be executed. My Job has submitted 
> successfully, but the question is my job failed when I Switched RM and TS 
> restart because TIMELINE_DELEGATION_TOKEN renew failed. 
> Only RM switch and TS restart will reproduce the issue.
> RM log snippet below:
> {noformat}
> 2020-12-02 17:37:21,268 | WARN  | DelegationTokenRenewer #3402 | Unable to 
> add the application to the delegation token renewer. | 
> DelegationTokenRenewer.java:949
> java.io.IOException: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, 
> Service: 192.168.0.2:8190, Ident: (TIMELINE_DELEGATION_TOKEN owner=bnn, 
> renewer=mapred, realUser=executor, issueDate=1606880472758, 
> maxDate=1607485272758, sequenceNumber=11581, masterKeyId=13)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:508)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$1100(DelegationTokenRenewer.java:80)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:945)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:922)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: HTTP status [403], message 
> [org.apache.hadoop.security.token.SecretManager$InvalidToken: Unable to find 
> master key for keyId=13 from cache. Failed to renew an unexpired token 
> (TIMELINE_DELEGATION_TOKEN owner=bnn, renewer=mapred, realUser=executor, 
> issueDate=1606880472758, maxDate=1607485272758, sequenceNumber=11581, 
> masterKeyId=13) with sequenceNumber=11581]
> at 
> org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:174)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:323)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.renewDelegationToken(DelegationTokenAuthenticator.java:239)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.renewDelegationToken(DelegationTokenAuthenticatedURL.java:426)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:247)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:227)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientRetryOpForOperateDelegationToken.run(TimelineConnector.java:431)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:334)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineConnector.operateDelegationToken(TimelineConnector.java:218)
> at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.renewDelegationToken(TimelineClientImpl.java:250)
> at 
> org.apache.hadoop.yarn.security.client.TimelineDelegationTokenIdentifier$Renewer.renew(TimelineDelegationTokenIdentifier.java:81)
> at org.apache.hadoop.security.token.Token.renew(Token.java:490)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:634)
> at 
> 

[jira] [Created] (YARN-10669) Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN on RM switch and TS restart

2021-03-03 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10669:
---

 Summary: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN on 
RM switch and TS restart
 Key: YARN-10669
 URL: https://issues.apache.org/jira/browse/YARN-10669
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineservice
 Environment: 3 Nodes Hadoop Secure cluster with 3.1.1 version
Reporter: Sushanta Sen


The job is submitted to YARN using a delegation token rather than the user's keytab, and yarn.timeline-service.enabled = true, so addTimelineDelegationToken is executed. The job is submitted successfully, but it fails after an RM switchover and a TS restart, because renewal of the TIMELINE_DELEGATION_TOKEN fails.
Only the combination of an RM switch and a TS restart reproduces the issue.
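
For context, a minimal yarn-site.xml sketch of the setup described (yarn.timeline-service.enabled is taken from the report; RM HA via yarn.resourcemanager.ha.enabled is an assumption inferred from the mention of an RM switch):
{noformat}
<!-- Timeline Service enabled, so addTimelineDelegationToken is executed -->
<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
<!-- Assumed: RM HA enabled, allowing the RM switchover described above -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
{noformat}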

RM log snippet below:

{noformat}
2020-12-02 17:37:21,268 | WARN  | DelegationTokenRenewer #3402 | Unable to add 
the application to the delegation token renewer. | 
DelegationTokenRenewer.java:949
java.io.IOException: Failed to renew token: Kind: TIMELINE_DELEGATION_TOKEN, 
Service: 192.168.0.2:8190, Ident: (TIMELINE_DELEGATION_TOKEN owner=bnn, 
renewer=mapred, realUser=executor, issueDate=1606880472758, 
maxDate=1607485272758, sequenceNumber=11581, masterKeyId=13)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:508)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$1100(DelegationTokenRenewer.java:80)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:945)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:922)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: HTTP status [403], message 
[org.apache.hadoop.security.token.SecretManager$InvalidToken: Unable to find 
master key for keyId=13 from cache. Failed to renew an unexpired token 
(TIMELINE_DELEGATION_TOKEN owner=bnn, renewer=mapred, realUser=executor, 
issueDate=1606880472758, maxDate=1607485272758, sequenceNumber=11581, 
masterKeyId=13) with sequenceNumber=11581]
at 
org.apache.hadoop.util.HttpExceptionUtils.validateResponse(HttpExceptionUtils.java:174)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:323)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.renewDelegationToken(DelegationTokenAuthenticator.java:239)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.renewDelegationToken(DelegationTokenAuthenticatedURL.java:426)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:247)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:227)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientRetryOpForOperateDelegationToken.run(TimelineConnector.java:431)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:334)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineConnector.operateDelegationToken(TimelineConnector.java:218)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.renewDelegationToken(TimelineClientImpl.java:250)
at 
org.apache.hadoop.yarn.security.client.TimelineDelegationTokenIdentifier$Renewer.renew(TimelineDelegationTokenIdentifier.java:81)
at org.apache.hadoop.security.token.Token.renew(Token.java:490)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:634)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:631)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:630)
at 

[jira] [Updated] (YARN-10666) In ProcfsBasedProcessTree reading smaps file show Permission denied

2021-03-02 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10666:

Description: 
When the job submitter's user is different from the NM's user, the NM fails to read the /proc//smaps file, because the smaps file is owned by the job submitter's user and is not readable by the NM's user.



  was:
When job submitter user is other than NM's user.
 Then NM failed to read /proc//smaps file.
 Because smaps file is owned by job submitter user, which is not able to read 
by NM's user.
{noformat}
*no* further _formatting_ is done here{noformat}
{noformat}
*no* further _formatting_ is done here{noformat}
 


> In ProcfsBasedProcessTree reading smaps file show Permission denied
> ---
>
> Key: YARN-10666
> URL: https://issues.apache.org/jira/browse/YARN-10666
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sushanta Sen
>Priority: Major
>
> When job submitter user is other than NM's user.
>  Then NM failed to read /proc//smaps file.
>  Because smaps file is owned by job submitter user, which is not able to read 
> by NM's user.






[jira] [Updated] (YARN-10666) In ProcfsBasedProcessTree reading smaps file show Permission denied

2021-03-02 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10666:

Description: 
When the job submitter's user is different from the NM's user, the NM fails to read the /proc//smaps file, because the smaps file is owned by the job submitter's user and is not readable by the NM's user.
{noformat}
*no* further _formatting_ is done here{noformat}
{noformat}
*no* further _formatting_ is done here{noformat}
 

  was:
When job submitter user is other than NM's user.
 Then NM failed to read /proc//smaps file.
 Because smaps file is owned by job submitter user, which is not able to read 
by NM's user.
{noformat}
*no* further _formatting_ is done here{noformat}
!image-2021-03-03-12-33-51-034.png!


> In ProcfsBasedProcessTree reading smaps file show Permission denied
> ---
>
> Key: YARN-10666
> URL: https://issues.apache.org/jira/browse/YARN-10666
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sushanta Sen
>Priority: Major
>
> When job submitter user is other than NM's user.
>  Then NM failed to read /proc//smaps file.
>  Because smaps file is owned by job submitter user, which is not able to read 
> by NM's user.
> {noformat}
> *no* further _formatting_ is done here{noformat}
> {noformat}
> *no* further _formatting_ is done here{noformat}
>  






[jira] [Updated] (YARN-10666) In ProcfsBasedProcessTree reading smaps file show Permission denied

2021-03-02 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10666:

Description: 
When the job submitter's user is different from the NM's user, the NM fails to read the /proc//smaps file, because the smaps file is owned by the job submitter's user and is not readable by the NM's user.
{noformat}
*no* further _formatting_ is done here{noformat}
!image-2021-03-03-12-33-51-034.png!

  was:
When job submitter user is other than NM's user.
Then NM failed to read /proc//smaps file.
Because smaps file is owned by job submitter user, which is not able to read by 
NM's user.


> In ProcfsBasedProcessTree reading smaps file show Permission denied
> ---
>
> Key: YARN-10666
> URL: https://issues.apache.org/jira/browse/YARN-10666
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sushanta Sen
>Priority: Major
>
> When job submitter user is other than NM's user.
>  Then NM failed to read /proc//smaps file.
>  Because smaps file is owned by job submitter user, which is not able to read 
> by NM's user.
> {noformat}
> *no* further _formatting_ is done here{noformat}
> !image-2021-03-03-12-33-51-034.png!






[jira] [Created] (YARN-10666) In ProcfsBasedProcessTree reading smaps file show Permission denied

2021-03-02 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10666:
---

 Summary: In ProcfsBasedProcessTree reading smaps file show 
Permission denied
 Key: YARN-10666
 URL: https://issues.apache.org/jira/browse/YARN-10666
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sushanta Sen


When the job submitter's user is different from the NM's user, the NM fails to read the /proc//smaps file, because the smaps file is owned by the job submitter's user and is not readable by the NM's user.






[jira] [Updated] (YARN-10634) The config parameter "mapreduce.job.num-opportunistic-maps-percent" is confusing when requesting Opportunistic containers in YARN job

2021-02-18 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10634:

Description: 
Execute the below job by Passing this config 
-Dmapreduce.job.num-opportunistic-maps-percent ,which actually represents the 
number of containers to be launched as Opportunistic, not in % of the total 
mappers requested , i think this configuration name should be modified 
accordingly and also {color:#de350b}the same gets printed in AM logs which also 
needs to be corrected accordingly.{color}

Job Command: hadoop jar 
HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
 pi -{color:#de350b}Dmapreduce.job.num-opportunistic-maps-percent{color}="20" 
20 99

In the AM logs this message is displayed. It should be {color:#de350b}20, not 20%{color}?
 “2021-02-10 20:23:23,023 | INFO | main | {color:#de350b}20% of the 
mappers{color} will be scheduled using OPPORTUNISTIC containers | 
RMContainerAllocator.java:257”

Job Command: hadoop jar 
HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
 pi {color:#de350b}-Dmapreduce.job.num-opportunistic-maps-percent{color}="100" 
20 99

In the AM logs this message is displayed. It should be {color:#de350b}100, not 100%{color}?

2021-02-10 20:28:16,016 | INFO  | main | {color:#de350b}100% of the 
mapper{color}s will be scheduled using OPPORTUNISTIC containers | 
RMContainerAllocator.java:257

  was:
Execute the below job by Passing this config 
-Dmapreduce.job.num-opportunistic-maps-percent ,which actually represents the 
number of containers to be launched as Opportunistic, not in % of the total 
mappers requested , i think this configuration name should be modified 
accordingly and also {color:#de350b}the same gets printed in AM logs{color}

Job Command: hadoop jar 
HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
 pi -{color:#de350b}Dmapreduce.job.num-opportunistic-maps-percent{color}="20" 
20 99

In AM logs this message is displayed. it should be {color:#de350b}20 , not 20% 
{color}? 
 “2021-02-10 20:23:23,023 | INFO | main | {color:#de350b}20% of the 
mappers{color} will be scheduled using OPPORTUNISTIC containers | 
RMContainerAllocator.java:257”

Job Command: hadoop jar 
HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
 pi {color:#de350b}-Dmapreduce.job.num-opportunistic-maps-percent{color}="100" 
20 99

In AM logs this message is displayed. It should be {color:#de350b}100, not 
100%{color} ?

2021-02-10 20:28:16,016 | INFO  | main | {color:#de350b}100% of the 
mapper{color}s will be scheduled using OPPORTUNISTIC containers | 
RMContainerAllocator.java:257


> The config parameter "mapreduce.job.num-opportunistic-maps-percent" is 
> confusing when requesting Opportunistic containers in YARN job
> -
>
> Key: YARN-10634
> URL: https://issues.apache.org/jira/browse/YARN-10634
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Minor
>
> Execute the below job by Passing this config 
> -Dmapreduce.job.num-opportunistic-maps-percent ,which actually represents the 
> number of containers to be launched as Opportunistic, not in % of the total 
> mappers requested , i think this configuration name should be modified 
> accordingly and also {color:#de350b}the same gets printed in AM logs which 
> also needs to be corrected accordingly.{color}
> Job Command: hadoop jar 
> HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  pi -{color:#de350b}Dmapreduce.job.num-opportunistic-maps-percent{color}="20" 
> 20 99
> In AM logs this message is displayed. it should be {color:#de350b}20 , not 
> 20% {color}? 
>  “2021-02-10 20:23:23,023 | INFO | main | {color:#de350b}20% of the 
> mappers{color} will be scheduled using OPPORTUNISTIC containers | 
> RMContainerAllocator.java:257”
> Job Command: hadoop jar 
> HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  pi 
> {color:#de350b}-Dmapreduce.job.num-opportunistic-maps-percent{color}="100" 20 
> 99
> In AM logs this message is displayed. It should be {color:#de350b}100, not 
> 100%{color} ?
> 2021-02-10 20:28:16,016 | INFO  | main | {color:#de350b}100% of the 
> mapper{color}s will be scheduled using OPPORTUNISTIC containers | 
> RMContainerAllocator.java:257




[jira] [Updated] (YARN-10634) The config parameter "mapreduce.job.num-opportunistic-maps-percent" is confusing when requesting Opportunistic containers in YARN job

2021-02-18 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10634:

Summary: The config parameter 
"mapreduce.job.num-opportunistic-maps-percent" is confusing when requesting 
Opportunistic containers in YARN job  (was: The config parameter name is 
confusing when requesting Opportunistic containers in YARN job)

> The config parameter "mapreduce.job.num-opportunistic-maps-percent" is 
> confusing when requesting Opportunistic containers in YARN job
> -
>
> Key: YARN-10634
> URL: https://issues.apache.org/jira/browse/YARN-10634
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications
>Reporter: Sushanta Sen
>Priority: Minor
>
> Execute the below job by Passing this config 
> -Dmapreduce.job.num-opportunistic-maps-percent ,which actually represents the 
> number of containers to be launched as Opportunistic, not in % of the total 
> mappers requested , i think this configuration name should be modified 
> accordingly and also {color:#de350b}the same gets printed in AM logs{color}
> Job Command: hadoop jar 
> HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  pi -{color:#de350b}Dmapreduce.job.num-opportunistic-maps-percent{color}="20" 
> 20 99
> In AM logs this message is displayed. it should be {color:#de350b}20 , not 
> 20% {color}? 
>  “2021-02-10 20:23:23,023 | INFO | main | {color:#de350b}20% of the 
> mappers{color} will be scheduled using OPPORTUNISTIC containers | 
> RMContainerAllocator.java:257”
> Job Command: hadoop jar 
> HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  pi 
> {color:#de350b}-Dmapreduce.job.num-opportunistic-maps-percent{color}="100" 20 
> 99
> In AM logs this message is displayed. It should be {color:#de350b}100, not 
> 100%{color} ?
> 2021-02-10 20:28:16,016 | INFO  | main | {color:#de350b}100% of the 
> mapper{color}s will be scheduled using OPPORTUNISTIC containers | 
> RMContainerAllocator.java:257






[jira] [Created] (YARN-10634) The config parameter name is confusing when requesting Opportunistic containers in YARN job

2021-02-18 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10634:
---

 Summary: The config parameter name is confusing when requesting 
Opportunistic containers in YARN job
 Key: YARN-10634
 URL: https://issues.apache.org/jira/browse/YARN-10634
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications
Reporter: Sushanta Sen


Execute the below job passing the config -Dmapreduce.job.num-opportunistic-maps-percent, which actually represents the number of containers to be launched as OPPORTUNISTIC, not a percentage of the total mappers requested. I think this configuration name should be modified accordingly, and also {color:#de350b}the same value is printed in the AM logs{color}.

Job Command: hadoop jar 
HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
 pi -{color:#de350b}Dmapreduce.job.num-opportunistic-maps-percent{color}="20" 
20 99

In AM logs this message is displayed. it should be {color:#de350b}20 , not 20% 
{color}? 
 “2021-02-10 20:23:23,023 | INFO | main | {color:#de350b}20% of the 
mappers{color} will be scheduled using OPPORTUNISTIC containers | 
RMContainerAllocator.java:257”

Job Command: hadoop jar 
HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
 pi {color:#de350b}-Dmapreduce.job.num-opportunistic-maps-percent{color}="100" 
20 99

In the AM logs this message is displayed; it should read {color:#de350b}100, not 
100%{color}:

2021-02-10 20:28:16,016 | INFO  | main | {color:#de350b}100% of the 
mapper{color}s will be scheduled using OPPORTUNISTIC containers | 
RMContainerAllocator.java:257
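
To make the ambiguity concrete, the sketch below contrasts the two possible readings 
of the configured value for the first command above (value 20, with 20 mappers 
requested). This is a hypothetical illustration only, not the actual 
RMContainerAllocator logic; the class and variable names are assumptions.

{noformat}
// Hypothetical illustration only -- not the actual Hadoop implementation.
// Shows how the same configured value can be read in two different ways.
public class OpportunisticMapsInterpretation {
  public static void main(String[] args) {
    int totalMaps = 20;       // "pi ... 20 99" requests 20 mappers
    int configuredValue = 20; // value passed for the -percent property

    // Reading the value as a percentage of the total mappers (what the property
    // name and the AM log message suggest):
    int asPercent = (int) Math.round(totalMaps * (configuredValue / 100.0)); // 4 maps

    // Reading the value as an absolute count of opportunistic mappers (what the
    // reporter observes the job actually does):
    int asCount = Math.min(configuredValue, totalMaps); // 20 maps

    System.out.println("percent interpretation: " + asPercent + " opportunistic maps");
    System.out.println("count interpretation:   " + asCount + " opportunistic maps");
  }
}
{noformat}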



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10136) In Secure Federation, Router About UI page failed to display the subclusters information appropriately.

2020-02-13 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10136:
---

 Summary: In Secure Federation, Router About UI page failed to 
display the subclusters information appropriately.
 Key: YARN-10136
 URL: https://issues.apache.org/jira/browse/YARN-10136
 Project: Hadoop YARN
  Issue Type: Bug
  Components: federation, yarn
Reporter: Sushanta Sen


In Secure Federation, the Router About UI page fails to display the subcluster 
information appropriately.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10133) For yarn Federation, there is no routeradmin command to manage

2020-02-12 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10133:
---

 Summary: For yarn Federation, there is no routeradmin command to 
manage
 Key: YARN-10133
 URL: https://issues.apache.org/jira/browse/YARN-10133
 Project: Hadoop YARN
  Issue Type: Bug
  Components: federation, yarn
Reporter: Sushanta Sen


In a YARN federated cluster, there is no routeradmin command for managing the 
Router, similar to the 'dfsrouteradmin' command that HDFS provides.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10132) For Federation,yarn applicationattempt fail command throws an exception

2020-02-12 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10132:
---

 Summary: For Federation,yarn applicationattempt fail command 
throws an exception
 Key: YARN-10132
 URL: https://issues.apache.org/jira/browse/YARN-10132
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sushanta Sen


The yarn applicationattempt -fail command fails with the exception 
“org.apache.commons.lang.NotImplementedException: Code is not implemented”.


{noformat}
 ./yarn applicationattempt -fail appattempt_1581497870689_0001_01
Failing attempt appattempt_1581497870689_0001_01 of application 
application_1581497870689_0001
2020-02-12 20:48:48,530 INFO impl.YarnClientImpl: Failing application attempt 
appattempt_1581497870689_0001_01
Exception in thread "main" org.apache.commons.lang.NotImplementedException: 
Code is not implemented
at 
org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.failApplicationAttempt(FederationClientInterceptor.java:980)
at 
org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.failApplicationAttempt(RouterClientRMService.java:388)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.failApplicationAttempt(ApplicationClientProtocolPBServiceImpl.java:210)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:581)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2793)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.failApplicationAttempt(ApplicationClientProtocolPBClientImpl.java:223)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy8.failApplicationAttempt(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.failApplicationAttempt(YarnClientImpl.java:447)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.failApplicationAttempt(ApplicationCLI.java:985)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:455)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:119)

{noformat}
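
For context, the stack trace above points at FederationClientInterceptor.failApplicationAttempt. 
The sketch below only illustrates the stub pattern that produces the "Code is not 
implemented" error; the exact body in the Hadoop source may differ, and the class 
name here is shortened for illustration.

{noformat}
// Illustrative sketch only; the real FederationClientInterceptor may differ.
import java.io.IOException;
import org.apache.commons.lang.NotImplementedException;
import org.apache.hadoop.yarn.api.protocolrecords.FailApplicationAttemptRequest;
import org.apache.hadoop.yarn.api.protocolrecords.FailApplicationAttemptResponse;
import org.apache.hadoop.yarn.exceptions.YarnException;

public class FederationClientInterceptorSketch {
  public FailApplicationAttemptResponse failApplicationAttempt(
      FailApplicationAttemptRequest request) throws YarnException, IOException {
    // The Federation interceptor has no routing logic for this call yet, so it
    // fails fast; the Router then propagates the exception back to the CLI.
    throw new NotImplementedException("Code is not implemented");
  }
}
{noformat}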




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10131) In Federation,few yarn application commands do not work

2020-02-12 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10131:
---

 Summary: In Federation,few yarn application commands do not work
 Key: YARN-10131
 URL: https://issues.apache.org/jira/browse/YARN-10131
 Project: Hadoop YARN
  Issue Type: Bug
  Components: federation, yarn
Reporter: Sushanta Sen


In Federation, the below-mentioned yarn application commands do not work:
./yarn app -updatePriority 3 -appId 
./yarn app -changeQueue q1 -appId 
./yarn application -updateLifetime 40 -appId 

All of the above fail with the same exception, "Code not implemented".




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10122) In Federation,executing yarn container signal command throws an exception

2020-02-09 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10122:
---

 Summary: In Federation,executing yarn container signal command 
throws an exception
 Key: YARN-10122
 URL: https://issues.apache.org/jira/browse/YARN-10122
 Project: Hadoop YARN
  Issue Type: Bug
  Components: federation, yarn
Reporter: Sushanta Sen


Executing the yarn container -signal command fails with the error 
“org.apache.commons.lang.NotImplementedException: Code is not implemented”.


{noformat}
./yarn container -signal container_e79_1581316978887_0001_01_10
Signalling container container_e79_1581316978887_0001_01_10
2020-02-10 14:51:18,045 INFO impl.YarnClientImpl: Signalling container 
container_e79_1581316978887_0001_01_10 with command OUTPUT_THREAD_DUMP
Exception in thread "main" org.apache.commons.lang.NotImplementedException: 
Code is not implemented
at 
org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.signalToContainer(FederationClientInterceptor.java:993)
at 
org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.signalToContainer(RouterClientRMService.java:403)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.signalToContainer(ApplicationClientProtocolPBServiceImpl.java:629)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:629)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2793)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.signalToContainer(ApplicationClientProtocolPBClientImpl.java:620)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy8.signalToContainer(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.signalToContainer(YarnClientImpl.java:949)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.signalToContainer(ApplicationCLI.java:717)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:478)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:119)

{noformat}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10121) In Federation executing yarn queue status command throws an exception

2020-02-09 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10121:
---

 Summary: In Federation executing yarn queue status command throws 
an exception
 Key: YARN-10121
 URL: https://issues.apache.org/jira/browse/YARN-10121
 Project: Hadoop YARN
  Issue Type: Bug
  Components: federation, yarn
Reporter: Sushanta Sen


The yarn queue -status command fails with the error 
“org.apache.commons.lang.NotImplementedException: Code is not implemented”.


{noformat}
 ./yarn queue -status default
Exception in thread "main" org.apache.commons.lang.NotImplementedException: 
Code is not implemented
at 
org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.getQueueInfo(FederationClientInterceptor.java:715)
at 
org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.getQueueInfo(RouterClientRMService.java:246)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:328)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:591)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2793)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getQueueInfo(ApplicationClientProtocolPBClientImpl.java:341)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy8.getQueueInfo(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getQueueInfo(YarnClientImpl.java:650)
at 
org.apache.hadoop.yarn.client.cli.QueueCLI.listQueue(QueueCLI.java:111)
at org.apache.hadoop.yarn.client.cli.QueueCLI.run(QueueCLI.java:78)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.yarn.client.cli.QueueCLI.main(QueueCLI.java:50)

{noformat}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled

2020-02-07 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10120:

Description: 
In Federation, the Router Nodes/Applications/About pages throw a 500 exception 
when https is enabled.

yarn.router.webapp.https.address =router ip:8091



{noformat}
2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
handling URI: /cluster/apps
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at 
com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
at 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
at 
com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
at 
com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
at 
com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:539)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at 
org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:259)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at 

[jira] [Updated] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled

2020-02-07 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10120:

Description: 
In Federation, the Router Nodes/Applications/About pages throw a 500 exception 
when https is enabled.

yarn.router.webapp.https.address =0.0.0.0:8091



{noformat}
2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
handling URI: /cluster/apps
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at 
com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
at 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
at 
com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
at 
com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
at 
com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:539)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at 
org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:259)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at 

[jira] [Updated] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled

2020-02-07 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-10120:

Description: 
In Federation, the Router Nodes/Applications/About pages throw a 500 exception 
when https is enabled.
yarn.router.webapp.https.address =0.0.0.0:8091



{noformat}
2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
handling URI: /cluster/apps
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at 
com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
at 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
at 
com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
at 
com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
at 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
at 
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
at 
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
at 
com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:539)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at 
org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:259)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at 

[jira] [Created] (YARN-10120) In Federation Router UI does not launch when https is enabled

2020-02-07 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10120:
---

 Summary: In Federation Router UI does not launch when https is 
enabled
 Key: YARN-10120
 URL: https://issues.apache.org/jira/browse/YARN-10120
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
 Environment: In Federation, the Router UI does not launch on the https secure 
port when the below parameter is set in the router yarn-site.xml

yarn.router.webapp.https.address =0.0.0.0:8091
Reporter: Sushanta Sen
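
As a minimal reference for reproducing this, the HTTPS-related settings involved 
would look roughly like the sketch below. The property names are the standard 
Hadoop/YARN ones; the keystore paths, values, and the use of a Configuration object 
here are illustrative assumptions, not taken from this report.

{noformat}
// Illustrative sketch of the Router HTTPS settings involved (values assumed).
import org.apache.hadoop.conf.Configuration;

public class RouterHttpsConfigSketch {
  public static Configuration build() {
    Configuration conf = new Configuration();
    // Serve YARN web UIs over HTTPS only.
    conf.set("yarn.http.policy", "HTTPS_ONLY");
    // Bind the Router web app to the HTTPS port used in this report.
    conf.set("yarn.router.webapp.https.address", "0.0.0.0:8091");
    // Server-side keystore settings normally live in ssl-server.xml; shown here
    // only to indicate what must be in place for the UI to come up (assumed paths).
    conf.set("ssl.server.keystore.location", "/opt/hadoop/conf/keystore.jks");
    conf.set("ssl.server.keystore.password", "changeit");
    return conf;
  }
}
{noformat}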






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10111) In Federation cluster Distributed Shell Application submission fails as YarnClient#getQueueInfo is not implemented

2020-01-29 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10111:
---

 Summary: In Federation cluster Distributed Shell Application 
submission fails as YarnClient#getQueueInfo is not implemented
 Key: YARN-10111
 URL: https://issues.apache.org/jira/browse/YARN-10111
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sushanta Sen


In a Federation cluster, Distributed Shell application submission fails because 
YarnClient#getQueueInfo is not implemented.
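
A hedged sketch of the failing call is shown below: the client queries queue 
information through YarnClient before submission, and through the Federation Router 
that call surfaces the not-implemented error. The defensive wrapper is an 
illustrative workaround idea, not the actual distributed shell Client code.

{noformat}
// Illustrative client-side sketch; not the actual distributed shell Client code.
import org.apache.hadoop.yarn.api.records.QueueInfo;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class QueueInfoProbe {
  public static void main(String[] args) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration());
    yarnClient.start();
    try {
      // Through the Federation Router this call fails because getQueueInfo is
      // not implemented in FederationClientInterceptor.
      QueueInfo queueInfo = yarnClient.getQueueInfo("default");
      System.out.println("Queue capacity: " + queueInfo.getCapacity());
    } catch (Exception e) {
      // Illustrative fallback: log and continue with submission defaults.
      System.err.println("getQueueInfo unavailable via Router: " + e.getMessage());
    } finally {
      yarnClient.stop();
    }
  }
}
{noformat}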



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10110) In Yarn Secure Federated cluster ,if hadoop.security.authorization= true in Router & client core-site.xml, on executing a job it throws the below error.

2020-01-28 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10110:
---

 Summary: In Yarn Secure Federated cluster ,if 
hadoop.security.authorization= true in Router & client core-site.xml, on 
executing a job it throws the below error.
 Key: YARN-10110
 URL: https://issues.apache.org/jira/browse/YARN-10110
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sushanta Sen


【Precondition】:
1. Secure Federated cluster is available
2. Add the below configuration in Router and client core-site.xml:
hadoop.security.authorization = true
3. Restart the Router service

【Test step】:
1. Go to the router client bin path and submit an MR PI job
2. Observe the client console screen

【Expect Output】:
No error should be thrown and the job should be successful

【Actual Output】:
The job fails with "Protocol interface 
org.apache.hadoop.yarn.api.ApplicationClientProtocolPB is not known."

【Additional Note】:
When the parameter is set to false, the job is submitted and succeeds.
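
The error "Protocol interface ... ApplicationClientProtocolPB is not known" generally 
means the server enforcing service-level authorization has no ACL entry registered for 
that protocol. A plausible (assumed, not confirmed) sketch of the policy mapping the 
Router would need is shown below; RouterPolicyProvider is a hypothetical name used only 
for illustration.

{noformat}
// Hypothetical sketch; class name and wiring are assumptions for illustration.
import org.apache.hadoop.security.authorize.PolicyProvider;
import org.apache.hadoop.security.authorize.Service;
import org.apache.hadoop.yarn.api.ApplicationClientProtocolPB;

public class RouterPolicyProvider extends PolicyProvider {
  @Override
  public Service[] getServices() {
    return new Service[] {
        // Maps the client protocol to the ACL key that hadoop-policy.xml
        // configures (security.applicationclient.protocol.acl).
        new Service("security.applicationclient.protocol.acl",
            ApplicationClientProtocolPB.class)
    };
  }
}
{noformat}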



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition

2019-10-25 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-9935:
---
Description: 
【Precondition】:
1. Install the cluster
2. *{color:#4C9AFF}WebAppProxyServer service installed in 1 VM and RMs 
installed in 2 VMs{color}*
3. Enables all the HTTPS configuration required 
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
6. Cluster should be up and running

【Test step】:
1.Submit an application
2. Open Application Master link from the applicationID from RM UI

【Expect Output】:
No error should be thrown and the job should be successful

【Actual Output】:
An SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"


  was:
【Precondition】:
1. Install the cluster
2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
3. Enables all the HTTPS configuration required 
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
6. Cluster should be up and running

【Test step】:
1.Submit an application
2. Open Application Master link from the applicationID from RM UI

【Expect Output】:
No error should be thrown and the job should be successful

【Actual Output】:
An SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"



> SSLHandshakeException thrown when HTTPS is enabled in AM web server in one 
> certain condition
> 
>
> Key: YARN-9935
> URL: https://issues.apache.org/jira/browse/YARN-9935
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: amrmproxy
>Reporter: Sushanta Sen
>Priority: Major
>
> 【Precondition】:
> 1. Install the cluster
> 2. *{color:#4C9AFF}WebAppProxyServer service installed in 1 VM and RMs 
> installed in 2 VMs{color}*
> 3. Enables all the HTTPS configuration required 
> yarn.resourcemanager.application-https.policy
> STRICT
> yarn.app.mapreduce.am.webapp.https.enabled
> true
> yarn.app.mapreduce.am.webapp.https.client.auth
> true
> 4. RM HA enabled
> 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
> 6. Cluster should be up and running
> 【Test step】:
> 1.Submit an application
> 2. Open Application Master link from the applicationID from RM UI
> 【Expect Output】:
> No error should be thrown and the job should be successful
> 【Actual Output】:
> An SSLHandshakeException is thrown, although the job is successful.
> "javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition

2019-10-25 Thread Sushanta Sen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanta Sen updated YARN-9935:
---
Description: 
【Precondition】:
1. Install the cluster
2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
3. Enables all the HTTPS configuration required 
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
6. Cluster should be up and running

【Test step】:
1.Submit an application
2. Open Application Master link from the applicationID from RM UI

【Expect Output】:
No error should be thrown and the job should be successful

【Actual Output】:
An SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"


  was:
【Precondition】:
1. Install the cluster
2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
3. Enables all the HTTPS configuration required 
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. Active RM is running in VM2, standby in VM1
6. Cluster should be up and running

【Test step】:
1.Submit an application
2. Open Application Master link from the applicationID from RM UI

【Expect Output】:
No error should be thrown and the job should be successful

【Actual Output】:
An SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"



> SSLHandshakeException thrown when HTTPS is enabled in AM web server in one 
> certain condition
> 
>
> Key: YARN-9935
> URL: https://issues.apache.org/jira/browse/YARN-9935
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: amrmproxy
>Reporter: Sushanta Sen
>Priority: Major
>
> 【Precondition】:
> 1. Install the cluster
> 2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
> 3. Enables all the HTTPS configuration required 
> yarn.resourcemanager.application-https.policy
> STRICT
> yarn.app.mapreduce.am.webapp.https.enabled
> true
> yarn.app.mapreduce.am.webapp.https.client.auth
> true
> 4. RM HA enabled
> 5. *{color:#4C9AFF}Active RM is running in VM2, standby in VM1{color}*
> 6. Cluster should be up and running
> 【Test step】:
> 1.Submit an application
> 2. Open Application Master link from the applicationID from RM UI
> 【Expect Output】:
> No error should be thrown and the job should be successful
> 【Actual Output】:
> An SSLHandshakeException is thrown, although the job is successful.
> "javax.net.ssl.SSLHandshakeException: 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9935) SSLHandshakeException thrown when HTTPS is enabled in AM web server in one certain condition

2019-10-25 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-9935:
--

 Summary: SSLHandshakeException thrown when HTTPS is enabled in AM 
web server in one certain condition
 Key: YARN-9935
 URL: https://issues.apache.org/jira/browse/YARN-9935
 Project: Hadoop YARN
  Issue Type: Bug
  Components: amrmproxy
Reporter: Sushanta Sen


【Precondition】:
1. Install the cluster
2. WebAppProxyServer service installed in 1 VM and RMs installed in 2 VMs
3. Enables all the HTTPS configuration required 
yarn.resourcemanager.application-https.policy
STRICT

yarn.app.mapreduce.am.webapp.https.enabled
true

yarn.app.mapreduce.am.webapp.https.client.auth
true

4. RM HA enabled
5. Active RM is running in VM2, standby in VM1
6. Cluster should be up and running

【Test step】:
1.Submit an application
2. Open Application Master link from the applicationID from RM UI

【Expect Output】:
No error should be thrown and the job should be successful

【Actual Output】:
An SSLHandshakeException is thrown, although the job is successful.
"javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
valid certification path to requested target"




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9849) Leaf queues not inheriting parent queue status after adding status as “RUNNING” and thereafter, commenting the same in capacity-scheduler.xml

2019-09-20 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-9849:
--

 Summary: Leaf queues not inheriting parent queue status after 
adding status as “RUNNING” and thereafter, commenting the same in 
capacity-scheduler.xml 
 Key: YARN-9849
 URL: https://issues.apache.org/jira/browse/YARN-9849
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler
Reporter: Sushanta Sen


【Precondition】:
1. Install the cluster
2. Configure multiple queues, say 2 parent queues [default, q1] and leaf queues [q2, q3]
3. Cluster should be up and running

【Test step】:
1. By default, leaf queues inherit the parent queue's status.
2. Explicitly change the leaf queues' status to "RUNNING".
3. Run the refresh command; the leaf queues' status is shown as "RUNNING" in the CLI/UI.
4. Thereafter, change the leaf queues' status to "STOPPED" and run the refresh command.
5. Run the refresh command; the leaf queues' status is shown as "STOPPED" in the CLI/UI.
6. Now comment out the leaf queues' status and run refresh queues.
7. Observe.

【Expect Output】:
The leaf queues' status should be displayed as "RUNNING", inherited from the 
parent queue.

【Actual Output】:
The leaf queues' status is still displayed as "STOPPED" rather than inheriting 
the parent queue's "RUNNING" state.
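
For reference, the queue state settings involved look roughly like the sketch below. 
The property pattern yarn.scheduler.capacity.<queue-path>.state is the standard 
capacity-scheduler one; the queue path and values here are assumptions for 
illustration, not the reporter's actual capacity-scheduler.xml.

{noformat}
// Illustrative sketch of the capacity-scheduler state properties involved
// (queue path and values assumed, not taken from the reporter's cluster).
import org.apache.hadoop.conf.Configuration;

public class QueueStateSketch {
  public static Configuration build() {
    Configuration conf = new Configuration(false);
    // Parent queue state.
    conf.set("yarn.scheduler.capacity.root.q1.state", "RUNNING");
    // Steps 2-4: leaf state set explicitly, first RUNNING, later STOPPED.
    conf.set("yarn.scheduler.capacity.root.q1.q2.state", "STOPPED");
    // Step 6: when this leaf entry is commented out (removed), the leaf is
    // expected to fall back to the parent's RUNNING state after
    // "yarn rmadmin -refreshQueues", but this report says it stays STOPPED.
    conf.unset("yarn.scheduler.capacity.root.q1.q2.state");
    return conf;
  }
}
{noformat}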




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org