[jira] [Commented] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails

2024-02-28 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821959#comment-17821959
 ] 

Bilwa S T commented on YARN-11654:
--

cc [~slfan1989] [~steve_l]

> [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
> 
>
> Key: YARN-11654
> URL: https://issues.apache.org/jira/browse/YARN-11654
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>  Labels: pull-request-available
>
> [ERROR]   TestLinuxContainerExecutorWithMocks.testStartLocalizer:310
> Expected size:<26> but was:<28> in:
> <["nobody",
> "test",
> "0",
> "application_0",
> "12345",
> "/bin/nmPrivateCTokensPath",
> 
> "/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir",
> "src/test/resources",
> 
> "/opt/homebrew/Cellar/openjdk@17/17.0.8/libexec/openjdk.jdk/Contents/Home/bin/java",
> "-classpath",
> 
> 

[jira] [Comment Edited] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails

2024-02-05 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814336#comment-17814336
 ] 

Bilwa S T edited comment on YARN-11654 at 2/5/24 12:23 PM:
---

cc [~snemeth] [~ayushsaxena] [~brahmareddy]


was (Author: bilwast):
cc [~snemeth][~ayushsaxena][~brahmareddy]

> [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
> 
>
> Key: YARN-11654
> URL: https://issues.apache.org/jira/browse/YARN-11654
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>  Labels: pull-request-available
>
> [ERROR]   TestLinuxContainerExecutorWithMocks.testStartLocalizer:310
> Expected size:<26> but was:<28> in:
> <["nobody",
> "test",
> "0",
> "application_0",
> "12345",
> "/bin/nmPrivateCTokensPath",
> 
> "/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir",
> "src/test/resources",
> 
> "/opt/homebrew/Cellar/openjdk@17/17.0.8/libexec/openjdk.jdk/Contents/Home/bin/java",
> "-classpath",
> 
> 

[jira] [Commented] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails

2024-02-05 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814336#comment-17814336
 ] 

Bilwa S T commented on YARN-11654:
--

cc [~snemeth][~ayushsaxena][~brahmareddy]

> [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
> 
>
> Key: YARN-11654
> URL: https://issues.apache.org/jira/browse/YARN-11654
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>  Labels: pull-request-available
>
> [ERROR]   TestLinuxContainerExecutorWithMocks.testStartLocalizer:310
> Expected size:<26> but was:<28> in:
> <["nobody",
> "test",
> "0",
> "application_0",
> "12345",
> "/bin/nmPrivateCTokensPath",
> 
> "/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir",
> "src/test/resources",
> 
> "/opt/homebrew/Cellar/openjdk@17/17.0.8/libexec/openjdk.jdk/Contents/Home/bin/java",
> "-classpath",
> 
> 

[jira] [Updated] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails

2024-02-04 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-11654:
-
Description: 
[ERROR]   TestLinuxContainerExecutorWithMocks.testStartLocalizer:310
Expected size:<26> but was:<28> in:
<["nobody",
"test",
"0",
"application_0",
"12345",
"/bin/nmPrivateCTokensPath",

"/Users/bilwa/code/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir",
"src/test/resources",

"/opt/homebrew/Cellar/openjdk@17/17.0.8/libexec/openjdk.jdk/Contents/Home/bin/java",
"-classpath",


[jira] [Created] (YARN-11654) [JDK17] TestLinuxContainerExecutorWithMocks.testStartLocalizer fails

2024-02-04 Thread Bilwa S T (Jira)
Bilwa S T created YARN-11654:


 Summary: [JDK17] 
TestLinuxContainerExecutorWithMocks.testStartLocalizer fails
 Key: YARN-11654
 URL: https://issues.apache.org/jira/browse/YARN-11654
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.4.0
Reporter: Bilwa S T
Assignee: Bilwa S T


Expected size:<26> but was:<28> in:
<["nobody",
"test",
"0",
"application_0",
"12345",
"/bin/nmPrivateCTokensPath",

"/workspace/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/tmp/nm-local-dir",
"src/test/resources",
"/usr/lib/jvm/jdk-17.0.9/bin/java",
"-classpath",


[jira] [Commented] (YARN-11181) Applications in Pending state as AM resources are not updated when resources from other queue gets released

2022-06-14 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554035#comment-17554035
 ] 

Bilwa S T commented on YARN-11181:
--

cc [~bibinchundatt] [~brahma] [~prabhujoseph]

> Applications in Pending state as AM resources are not updated when resources 
> from other queue gets released
> ---
>
> Key: YARN-11181
> URL: https://issues.apache.org/jira/browse/YARN-11181
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Priority: Major
>
> Configure two queues, q1 and q2.
> Let's say the AM resource percent for both q1 and q2 is <5gb, 5vcores>.
> 1. Run a long-running application in q1 that occupies 70% of cluster resources.
> 2. Run a small application in q2.
> 3. Run one long-running job and a few more small jobs in q2.
> 4. Once the small application submitted to q2 finishes, the AM resources 
> decrease to <2gb, 2vcores>.
> 5. Kill the long-running application submitted to q1.
> Now the long-running job submitted to q2 is running, but all other jobs 
> remain in the pending state.
> This is because LeafQueue#ActivateApplications gets called only when an AM 
> starts running or finishes.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-11181) Applications in Pending state as AM resources are not updated when resources from other queue gets released

2022-06-14 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-11181:
-
Description: 
Configure two queues, q1 and q2.

Let's say the AM resource percent for both q1 and q2 is <5gb, 5vcores>.

1. Run a long-running application in q1 that occupies 70% of cluster resources.
2. Run a small application in q2.
3. Run one long-running job and a few more small jobs in q2.
4. Once the small application submitted to q2 finishes, the AM resources 
decrease to <2gb, 2vcores>.
5. Kill the long-running application submitted to q1.

Now the long-running job submitted to q2 is running, but all other jobs remain 
in the pending state.

This is because LeafQueue#ActivateApplications gets called only when an AM 
starts running or finishes.
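
The scenario above can be sketched with a toy model. This is only an illustration under stated assumptions, not the CapacityScheduler code: ToyLeafQueue, its fields, and the reactivate flag are hypothetical names introduced here, and the real AM limit computation is more involved. The sketch shows how pending applications stay pending when the AM limit grows but nothing triggers activation again.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of a leaf queue (hypothetical, not the real LeafQueue class). */
class ToyLeafQueue {
    private int amLimitMb;                 // current AM resource limit of the queue
    private int amUsedMb = 0;              // memory held by running AMs
    private final Deque<Integer> pendingAmMb = new ArrayDeque<>();
    private final List<Integer> runningAmMb = new ArrayList<>();

    ToyLeafQueue(int amLimitMb) {
        this.amLimitMb = amLimitMb;
    }

    /** Queues an application whose AM needs amMb of memory. */
    void submit(int amMb) {
        pendingAmMb.addLast(amMb);
    }

    /** Activates pending applications in FIFO order while they fit under the AM limit. */
    void activateApplications() {
        while (!pendingAmMb.isEmpty()
                && amUsedMb + pendingAmMb.peekFirst() <= amLimitMb) {
            int amMb = pendingAmMb.removeFirst();
            amUsedMb += amMb;
            runningAmMb.add(amMb);
        }
    }

    /** The limit changes, for example because another queue released resources. */
    void updateAmLimit(int newLimitMb, boolean reactivate) {
        amLimitMb = newLimitMb;
        if (reactivate) {
            activateApplications();        // assumed remedy: re-run activation on limit change
        }
    }

    int pendingCount() { return pendingAmMb.size(); }

    int runningCount() { return runningAmMb.size(); }

    public static void main(String[] args) {
        ToyLeafQueue q2 = new ToyLeafQueue(2048);
        q2.submit(1024);
        q2.submit(1024);
        q2.submit(1024);
        q2.activateApplications();
        q2.updateAmLimit(5120, false);     // limit grows, but nobody activates
        System.out.println("pending without reactivation: " + q2.pendingCount());
        q2.updateAmLimit(5120, true);
        System.out.println("pending with reactivation: " + q2.pendingCount());
    }
}
```

In this toy model the third application only leaves the pending state once activation is re-run after the limit update.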

> Applications in Pending state as AM resources are not updated when resources 
> from other queue gets released
> ---
>
> Key: YARN-11181
> URL: https://issues.apache.org/jira/browse/YARN-11181
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Priority: Major
>
> Configure two queues, q1 and q2.
> Let's say the AM resource percent for both q1 and q2 is <5gb, 5vcores>.
> 1. Run a long-running application in q1 that occupies 70% of cluster resources.
> 2. Run a small application in q2.
> 3. Run one long-running job and a few more small jobs in q2.
> 4. Once the small application submitted to q2 finishes, the AM resources 
> decrease to <2gb, 2vcores>.
> 5. Kill the long-running application submitted to q1.
> Now the long-running job submitted to q2 is running, but all other jobs 
> remain in the pending state.
> This is because LeafQueue#ActivateApplications gets called only when an AM 
> starts running or finishes.






[jira] [Created] (YARN-11181) Applications in Pending state as AM resources are not updated when resources from other queue gets released

2022-06-14 Thread Bilwa S T (Jira)
Bilwa S T created YARN-11181:


 Summary: Applications in Pending state as AM resources are not 
updated when resources from other queue gets released
 Key: YARN-11181
 URL: https://issues.apache.org/jira/browse/YARN-11181
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T









[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-09-23 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419162#comment-17419162
 ] 

Bilwa S T commented on YARN-9606:
-

Thanks [~pbacsko]

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0, 3.3.2
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606-branch-3.3-v2.patch, YARN-9606-branch-3.3.v1.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers    
>   
> 
>   {quote}                                                                     
>                           
>   
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-09-20 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417613#comment-17417613
 ] 

Bilwa S T commented on YARN-9606:
-

Hi [~pbacsko]
can we merge this?

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606-branch-3.3-v2.patch, YARN-9606-branch-3.3.v1.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers    
>   
> 
>   {quote}                                                                     
>                           
>   
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Commented] (YARN-10812) yarn service number of containers count is wrong when flexing

2021-07-28 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389219#comment-17389219
 ] 

Bilwa S T commented on YARN-10812:
--

cc [~eyang]

> yarn service number of containers count is wrong when flexing
> -
>
> Key: YARN-10812
> URL: https://issues.apache.org/jira/browse/YARN-10812
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> Say a service currently has 2 containers running.
> The user asks for 2 more by flexing the service, but resources are available 
> for only 1 more container. The number of containers is still updated to 4, 
> which is wrong.






[jira] [Commented] (YARN-10824) Title not set for JHS and NM webpages

2021-06-24 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17368815#comment-17368815
 ] 

Bilwa S T commented on YARN-10824:
--

Thanks [~Jim_Brennan] [~epayne] for your review comments. I have updated the 
patch. Please check.

> Title not set for JHS and NM webpages
> -
>
> Key: YARN-10824
> URL: https://issues.apache.org/jira/browse/YARN-10824
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rajshree Mishra
>Assignee: Bilwa S T
>Priority: Major
> Attachments: JHS URL.jpg, NM URL.jpg, YARN-10824.001.patch, 
> YARN-10824.002.patch
>
>
> The following issue was reported by one of our internal web security check 
> tools: 
> Passing a title to the jobHistoryServer(jhs) or Nodemanager(nm) pages using a 
> url similar to:
> [https://[hostname]:[jhs_port]/jobhistory/about?title=12345%27%22]
> or 
> [https://[hostname]:[nm_port]/node?title=12345]
> sets the page title to the value mentioned.
> [Image attached]






[jira] [Updated] (YARN-10824) Title not set for JHS and NM webpages

2021-06-24 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10824:
-
Attachment: YARN-10824.002.patch

> Title not set for JHS and NM webpages
> -
>
> Key: YARN-10824
> URL: https://issues.apache.org/jira/browse/YARN-10824
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rajshree Mishra
>Assignee: Bilwa S T
>Priority: Major
> Attachments: JHS URL.jpg, NM URL.jpg, YARN-10824.001.patch, 
> YARN-10824.002.patch
>
>
> The following issue was reported by one of our internal web security check 
> tools: 
> Passing a title to the jobHistoryServer(jhs) or Nodemanager(nm) pages using a 
> url similar to:
> [https://[hostname]:[jhs_port]/jobhistory/about?title=12345%27%22]
> or 
> [https://[hostname]:[nm_port]/node?title=12345]
> sets the page title to the value mentioned.
> [Image attached]






[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-06-17 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9606:

Attachment: YARN-9606-branch-3.3-v2.patch

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606-branch-3.3-v2.patch, YARN-9606-branch-3.3.v1.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers    
>   
> 
>   {quote}                                                                     
>                           
>   
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Commented] (YARN-10824) Title not set for JHS and NM webpages

2021-06-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364992#comment-17364992
 ] 

Bilwa S T commented on YARN-10824:
--

cc [~jbrennan] [~epayne]

> Title not set for JHS and NM webpages
> -
>
> Key: YARN-10824
> URL: https://issues.apache.org/jira/browse/YARN-10824
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rajshree Mishra
>Assignee: Bilwa S T
>Priority: Major
> Attachments: JHS URL.jpg, NM URL.jpg, YARN-10824.001.patch
>
>
> Passing a title to the jobHistoryServer(jhs) or Nodemanager(nm) pages using a 
> url similar to:
> https://[hostname]:[jhs_port]/jobhistory/about?title=12345%27%22
> or 
> https://[hostname]:[nm_port]/node?title=12345
> sets the page title to the value mentioned.
> [Image attached]






[jira] [Updated] (YARN-10824) Title not set for JHS and NM webpages

2021-06-17 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10824:
-
Attachment: YARN-10824.001.patch

> Title not set for JHS and NM webpages
> -
>
> Key: YARN-10824
> URL: https://issues.apache.org/jira/browse/YARN-10824
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rajshree Mishra
>Assignee: Bilwa S T
>Priority: Major
> Attachments: JHS URL.jpg, NM URL.jpg, YARN-10824.001.patch
>
>
> Passing a title to the jobHistoryServer(jhs) or Nodemanager(nm) pages using a 
> url similar to:
> https://[hostname]:[jhs_port]/jobhistory/about?title=12345%27%22
> or 
> https://[hostname]:[nm_port]/node?title=12345
> sets the page title to the value mentioned.
> [Image attached]






[jira] [Commented] (YARN-10824) Title not set for JHS and NM webpages

2021-06-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364990#comment-17364990
 ] 

Bilwa S T commented on YARN-10824:
--

Command injection can happen here. To avoid that, we can set the title for the 
JHS and NM pages explicitly.
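
One way to read the suggestion above, as a hedged sketch rather than the actual patch: the page ignores any title supplied in the query string and always uses a title chosen by the server. PageTitles, its constants, and both method names are hypothetical and exist only for illustration; the real JHS and NM pages are built on Hadoop's web framework, which is not modeled here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper contrasting a request-driven title with a fixed one. */
class PageTitles {
    static final String JHS_TITLE = "JobHistory";   // assumed title text
    static final String NM_TITLE = "NodeManager";   // assumed title text

    /** Risky pattern: whatever arrives in ?title=... ends up as the page title. */
    static String titleFromRequest(Map<String, String> queryParams, String fallback) {
        return queryParams.getOrDefault("title", fallback);
    }

    /** Safer pattern: the query parameter is never consulted. */
    static String fixedTitle(Map<String, String> queryParams, String pageTitle) {
        return pageTitle;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("title", "12345'\"");             // decoded form of 12345%27%22
        System.out.println(titleFromRequest(params, JHS_TITLE));
        System.out.println(fixedTitle(params, JHS_TITLE));
    }
}
```

If a caller-supplied title ever had to be kept, escaping it before rendering would be the alternative; the fixed title avoids that question entirely.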

> Title not set for JHS and NM webpages
> -
>
> Key: YARN-10824
> URL: https://issues.apache.org/jira/browse/YARN-10824
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rajshree Mishra
>Assignee: Bilwa S T
>Priority: Major
> Attachments: JHS URL.jpg, NM URL.jpg
>
>
> Passing a title to the jobHistoryServer(jhs) or Nodemanager(nm) pages using a 
> url similar to:
> https://[hostname]:[jhs_port]/jobhistory/about?title=12345%27%22
> or 
> https://[hostname]:[nm_port]/node?title=12345
> sets the page title to the value mentioned.
> [Image attached]






[jira] [Assigned] (YARN-10824) Title not set for JHS and NM webpages

2021-06-16 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10824:


Assignee: Bilwa S T

> Title not set for JHS and NM webpages
> -
>
> Key: YARN-10824
> URL: https://issues.apache.org/jira/browse/YARN-10824
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rajshree Mishra
>Assignee: Bilwa S T
>Priority: Major
> Attachments: JHS URL.jpg, NM URL.jpg
>
>
> Passing a title to the jobHistoryServer(jhs) or Nodemanager(nm) pages using a 
> url similar to:
> https://[hostname]:[jhs_port]/jobhistory/about?title=12345%27%22
> or 
> https://[hostname]:[nm_port]/node?title=12345
> sets the page title to the value mentioned.
> [Image attached]






[jira] [Updated] (YARN-10800) Yarn service container should be removed from list when its completed/stopped

2021-06-16 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10800:
-
Attachment: YARN-10800.001.patch

> Yarn service container should be removed from list when its completed/stopped
> -
>
> Key: YARN-10800
> URL: https://issues.apache.org/jira/browse/YARN-10800
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-10800.001.patch
>
>
> When we query the container list using ServiceClient.getStatus, the returned 
> list includes finished containers. Currently, finished containers are removed 
> only on flex down, not when a container is shut down or completes.






[jira] [Created] (YARN-10812) yarn service number of containers count is wrong when flexing

2021-06-09 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10812:


 Summary: yarn service number of containers count is wrong when 
flexing
 Key: YARN-10812
 URL: https://issues.apache.org/jira/browse/YARN-10812
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T


Say a service currently has 2 containers running.

The user asks for 2 more by flexing the service, but resources are available 
for only 1 more container. The number of containers is still updated to 4, 
which is wrong.
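
The expected behavior can be sketched with a small model that keeps the requested count and the running count apart. This is an assumption-laden illustration, not the YARN service code: ToyComponent, flexTo, and freeSlots are hypothetical names, and real allocation is asynchronous.

```java
/** Toy model of a service component (hypothetical, not the YARN service API). */
class ToyComponent {
    private int desired;   // what the user asked for
    private int running;   // what is actually running

    ToyComponent(int running) {
        this.desired = running;
        this.running = running;
    }

    /**
     * Records the flex target and starts only as many containers as the
     * cluster can host. Returns the number of containers started.
     */
    int flexTo(int target, int freeSlots) {
        desired = target;
        if (target < running) {            // flex down: stop the surplus
            running = target;
            return 0;
        }
        int started = Math.min(target - running, freeSlots);
        running += started;
        return started;
    }

    int getDesired() { return desired; }

    int getRunning() { return running; }

    public static void main(String[] args) {
        ToyComponent c = new ToyComponent(2);
        c.flexTo(4, 1);                    // ask for 2 more, room for only 1
        System.out.println("desired=" + c.getDesired() + " running=" + c.getRunning());
    }
}
```

In this sketch a status query would report the running count, so the example above reports 3 running containers against a desired count of 4.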






[jira] [Commented] (YARN-10767) Yarn Logs Command retrying on Standby RM for 30 times

2021-06-02 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17355634#comment-17355634
 ] 

Bilwa S T commented on YARN-10767:
--

Hi [~dmmkr]

Thanks for the patch. I have one minor comment:
 * RMHAUtils.findActiveRMHAId can return null if none of the RMs is active. I 
think we should have a null check here.
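
A minimal sketch of the kind of null check meant here. To stay self-contained it does not call Hadoop: the Supplier stands in for the RMHAUtils.findActiveRMHAId lookup, and ActiveRmLookup, activeRmIdOrDefault, and the fallback id are hypothetical names used only for illustration.

```java
import java.util.function.Supplier;

/** Hypothetical wrapper around an active-RM lookup that may return null. */
class ActiveRmLookup {

    /**
     * Returns the active RM id reported by the lookup, or fallbackId when the
     * lookup returns null (no RM is currently active).
     */
    static String activeRmIdOrDefault(Supplier<String> findActiveRmId, String fallbackId) {
        String active = findActiveRmId.get();
        return active != null ? active : fallbackId;
    }

    public static void main(String[] args) {
        // Stand-in lookups: one finds an active RM, one finds none.
        System.out.println(activeRmIdOrDefault(() -> "rm2", "rm1"));
        System.out.println(activeRmIdOrDefault(() -> null, "rm1"));
    }
}
```

Whether to fall back to a configured RM id or to fail with a clear message is a design choice for the patch; the point of the sketch is only that the null case is handled before the id is used.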

[~jbrennan] can you please take a look at this issue? 

> Yarn Logs Command retrying on Standby RM for 30 times
> -
>
> Key: YARN-10767
> URL: https://issues.apache.org/jira/browse/YARN-10767
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10767.001.patch
>
>
> When ResourceManager HA is enabled and the first RM is unavailable, 
> executing "bin/yarn logs -applicationId  -am 1" throws a ConnectionException 
> while connecting to the first RM. The ConnectionException occurs 30 times 
> before the command tries to connect to the second RM.
>  
> This can be optimized by trying to fetch the logs from the Active RM.






[jira] [Created] (YARN-10800) Yarn service container should be removed from list when its completed/stopped

2021-06-01 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10800:


 Summary: Yarn service container should be removed from list when 
its completed/stopped
 Key: YARN-10800
 URL: https://issues.apache.org/jira/browse/YARN-10800
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T


When we query the container list using ServiceClient.getStatus, the returned 
list includes finished containers. Currently, finished containers are removed 
only on flex down, not when a container is shut down or completes.
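
The intended result can be sketched as a filter over container states. This is an illustration only: ToyContainerList, its State enum, and liveContainerIds are hypothetical, and the real fix would remove the container when it completes rather than filter at query time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Toy model of a service's container list (hypothetical, not the YARN API). */
class ToyContainerList {
    enum State { RUNNING, COMPLETED, STOPPED }

    static final class Container {
        final String id;
        final State state;

        Container(String id, State state) {
            this.id = id;
            this.state = state;
        }
    }

    /** Returns the ids of containers that are still running. */
    static List<String> liveContainerIds(List<Container> all) {
        return all.stream()
                .filter(c -> c.state == State.RUNNING)
                .map(c -> c.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Container> all = Arrays.asList(
                new Container("c1", State.RUNNING),
                new Container("c2", State.COMPLETED),
                new Container("c3", State.STOPPED));
        System.out.println(liveContainerIds(all));
    }
}
```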






[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-05-29 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17353772#comment-17353772
 ] 

Bilwa S T commented on YARN-9606:
-

Hi [~pbacsko]

I have attached a patch for branch-3.3. It was failing because the new classes 
had some code differences.

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606-branch-3.3.v1.patch, YARN-9606.003.patch, YARN-9606.004.patch, 
> YARN-9606.005.patch, YARN-9606.006.patch, YARN-9606.007.patch, 
> YARN-9606.008.patch
>
>
> Yarn logs fails for running containers    
>   
> 
>   {quote}                                                                     
>                           
>   
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-05-29 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9606:

Attachment: YARN-9606-branch-3.3.v1.patch

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606-branch-3.3.v1.patch, YARN-9606.003.patch, YARN-9606.004.patch, 
> YARN-9606.005.patch, YARN-9606.006.patch, YARN-9606.007.patch, 
> YARN-9606.008.patch
>
>
> Yarn logs fails for running containers    
>   
> 
>   {quote}                                                                     
>                           
>   
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-05-29 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9606:

Attachment: YARN-9606-branch-3.3.v1.patch.patch

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606-branch-3.3.v1.patch, YARN-9606.003.patch, YARN-9606.004.patch, 
> YARN-9606.005.patch, YARN-9606.006.patch, YARN-9606.007.patch, 
> YARN-9606.008.patch
>
>
> Yarn logs fails for running containers    
>   
> 
>   {quote}                                                                     
>                           
>   
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-05-29 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9606:

Attachment: (was: YARN-9606-branch-3.3.v1.patch.patch)

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606-branch-3.3.v1.patch, YARN-9606.003.patch, YARN-9606.004.patch, 
> YARN-9606.005.patch, YARN-9606.006.patch, YARN-9606.007.patch, 
> YARN-9606.008.patch
>
>
> Yarn logs fails for running containers    
>   
> 
>   {quote}                                                                     
>                           
>   
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Issue Comment Deleted] (YARN-10767) Yarn Logs Command retrying on Standby RM for 30 times

2021-05-26 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10767:
-
Comment: was deleted

(was: Hi [~dmmkr]

Thanks for the patch. I have one minor comment:
 * RMHAUtils.findActiveRMHAId can return null if none of the RMs are active. I 
think we should have a null check here.

[~jbrennan] can you please take a look at this issue?)

> Yarn Logs Command retrying on Standby RM for 30 times
> -
>
> Key: YARN-10767
> URL: https://issues.apache.org/jira/browse/YARN-10767
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10767.001.patch
>
>
> When ResourceManager HA is enabled and the first RM is unavailable, 
> executing "bin/yarn logs -applicationId  -am 1" raises a 
> ConnectionException while connecting to the first RM. The ConnectionException 
> occurs 30 times before the client tries to connect to the second RM.
>  
> This can be optimized by trying to fetch the logs from the Active RM.






[jira] [Commented] (YARN-10767) Yarn Logs Command retrying on Standby RM for 30 times

2021-05-26 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351660#comment-17351660
 ] 

Bilwa S T commented on YARN-10767:
--

Hi [~dmmkr]

Thanks for the patch. I have one minor comment:
 * RMHAUtils.findActiveRMHAId can return null if none of the RMs are active. I 
think we should have a null check here.

[~jbrennan] can you please take a look at this issue?
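
For illustration, the null check suggested above could look roughly like the sketch below. It is self-contained and does not use the Hadoop classes: findActiveRmId is a hypothetical stand-in for RMHAUtils.findActiveRMHAId, and the map of RM states and the orderRmIds helper are assumptions made only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ActiveRmLookup {
    // Hypothetical stand-in for RMHAUtils.findActiveRMHAId: returns the id of
    // the RM reported as "active", or null when no RM is active.
    static String findActiveRmId(Map<String, String> rmStates) {
        for (Map.Entry<String, String> entry : rmStates.entrySet()) {
            if ("active".equals(entry.getValue())) {
                return entry.getKey();
            }
        }
        return null;
    }

    // Returns the RM ids in the order in which they should be tried.
    static List<String> orderRmIds(List<String> configured,
                                   Map<String, String> rmStates) {
        String active = findActiveRmId(rmStates);
        if (active == null) {
            // No active RM found: keep the configured order instead of
            // dereferencing a null id.
            return new ArrayList<>(configured);
        }
        List<String> ordered = new ArrayList<>();
        ordered.add(active);
        for (String id : configured) {
            if (!id.equals(active)) {
                ordered.add(id);
            }
        }
        return ordered;
    }
}
```

With the check in place, a cluster where every RM is in standby falls back to the configured order rather than failing on a null id.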

> Yarn Logs Command retrying on Standby RM for 30 times
> -
>
> Key: YARN-10767
> URL: https://issues.apache.org/jira/browse/YARN-10767
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
> Attachments: YARN-10767.001.patch
>
>
> When ResourceManager HA is enabled and the first RM is unavailable, 
> executing "bin/yarn logs -applicationId  -am 1" raises a 
> ConnectionException while connecting to the first RM. The ConnectionException 
> occurs 30 times before the client tries to connect to the second RM.
>  
> This can be optimized by trying to fetch the logs from the Active RM.






[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347692#comment-17347692
 ] 

Bilwa S T commented on YARN-9606:
-

[~pbacsko] Can you please backport this too? Because it depended on 
YARN-10120, we were not able to backport it earlier. Thank you.

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers    
>   
> 
>   {quote}                                                                     
>                           
>   
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347653#comment-17347653
 ] 

Bilwa S T commented on YARN-10725:
--

There is no major change. You can keep the commit message the same as in trunk, 
[~pbacsko]. Thank you.

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch, 
> YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, 
> YARN-10725-branch-3.3.v3.patch, YARN-10725-branch-3.3.v4.patch, 
> YARN-10725-branch-3.3.v5.patch, image-2021-04-05-16-48-57-034.png, 
> image-2021-04-05-16-50-55-238.png
>
>







[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347627#comment-17347627
 ] 

Bilwa S T commented on YARN-10725:
--

Yes [~pbacsko], but there is one whitespace issue that needs to be fixed. Shall 
I upload a new patch fixing it?

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch, 
> YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, 
> YARN-10725-branch-3.3.v3.patch, YARN-10725-branch-3.3.v4.patch, 
> YARN-10725-branch-3.3.v5.patch, image-2021-04-05-16-48-57-034.png, 
> image-2021-04-05-16-50-55-238.png
>
>







[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347338#comment-17347338
 ] 

Bilwa S T commented on YARN-10725:
--

Hi [~brahmareddy] [~pbacsko]

Can you please check the latest patch? I think we can ignore the checkstyle 
issues. I will fix the whitespace issue.

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch, 
> YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, 
> YARN-10725-branch-3.3.v3.patch, YARN-10725-branch-3.3.v4.patch, 
> YARN-10725-branch-3.3.v5.patch, image-2021-04-05-16-48-57-034.png, 
> image-2021-04-05-16-50-55-238.png
>
>







[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-05-18 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10725:
-
Attachment: YARN-10725-branch-3.3.v5.patch

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch, 
> YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, 
> YARN-10725-branch-3.3.v3.patch, YARN-10725-branch-3.3.v4.patch, 
> YARN-10725-branch-3.3.v5.patch, image-2021-04-05-16-48-57-034.png, 
> image-2021-04-05-16-50-55-238.png
>
>







[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager

2021-05-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346577#comment-17346577
 ] 

Bilwa S T commented on YARN-10258:
--

Thanks [~pbacsko] for committing this. Please backport it to 3.3.1 as well.

> Add metrics for 'ApplicationsRunning' in NodeManager
> 
>
> Key: YARN-10258
> URL: https://issues.apache.org/jira/browse/YARN-10258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.3
>Reporter: ANANDA G B
>Assignee: ANANDA G B
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-10258-001.patch, YARN-10258-002.patch, 
> YARN-10258-003.patch, YARN-10258-005.patch, YARN-10258-006.patch, 
> YARN-10258-007.patch, YARN-10258-008.patch, YARN-10258-009.patch, 
> YARN-10258-010.patch, YARN-10258_004.patch
>
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.






[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager

2021-05-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346112#comment-17346112
 ] 

Bilwa S T commented on YARN-10258:
--

Thanks [~gb.ana...@gmail.com] for the patch. Looks good to me. [~pbacsko] can you 
please commit this?

> Add metrics for 'ApplicationsRunning' in NodeManager
> 
>
> Key: YARN-10258
> URL: https://issues.apache.org/jira/browse/YARN-10258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.3
>Reporter: ANANDA G B
>Assignee: ANANDA G B
>Priority: Minor
> Attachments: YARN-10258-001.patch, YARN-10258-002.patch, 
> YARN-10258-003.patch, YARN-10258-005.patch, YARN-10258-006.patch, 
> YARN-10258-007.patch, YARN-10258-008.patch, YARN-10258-009.patch, 
> YARN-10258-010.patch, YARN-10258_004.patch
>
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.






[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-05-17 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10725:
-
Attachment: YARN-10725-branch-3.3.v4.patch

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch, 
> YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, 
> YARN-10725-branch-3.3.v3.patch, YARN-10725-branch-3.3.v4.patch, 
> image-2021-04-05-16-48-57-034.png, image-2021-04-05-16-50-55-238.png
>
>







[jira] [Assigned] (YARN-10755) Multithreaded loading Apps from zk statestore

2021-05-06 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10755:


Assignee: Bilwa S T

> Multithreaded loading Apps from zk statestore
> -
>
> Key: YARN-10755
> URL: https://issues.apache.org/jira/browse/YARN-10755
> Project: Hadoop YARN
>  Issue Type: Improvement
> Environment: version: hadoop-2.8.5
>Reporter: chaosju
>Assignee: Bilwa S T
>Priority: Major
> Attachments: image-2021-04-27-12-55-18-710.png
>
>
> In the RM, we could get the list of applications to be read from the state store and 
> then divide the work of reading the data associated with each app among multiple 
> threads.
> I think this is important for large clusters.
> h2. Profile
> Profile by  TestZKRMStateStorePerf 
> Params: -appSize 2 -appattemptsize 2 -hostPort localhost:2181 
> Profile Result: loadRMAppState stage cost is 5s.
> Profile logs:
> !image-2021-04-27-12-55-18-710.png!  
>  
>  
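
A minimal sketch of the idea described above, assuming the per-application read can be expressed as an independent function. It is not the RM state store code: ParallelAppLoader, loadAll, and the loadOne callback are hypothetical names, and a real implementation would read from ZooKeeper where this sketch takes an arbitrary function.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelAppLoader {
    // Loads the state of every application id using a fixed pool of threads.
    // The result keeps the order of appIds.
    static <T> Map<String, T> loadAll(List<String> appIds,
                                      Function<String, T> loadOne,
                                      int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per application; the pool spreads them over
            // the worker threads.
            List<Future<T>> futures = new ArrayList<>();
            for (String appId : appIds) {
                Callable<T> task = () -> loadOne.apply(appId);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            Map<String, T> result = new LinkedHashMap<>();
            for (int i = 0; i < appIds.size(); i++) {
                result.put(appIds.get(i), futures.get(i).get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading apps", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Loading an app failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures in submission order keeps the result deterministic even though the individual reads may finish in any order.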






[jira] [Commented] (YARN-9615) Add dispatcher metrics to RM

2021-05-06 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340181#comment-17340181
 ] 

Bilwa S T commented on YARN-9615:
-

[~pbacsko] No problem. I just want this to be merged before the 3.3.1 release is 
done. Thanks

> Add dispatcher metrics to RM
> 
>
> Key: YARN-9615
> URL: https://issues.apache.org/jira/browse/YARN-9615
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Qi Zhu
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9615.001.patch, YARN-9615.002.patch, 
> YARN-9615.003.patch, YARN-9615.004.patch, YARN-9615.005.patch, 
> YARN-9615.006.patch, YARN-9615.007.patch, YARN-9615.008.patch, 
> YARN-9615.009.patch, YARN-9615.010.patch, YARN-9615.011.patch, 
> YARN-9615.011.patch, YARN-9615.poc.patch, image-2021-03-04-10-35-10-626.png, 
> image-2021-03-04-10-36-12-441.png, screenshot-1.png
>
>
> It'd be good to have counts/processing times for each event type in RM async 
> dispatcher and scheduler async dispatcher.






[jira] [Commented] (YARN-10642) Race condition: AsyncDispatcher can get stuck by the changes introduced in YARN-8995

2021-05-06 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340172#comment-17340172
 ] 

Bilwa S T commented on YARN-10642:
--

Hi [~pbacsko]

YARN-8995 is merged to branch-3.1, so we need to backport this fix to branch-3.1 as 
well.

> Race condition: AsyncDispatcher can get stuck by the changes introduced in 
> YARN-8995
> 
>
> Key: YARN-10642
> URL: https://issues.apache.org/jira/browse/YARN-10642
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 3.2.1
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Critical
> Fix For: 3.4.0, 3.3.1, 3.2.3
>
> Attachments: MockForDeadLoop.java, YARN-10642-branch-3.2.001.patch, 
> YARN-10642-branch-3.2.002.patch, YARN-10642-branch-3.3.001.patch, 
> YARN-10642.001.patch, YARN-10642.002.patch, YARN-10642.003.patch, 
> YARN-10642.004.patch, YARN-10642.005.patch, deadloop.png, debugfornode.png, 
> put.png, take.png
>
>
> In our cluster, the ResourceManager got stuck twice within twenty days, and the YARN 
> client could not submit applications. I captured the jstack output the second time 
> and found the reason.
> Analyzing the jstack output, I found many threads stuck because they could not 
> acquire LinkedBlockingQueue.putLock. (Note: the full analysis is omitted for lack 
> of space.)
> The reason is that one thread holds the putLock indefinitely: 
> printEventQueueDetails calls forEachRemaining, which holds putLock and 
> takeLock, so the AsyncDispatcher gets stuck.
> {code}
> Thread 6526 (IPC Server handler 454 on default port 8030):
>   State: RUNNABLE
>   Blocked count: 29988
>   Waited count: 2035029
>   Stack:
> 
> java.util.concurrent.LinkedBlockingQueue$LBQSpliterator.forEachRemaining(LinkedBlockingQueue.java:926)
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.printEventQueueDetails(AsyncDispatcher.java:270)
> 
> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:295)
> 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.handleProgress(DefaultAMSProcessor.java:408)
> 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:215)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
> 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
> 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:432)
> 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
> 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
> 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
> org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1040)
> org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:958)
> java.security.AccessController.doPrivileged(Native Method)
> {code}
> I analyzed LinkedBlockingQueue's source code and found that forEachRemaining in 
> LinkedBlockingQueue.LBQSpliterator can get stuck when forEachRemaining and take 
> are called in different threads. 
> YARN-8995 introduced the printEventQueueDetails method, whose 
> "eventQueue.stream().collect" call invokes forEachRemaining.
> Why does this happen? "put.png" shows how put("a") works and "take.png" shows 
> how take() works. Special note: a removed Node points to itself to help GC.
> The key code is in forEachRemaining: LBQSpliterator uses 
> forEachRemaining to visit every Node, but it releases the lock after reading 
> the item value from a Node. If take() is called at that moment, 
> the variable 'p' in forEachRemaining may point to a Node that points to itself, 
> and forEachRemaining then loops forever. You can see this in "deadloop.png".
> Consider a simple unit test: if forEachRemaining is made to run more slowly than 
> take, the problem reproduces. The unit test is MockForDeadLoop.java.
> I debugged MockForDeadLoop.java and saw a Node pointing to itself. You can see pic 
> 
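
As a rough illustration of the kind of summary printEventQueueDetails produces, the sketch below counts queued events per type by walking the queue with its iterator instead of going through stream(). It is not necessarily the change that was committed for this issue. The EventType enum and the QueueSummary class are hypothetical, and whether iterating this way avoids the hang on a given JDK is an assumption that would need to be verified.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueSummary {
    // Hypothetical event types, standing in for the dispatcher's real ones.
    enum EventType { NODE_UPDATE, APP_ADDED, CONTAINER_EXPIRED }

    // Counts the queued events per type without removing them. The for-each
    // loop uses the queue's iterator rather than its spliterator.
    static Map<EventType, Long> countByType(BlockingQueue<EventType> queue) {
        Map<EventType, Long> counts = new EnumMap<>(EventType.class);
        for (EventType type : queue) {
            counts.merge(type, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        BlockingQueue<EventType> queue = new LinkedBlockingQueue<>();
        queue.add(EventType.NODE_UPDATE);
        queue.add(EventType.APP_ADDED);
        queue.add(EventType.NODE_UPDATE);
        System.out.println(countByType(queue));
    }
}
```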

[jira] [Commented] (YARN-9615) Add dispatcher metrics to RM

2021-05-06 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17340015#comment-17340015
 ] 

Bilwa S T commented on YARN-9615:
-

[~pbacsko] can you please backport it to branch-3.3 ?

> Add dispatcher metrics to RM
> 
>
> Key: YARN-9615
> URL: https://issues.apache.org/jira/browse/YARN-9615
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Qi Zhu
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9615.001.patch, YARN-9615.002.patch, 
> YARN-9615.003.patch, YARN-9615.004.patch, YARN-9615.005.patch, 
> YARN-9615.006.patch, YARN-9615.007.patch, YARN-9615.008.patch, 
> YARN-9615.009.patch, YARN-9615.010.patch, YARN-9615.011.patch, 
> YARN-9615.011.patch, YARN-9615.poc.patch, image-2021-03-04-10-35-10-626.png, 
> image-2021-03-04-10-36-12-441.png, screenshot-1.png
>
>
> It'd be good to have counts/processing times for each event type in RM async 
> dispatcher and scheduler async dispatcher.






[jira] [Commented] (YARN-10745) Change Log level from info to debug for few logs and remove unnecessary debuglog checks

2021-05-05 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17339557#comment-17339557
 ] 

Bilwa S T commented on YARN-10745:
--

+1 (Non-binding) on YARN-10745.004.patch

[~brahmareddy] [~ebadger] can you please help review and commit?

> Change Log level from info to debug for few logs and remove unnecessary 
> debuglog checks
> ---
>
> Key: YARN-10745
> URL: https://issues.apache.org/jira/browse/YARN-10745
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Minor
> Attachments: YARN-10745.001.patch, YARN-10745.002.patch, 
> YARN-10745.003.patch, YARN-10745.004.patch
>
>
> Change the log level from info to debug for a few logs so that the load on the 
> logger decreases in large clusters and performance improves.
> Remove the unnecessary isDebugEnabled() checks around log calls that print strings 
> without any string concatenation.






[jira] [Commented] (YARN-10745) Change Log level from info to debug for few logs and remove unnecessary debuglog checks

2021-04-29 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335401#comment-17335401
 ] 

Bilwa S T commented on YARN-10745:
--

Hi [~dmmkr]

Thanks for the patch. I have a few minor comments:
 * In ProportionalCapacityPreemptionPolicy.java, the LOG.isDebugEnabled() check can 
be removed for the log below

          
{quote}   LOG.debug("Send to scheduler: in app={} " +
                       "#containers-to-be-preemptionCandidates={}", 
appAttemptId,
                     e.getValue().size());
          
{quote}
 
 * Why do we need the LOG.isDebugEnabled() check in AsyncDispatcher.java?

A few suggestions:
 * In NodesListManager.java, we can print the log below only if at least one of the 
sets is not empty

{quote}               LOG.info("hostsReader include:\{" +StringUtils.join(",", 
hostsReader.getHosts()) +"} exclude:{" +
               StringUtils.join(",", hostsReader.getExcludedHosts()) + "}");
{quote}
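
To show why the isDebugEnabled() guard is redundant around a call that uses "{}" placeholders, here is a small self-contained sketch. It does not use SLF4J: MiniLogger is a hypothetical miniature logger written only for this example, assumed to behave like a placeholder-style logger in that it checks the level before formatting. Counted is a hypothetical argument type that records how often toString() runs.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyDebugLog {
    // Hypothetical miniature logger that mimics "{}" placeholder formatting.
    static final class MiniLogger {
        private final boolean debugEnabled;
        final StringBuilder out = new StringBuilder();

        MiniLogger(boolean debugEnabled) {
            this.debugEnabled = debugEnabled;
        }

        void debug(String format, Object... args) {
            if (!debugEnabled) {
                return; // arguments are never converted to strings
            }
            StringBuilder sb = new StringBuilder();
            int from = 0;
            for (Object arg : args) {
                int at = format.indexOf("{}", from);
                if (at < 0) {
                    break;
                }
                // Copy the text before the placeholder, then the argument.
                sb.append(format, from, at).append(arg);
                from = at + 2;
            }
            sb.append(format, from, format.length());
            out.append(sb).append('\n');
        }
    }

    // Argument whose toString() calls are counted, to show when formatting runs.
    static final class Counted {
        static final AtomicInteger CALLS = new AtomicInteger();

        @Override
        public String toString() {
            CALLS.incrementAndGet();
            return "appattempt_1";
        }
    }

    public static void main(String[] args) {
        MiniLogger log = new MiniLogger(false);
        // No guard needed: with debug disabled, toString() is never invoked.
        log.debug("Send to scheduler: in app={} #containers={}", new Counted(), 3);
        System.out.println("toString calls: " + Counted.CALLS.get());
    }
}
```

The guard still pays off when building an argument is itself expensive, for example a StringUtils.join over a large set, because arguments are evaluated before the call.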
            

> Change Log level from info to debug for few logs and remove unnecessary 
> debuglog checks
> ---
>
> Key: YARN-10745
> URL: https://issues.apache.org/jira/browse/YARN-10745
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Minor
> Attachments: YARN-10745.001.patch
>
>
> Change the log level from info to debug for a few logs so that the load on the 
> logger decreases in large clusters and performance improves.
> Remove the unnecessary isDebugEnabled() checks around log calls that print strings 
> without any string concatenation.






[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for

2021-04-26 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10670:
-
Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM yarn-site.xml ::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]yarn-site.xml ::: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC -promote_opportunistic_after_start

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Application Failure: desired = 20, completed = 20, allocated = 20, failed = 1, 
diagnostics = [2021-02-09 22:11:48.440]Container killed to make room for 
Guaranateed container.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.

  was:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM yarn-site.xml ::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]yarn-site.xml ::: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC -promote_opportunistic_after_start

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.


> YARN: Opportunistic Container : : In distributed shell job if containers are 
> killed then application is failed. But in this case as containers are killed 
> to make room for guaranteed containers which is not correct to fail an 
> application
> 
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
>  # Set the below parameters  in RM yarn-site.xml ::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]yarn-site.xml ::: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : : yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC -promote_opportunistic_after_start
> Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics 
> message
> {noformat}
> Application Failure: desired = 20, completed = 20, allocated = 20, failed = 
> 1, diagnostics = [2021-02-09 22:11:48.440]Container killed to make room for 
> Guaranateed container.
> {noformat}
>  Expected Result: Distributed Shell Yarn Job should not fail.
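The expectation above (a container killed to make room for a guaranteed container should not fail the job) can be sketched as a failure count that skips such containers. This is only an illustration under stated assumptions: the `Completion` enum and `countFailures` method are hypothetical names, and the sketch does not reflect how the distributed shell ApplicationMaster or YARN exit statuses are actually implemented.

```java
import java.util.Arrays;
import java.util.List;

final class ContainerFailureCount {
  /** Hypothetical completion reasons; real YARN exit statuses are not modeled here. */
  enum Completion { SUCCEEDED, FAILED, KILLED_FOR_GUARANTEED }

  private ContainerFailureCount() {}

  /** Counts genuine failures only; containers killed to make room for guaranteed ones are ignored. */
  static int countFailures(List<Completion> completed) {
    int failures = 0;
    for (Completion c : completed) {
      if (c == Completion.FAILED) {
        failures++;
      }
    }
    return failures;
  }

  public static void main(String[] args) {
    List<Completion> completed = Arrays.asList(
        Completion.SUCCEEDED, Completion.KILLED_FOR_GUARANTEED, Completion.SUCCEEDED);
    System.out.println(countFailures(completed)); // prints 0
  }
}
```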



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room for

2021-04-26 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10670:
-
Description: 
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM yarn-site.xml ::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]yarn-site.xml ::: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC -promote_opportunistic_after_start

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.

  was:
Preconditions:
 # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
 # Set the below parameters  in RM yarn-site.xml ::
 yarn.resourcemanager.opportunistic-container-allocation.enabled
 true
 
 # Set this in NM[s]yarn-site.xml ::: 
 yarn.nodemanager.opportunistic-containers-max-queue-length
 30
 

 
 Test Steps:

Job Command : : yarn 
org.apache.hadoop.yarn.applications.distributedshell.Client jar 
HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
 -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
OPPORTUNISTIC

Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics message
{noformat}
Attempt recovered after RM restartApplication Failure: desired = 20, completed 
= 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
22:11:48.440]Container De-queued to meet NM queuing limits.
[2021-02-09 22:11:48.441]Container terminated before launch.
{noformat}
 Expected Result: Distributed Shell Yarn Job should not fail.


> YARN: Opportunistic Container : : In distributed shell job if containers are 
> killed then application is failed. But in this case as containers are killed 
> to make room for guaranteed containers which is not correct to fail an 
> application
> 
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
>
> Preconditions:
>  # Secure Hadoop 3.1.1 - 3 Nodes cluster is installed
>  # Set the below parameters  in RM yarn-site.xml ::
>  yarn.resourcemanager.opportunistic-container-allocation.enabled
>  true
>  
>  # Set this in NM[s]yarn-site.xml ::: 
>  yarn.nodemanager.opportunistic-containers-max-queue-length
>  30
>  
>  
>  Test Steps:
> Job Command : : yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1*.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC -promote_opportunistic_after_start
> Actual Result: Distributed Shell Yarn Job Failed with below Diagnostics 
> message
> {noformat}
> Attempt recovered after RM restartApplication Failure: desired = 20, 
> completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
> 22:11:48.440]Container De-queued to meet NM queuing limits.
> [2021-02-09 22:11:48.441]Container terminated before launch.
> {noformat}
>  Expected Result: Distributed Shell Yarn Job should not fail.






[jira] [Commented] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types

2021-04-26 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332275#comment-17332275
 ] 

Bilwa S T commented on YARN-10691:
--

cc [~epayne] [~jbrennan]

> DominantResourceCalculator isInvalidDivisor should consider only countable 
> resource types
> -
>
> Key: YARN-10691
> URL: https://issues.apache.org/jira/browse/YARN-10691
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10691.001.patch
>
>
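The idea in the issue title can be sketched as a divisor check that looks only at countable resource types. This is a simplified model, not Hadoop's code: the `ResourceEntry` class and its `countable` flag are hypothetical, and the sketch assumes a divisor is considered invalid when a resource type it inspects has the value zero.

```java
import java.util.Arrays;
import java.util.List;

final class DivisorCheckSketch {
  /** Hypothetical stand-in for one resource type of a resource vector. */
  static final class ResourceEntry {
    final String name;
    final long value;
    final boolean countable;

    ResourceEntry(String name, long value, boolean countable) {
      this.name = name;
      this.value = value;
      this.countable = countable;
    }
  }

  private DivisorCheckSketch() {}

  /** Returns true if any countable resource type is zero; non-countable types are skipped. */
  static boolean isInvalidDivisor(List<ResourceEntry> divisor) {
    for (ResourceEntry entry : divisor) {
      if (entry.countable && entry.value == 0L) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<ResourceEntry> divisor = Arrays.asList(
        new ResourceEntry("memory-mb", 4096L, true),
        new ResourceEntry("vcores", 8L, true),
        new ResourceEntry("example-flag", 0L, false)); // non-countable, so ignored
    System.out.println(isInvalidDivisor(divisor)); // prints false
  }
}
```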







[jira] [Comment Edited] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330908#comment-17330908
 ] 

Bilwa S T edited comment on YARN-10732 at 4/23/21, 4:51 PM:


[~gandras] [~pbacsko] The transition from DRAINING to RUNNING state was added as 
part of YARN-10260 so that a user who stopped a queue by mistake can start it 
again. With your patch, a queue in DRAINING state cannot be transitioned back to 
RUNNING state. Can you please explain your use case in detail? 


was (Author: bilwast):
[~gandras] [~pbacsko] The transition from DRAINING to RUNNING state was added as 
part of YARN-10260 so that a user who stopped a queue by mistake can start it 
again. With your patch, a queue in DRAINING state cannot be transitioned back to 
RUNNING state. 

> Disallow restarting a queue while it is in DRAINING state on CS 
> reinitialization
> 
>
> Key: YARN-10732
> URL: https://issues.apache.org/jira/browse/YARN-10732
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Andras Gyori
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-10732.001.patch
>
>
> CSConfigValidator#validateQueueHierarchy does not check the case where the old 
> queue is in DRAINING state but the new queue state is RUNNING. The user should 
> wait until the queue is fully stopped.
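The rule proposed in the description can be sketched as a small state check. This is a minimal illustration, not the actual CSConfigValidator logic: the `QueueState` enum and `isAllowed` method are hypothetical names, and the sketch models only the single DRAINING-to-RUNNING case that the description and the comment above discuss.

```java
final class QueueStateCheck {
  /** Hypothetical queue states used only for this sketch. */
  enum QueueState { RUNNING, DRAINING, STOPPED }

  private QueueStateCheck() {}

  /**
   * Returns false when a queue that is still DRAINING would be set back to RUNNING
   * on reinitialization; every other combination is accepted by this sketch.
   */
  static boolean isAllowed(QueueState oldState, QueueState newState) {
    return !(oldState == QueueState.DRAINING && newState == QueueState.RUNNING);
  }

  public static void main(String[] args) {
    System.out.println(isAllowed(QueueState.DRAINING, QueueState.RUNNING)); // prints false
    System.out.println(isAllowed(QueueState.STOPPED, QueueState.RUNNING));  // prints true
  }
}
```

Under this rule a queue stopped by mistake has to finish draining before it can be started again, which is the trade-off the comment asks about.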






[jira] [Commented] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330908#comment-17330908
 ] 

Bilwa S T commented on YARN-10732:
--

[~gandras] [~pbacsko] The transition from DRAINING to RUNNING state was added as 
part of YARN-10260 so that a user who stopped a queue by mistake can start it 
again. With your patch, a queue in DRAINING state cannot be transitioned back to 
RUNNING state. 

> Disallow restarting a queue while it is in DRAINING state on CS 
> reinitialization
> 
>
> Key: YARN-10732
> URL: https://issues.apache.org/jira/browse/YARN-10732
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Andras Gyori
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-10732.001.patch
>
>
> CSConfigValidator#validateQueueHierarchy does not check the case where the old 
> queue is in DRAINING state but the new queue state is RUNNING. The user should 
> wait until the queue is fully stopped.






[jira] [Updated] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types

2021-04-23 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10691:
-
Attachment: YARN-10691.001.patch

> DominantResourceCalculator isInvalidDivisor should consider only countable 
> resource types
> -
>
> Key: YARN-10691
> URL: https://issues.apache.org/jira/browse/YARN-10691
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10691.001.patch
>
>







[jira] [Commented] (YARN-10469) The accuracy of the percentage values in the same chart on the YARN 'Cluster OverView' page are inconsistent

2021-04-20 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325828#comment-17325828
 ] 

Bilwa S T commented on YARN-10469:
--

Hi [~tangzhankun]

The PR for this Jira is merged. Can we resolve it?

> The accuracy of the percentage values in the same chart on the YARN 'Cluster 
> OverView' page are inconsistent
> 
>
> Key: YARN-10469
> URL: https://issues.apache.org/jira/browse/YARN-10469
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-ui-v2
>Affects Versions: 3.1.1
>Reporter: akiyamaneko
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: reproduce.png
>
>
> The accuracy of the percentage values in the same chart on the YARN 'Cluster 
> OverView' page is inconsistent, as shown in the attached screenshot.






[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-04-14 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10725:
-
Attachment: YARN-10725-branch-3.3.v2.patch

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch, 
> YARN-10725-branch-3.3.patch, YARN-10725-branch-3.3.v2.patch, 
> image-2021-04-05-16-48-57-034.png, image-2021-04-05-16-50-55-238.png
>
>







[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-04-05 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10725:
-
Attachment: YARN-10725-branch-3.3.patch

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch, YARN-10725-branch-3.3.patch
>
>







[jira] [Commented] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-04-04 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314662#comment-17314662
 ] 

Bilwa S T commented on YARN-10725:
--

Hi [~brahmareddy] 

As discussed, I have attached a patch to backport this to branch-3.3. Please 
check. Thanks

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch
>
>







[jira] [Updated] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-04-04 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10725:
-
Attachment: YARN-10120-branch-3.3.patch

> Backport YARN-10120 to branch-3.3
> -
>
> Key: YARN-10725
> URL: https://issues.apache.org/jira/browse/YARN-10725
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10120-branch-3.3.patch
>
>







[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled

2021-03-31 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312088#comment-17312088
 ] 

Bilwa S T commented on YARN-10120:
--

[~brahmareddy] I have raised YARN-10725 to backport this to branch-3.3.

> In Federation Router Nodes/Applications/About pages throws 500 exception when 
> https is enabled
> --
>
> Key: YARN-10120
> URL: https://issues.apache.org/jira/browse/YARN-10120
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Critical
> Fix For: 3.4.0
>
> Attachments: YARN-10120-YARN-7402.patch, 
> YARN-10120-YARN-7402.v2.patch, YARN-10120-addendum-01.patch, 
> YARN-10120-branch-3.3.patch, YARN-10120-branch-3.3.v2.patch, 
> YARN-10120.001.patch, YARN-10120.002.patch
>
>
> In Federation Router, the Nodes/Applications/About pages throw a 500 exception 
> when HTTPS is enabled.
> yarn.router.webapp.https.address =router ip:8091
> {noformat}
> 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
> handling URI: /cluster/apps
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at 
> com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> 

[jira] [Created] (YARN-10725) Backport YARN-10120 to branch-3.3

2021-03-31 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10725:


 Summary: Backport YARN-10120 to branch-3.3
 Key: YARN-10725
 URL: https://issues.apache.org/jira/browse/YARN-10725
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T









[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-03-26 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17309378#comment-17309378
 ] 

Bilwa S T commented on YARN-9606:
-

[~brahmareddy] This can be backported once YARN-10120 is merged to branch-3.3

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers
> {quote}
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-23 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17306824#comment-17306824
 ] 

Bilwa S T commented on YARN-10697:
--

[~Jim_Brennan] I have changed the method name. Please check the updated patch. Thanks

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, YARN-10697.002.patch, 
> YARN-10697.003.patch, image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes it in bytes. Also, we should display memory in GB for better 
> readability.






[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-23 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10697:
-
Attachment: YARN-10697.003.patch

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, YARN-10697.002.patch, 
> YARN-10697.003.patch, image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes it in bytes. Also, we should display memory in GB for better 
> readability.






[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-20 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305380#comment-17305380
 ] 

Bilwa S T commented on YARN-10697:
--

Hi [~Jim_Brennan]

I have attached the .002 patch with the latest changes. Please review. Thanks

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, YARN-10697.002.patch, 
> image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes it in bytes. Also, we should display memory in GB for better 
> readability.






[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-20 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10697:
-
Attachment: YARN-10697.002.patch

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, YARN-10697.002.patch, 
> image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes it in bytes. Also, we should display memory in GB for better 
> readability.






[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-20 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10697:
-
Attachment: (was: YARN-10697.002.patch)

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes it in bytes. Also, we should display memory in GB for better 
> readability.






[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-20 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10697:
-
Attachment: YARN-10697.002.patch

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, YARN-10697.002.patch, 
> image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes it in bytes. Also, we should display memory in GB for better 
> readability.






[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304629#comment-17304629
 ] 

Bilwa S T commented on YARN-10697:
--

Thanks [~Jim_Brennan] [~jhung] for your comments.

I added the changes in Resource#toString so that the output is easier for the 
user to read. I agree it is not correct to add them there, as the method is 
called from many other places. Can we instead introduce a new method in 
Resource.java that prints memory in MB|GB|TB?
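The separate formatting method proposed above could look roughly like the following. This is a sketch under stated assumptions, not the code from any attached patch: the class name `MemoryUnitFormatter`, the method `format`, and the two-decimal output are all hypothetical choices, and the input is assumed to be a memory size in MB.

```java
import java.util.Locale;

/** Hypothetical helper; not part of Hadoop's Resource class. */
final class MemoryUnitFormatter {
  private static final String[] UNITS = {"MB", "GB", "TB"};

  private MemoryUnitFormatter() {}

  /** Formats a memory size given in MB, scaling by 1024 up to TB at most. */
  static String format(long memoryMb) {
    double value = memoryMb;
    int unit = 0;
    while (value >= 1024 && unit < UNITS.length - 1) {
      value /= 1024;
      unit++;
    }
    return String.format(Locale.ROOT, "%.2f %s", value, UNITS[unit]);
  }

  public static void main(String[] args) {
    System.out.println(format(512));   // prints 512.00 MB
    System.out.println(format(1536));  // prints 1.50 GB
  }
}
```

Keeping the formatting in its own method leaves Resource#toString unchanged for its other callers, which is the concern raised in the review comments.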

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes it in bytes. Also, we should display memory in GB for better 
> readability.






[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303123#comment-17303123
 ] 

Bilwa S T commented on YARN-10697:
--

[~epayne]  [~jbrennan] can you please take a look at this?

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes it in bytes. Also, we should display memory in GB for better 
> readability.






[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-17 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10697:
-
Attachment: YARN-10697.001.patch

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes resources in bytes. We should also display memory in GB for better 
> readability.






[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303120#comment-17303120
 ] 

Bilwa S T commented on YARN-10697:
--

YARN-10251 removed the multiplication by BYTES_IN_MB in the if branch, but 
missed it in the else branch. 

!image-2021-03-17-11-30-57-216.png!

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes resources in bytes. We should also display memory in GB for better 
> readability.






[jira] [Updated] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-17 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10697:
-
Attachment: image-2021-03-17-11-30-57-216.png

> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes resources in bytes. We should also display memory in GB for better 
> readability.






[jira] [Created] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-16 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10697:


 Summary: Resources are displayed in bytes in UI for schedulers 
other than capacity
 Key: YARN-10697
 URL: https://issues.apache.org/jira/browse/YARN-10697
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T


Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
passes resources in bytes. We should also display memory in GB for better 
readability.
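
The unit mismatch described above can be illustrated with a minimal sketch. 
The class and method names below (MemoryUnits, bytesToMb, mbToGbString) are 
hypothetical helpers written for this illustration and are not Hadoop APIs; 
only the BYTES_IN_MB constant name echoes the one mentioned in this thread. 
The sketch assumes binary units (1 MB = 1024 * 1024 bytes, 1 GB = 1024 MB).

```java
import java.util.Locale;

/** Illustrative unit helpers; all names here are hypothetical, not Hadoop APIs. */
final class MemoryUnits {
  // Assumption: binary units, matching the BYTES_IN_MB name used in the thread.
  static final long BYTES_IN_MB = 1024L * 1024L;
  static final long MB_IN_GB = 1024L;

  private MemoryUnits() { }

  /** Converts a byte count to whole megabytes, rounding down. */
  static long bytesToMb(long bytes) {
    return bytes / BYTES_IN_MB;
  }

  /** Formats a megabyte count as gigabytes with two decimals for display. */
  static String mbToGbString(long mb) {
    return String.format(Locale.ROOT, "%.2f GB", (double) mb / MB_IN_GB);
  }

  public static void main(String[] args) {
    long totalBytes = 8L * 1024L * BYTES_IN_MB; // 8 GB expressed in bytes
    long totalMb = bytesToMb(totalBytes);       // value an MB-based API expects
    System.out.println(totalMb + " MB = " + mbToGbString(totalMb));
    // prints "8192 MB = 8.00 GB"
  }
}
```

Passing totalBytes where totalMb is expected would inflate the displayed 
value by a factor of BYTES_IN_MB, which is the symptom reported here.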






[jira] [Updated] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types

2021-03-15 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10691:
-
Summary: DominantResourceCalculator isInvalidDivisor should consider only 
countable resource types  (was: DominantResourceCalculator divide and ratio 
methods should consider only countable resource types)

> DominantResourceCalculator isInvalidDivisor should consider only countable 
> resource types
> -
>
> Key: YARN-10691
> URL: https://issues.apache.org/jira/browse/YARN-10691
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>







[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-03-15 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301434#comment-17301434
 ] 

Bilwa S T commented on YARN-10588:
--

Thanks [~Jim_Brennan] and [~epayne] for the review comments.

I have raised YARN-10691 to handle the above issue. I think this one can be merged.

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch, YARN-10588.004.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.
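
The failure mode described in the issue can be sketched with a simplified 
model. This is not the DominantResourceCalculator implementation: the class 
DivisorCheckSketch, its methods, and the map-based resource representation 
are assumptions made for illustration. It only shows why a check that treats 
the divisor as invalid whenever any resource type is zero skips the 
percentage calculation on a cluster with no GPUs, and how restricting the 
check to a chosen set of types changes the outcome. Which types belong in 
that set is the subject of YARN-10691 and is not decided here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Simplified model of a divisor check; hypothetical, not Hadoop code. */
final class DivisorCheckSketch {
  private DivisorCheckSketch() { }

  /** Returns true if any resource type in the map has a zero value. */
  static boolean anyZero(Map<String, Long> resources) {
    for (long value : resources.values()) {
      if (value == 0L) {
        return true;
      }
    }
    return false;
  }

  /** Same check, restricted to the resource types named in typesToCheck. */
  static boolean anyZeroAmong(Map<String, Long> resources,
      Set<String> typesToCheck) {
    for (Map.Entry<String, Long> entry : resources.entrySet()) {
      if (typesToCheck.contains(entry.getKey()) && entry.getValue() == 0L) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // A cluster with memory and vcores but no GPUs registered.
    Map<String, Long> cluster = new LinkedHashMap<>();
    cluster.put("memory-mb", 8192L);
    cluster.put("vcores", 8L);
    cluster.put("yarn.io/gpu", 0L);

    // Illustrative choice of types to check; an assumption, not the fix.
    Set<String> typesToCheck =
        new HashSet<>(Arrays.asList("memory-mb", "vcores"));

    System.out.println(anyZero(cluster));                    // true
    System.out.println(anyZeroAmong(cluster, typesToCheck)); // false
  }
}
```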






[jira] [Created] (YARN-10691) DominantResourceCalculator divide and ratio methods should consider only countable resource types

2021-03-15 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10691:


 Summary: DominantResourceCalculator divide and ratio methods 
should consider only countable resource types
 Key: YARN-10691
 URL: https://issues.apache.org/jira/browse/YARN-10691
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T









[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-03-09 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298009#comment-17298009
 ] 

Bilwa S T commented on YARN-10588:
--

Hi [~Jim_Brennan],

Can you please take a look at this Jira when you get time? Thanks

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch, YARN-10588.004.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled

2021-03-08 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297905#comment-17297905
 ] 

Bilwa S T commented on YARN-10120:
--

Hi [~brahmareddy], it looks like this didn't get merged to branch-3.3. Can you 
please backport it? Thanks

> In Federation Router Nodes/Applications/About pages throws 500 exception when 
> https is enabled
> --
>
> Key: YARN-10120
> URL: https://issues.apache.org/jira/browse/YARN-10120
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Critical
> Fix For: 3.3.0, 3.4.0
>
> Attachments: YARN-10120-YARN-7402.patch, 
> YARN-10120-YARN-7402.v2.patch, YARN-10120-addendum-01.patch, 
> YARN-10120-branch-3.3.patch, YARN-10120-branch-3.3.v2.patch, 
> YARN-10120.001.patch, YARN-10120.002.patch
>
>
> In Federation, the Router Nodes/Applications/About pages throw a 500 exception when 
> https is enabled.
> yarn.router.webapp.https.address =router ip:8091
> {noformat}
> 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
> handling URI: /cluster/apps
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at 
> com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> 

[jira] [Assigned] (YARN-10670) YARN: Opportunistic Container : : In distributed shell job if containers are killed then application is failed. But in this case as containers are killed to make room fo

2021-03-04 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10670:


Assignee: Bilwa S T

> YARN: Opportunistic Container : : In distributed shell job if containers are 
> killed then application is failed. But in this case as containers are killed 
> to make room for guaranteed containers which is not correct to fail an 
> application
> 
>
> Key: YARN-10670
> URL: https://issues.apache.org/jira/browse/YARN-10670
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.1
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
>
> Preconditions:
>  # A secure Hadoop 3.1.1 cluster with 3 nodes is installed.
>  # Set the parameter below in the RM:
>  <property>
>    <name>yarn.resourcemanager.opportunistic-container-allocation.enabled</name>
>    <value>true</value>
>  </property>
>  # Set this in the NM(s):
>  <property>
>    <name>yarn.nodemanager.opportunistic-containers-max-queue-length</name>
>    <value>30</value>
>  </property>
>  
>  Test Steps:
> Job command: yarn 
> org.apache.hadoop.yarn.applications.distributedshell.Client -jar 
> HDFS/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  -shell_command sleep -shell_args 20 -num_containers 20 -container_type 
> OPPORTUNISTIC
> Actual Result: The Distributed Shell YARN job failed with the diagnostics 
> message below:
> {noformat}
> Attempt recovered after RM restartApplication Failure: desired = 20, 
> completed = 20, allocated = 20, failed = 1, diagnostics = [2021-02-09 
> 22:11:48.440]Container De-queued to meet NM queuing limits.
> [2021-02-09 22:11:48.441]Container terminated before launch.
> {noformat}
>  Expected Result: Distributed Shell Yarn Job should not fail.






[jira] [Created] (YARN-10668) [DS] Disable distributed scheduling when client doesn't configure scheduler address as amrmproxy address

2021-03-03 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10668:


 Summary: [DS] Disable distributed scheduling when client doesn't 
configure scheduler address as amrmproxy address
 Key: YARN-10668
 URL: https://issues.apache.org/jira/browse/YARN-10668
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T


In a distributed scheduling setup, if a client submits an application with a 
normal client configuration, i.e. the scheduler address is not the same as the 
AMRMProxy address, the application fails with Invalid AMRMToken. I think 
distributed scheduling should be disabled in this case and the job should be 
executed with opportunistic containers enabled.






[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-03-03 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17294998#comment-17294998
 ] 

Bilwa S T commented on YARN-10588:
--

[~epayne]

I have updated the patch. Please take a look at it. Thanks

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch, YARN-10588.004.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Updated] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-03-03 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10588:
-
Attachment: YARN-10588.004.patch

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch, YARN-10588.004.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:java}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Assigned] (YARN-10667) The current logic only sets the subdirectory of nm-aux-services to 700, but does not set nm-aux-services dir.

2021-03-03 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10667:


Assignee: Bilwa S T

> The current logic only sets the subdirectory of nm-aux-services to 700, but 
> does not set  nm-aux-services dir.
> --
>
> Key: YARN-10667
> URL: https://issues.apache.org/jira/browse/YARN-10667
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Major
> Attachments: Permission 755.PNG
>
>
> The current logic sets only the subdirectories of nm-aux-services to 700, but 
> does not set the nm-aux-services directory itself.
> The permissions of some files and directories on the YARN deployment node are 
> 755.
>  !Permission 755.PNG! 






[jira] [Commented] (YARN-9017) PlacementRule order is not maintained in CS

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286510#comment-17286510
 ] 

Bilwa S T commented on YARN-9017:
-

[~brahmareddy] please cherry-pick this to branch 3.3.1.  Thanks

> PlacementRule order is not maintained in CS
> ---
>
> Key: YARN-9017
> URL: https://issues.apache.org/jira/browse/YARN-9017
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Bibin Chundatt
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9017.001.patch, YARN-9017.002.patch, 
> YARN-9017.003.patch
>
>
> {{yarn.scheduler.queue-placement-rules}} doesn't work as expected in Capacity 
> Scheduler
> {quote}
> * **Queue Mapping Interface based on Default or User Defined Placement 
> Rules** - This feature allows users to map a job to a specific queue based on 
> some default placement rule. For instance based on user & group, or 
> application name. User can also define their own placement rule.
> {quote}
> As per the current code, UserGroupMapping is always added to placementRule in 
> {{CapacityScheduler#updatePlacementRules}}:
> {code}
> // Initialize placement rules
> Collection<String> placementRuleStrs = conf.getStringCollection(
>     YarnConfiguration.QUEUE_PLACEMENT_RULES);
> List<PlacementRule> placementRules = new ArrayList<>();
> ...
> // add UserGroupMappingPlacementRule if absent
> distingushRuleSet.add(YarnConfiguration.USER_GROUP_PLACEMENT_RULE);
> {code}
> PlacementRule configuration order is not maintained 
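
One general way to de-duplicate rule names while keeping the configured 
order is an insertion-ordered set. The sketch below is an illustration under 
assumptions and not the patch attached to this issue: the class 
PlacementRuleOrderSketch and the method orderedRules are hypothetical, and 
the rule names used in main are sample strings. A java.util.LinkedHashSet 
iterates in insertion order, so a default rule added afterwards lands at the 
end only if it was not already configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Order-preserving rule list; hypothetical sketch, not the actual fix. */
final class PlacementRuleOrderSketch {
  private PlacementRuleOrderSketch() { }

  /**
   * Returns the configured rule names without duplicates, in configuration
   * order, appending defaultRule at the end only if it is absent.
   */
  static List<String> orderedRules(Collection<String> configured,
      String defaultRule) {
    // LinkedHashSet keeps first-insertion order and ignores repeats.
    Set<String> distinct = new LinkedHashSet<>(configured);
    distinct.add(defaultRule);
    return new ArrayList<>(distinct);
  }

  public static void main(String[] args) {
    // Sample rule names, chosen for illustration only.
    List<String> configured =
        Arrays.asList("app-name", "user-group", "app-name");
    System.out.println(orderedRules(configured, "user-group"));
    // prints "[app-name, user-group]"
  }
}
```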






[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286507#comment-17286507
 ] 

Bilwa S T commented on YARN-9606:
-

[~brahmareddy] please cherry-pick this to branch 3.3.1.  Thanks

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> The yarn logs command fails for running containers.
> {quote}
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Commented] (YARN-9301) Too many InvalidStateTransitionException with SLS

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286509#comment-17286509
 ] 

Bilwa S T commented on YARN-9301:
-

[~brahmareddy] please cherry-pick this to branch 3.3.1.  Thanks

> Too many InvalidStateTransitionException with SLS
> -
>
> Key: YARN-9301
> URL: https://issues.apache.org/jira/browse/YARN-9301
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: Bilwa S T
>Priority: Major
>  Labels: simulator
> Fix For: 3.4.0
>
> Attachments: YARN-9301-001.patch, YARN-9301.002.patch
>
>
> Too many InvalidStateTransitionException errors:
> {noformat}
> 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Can't handle this event 
> at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> LAUNCHED at RUNNING
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:483)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl.handle(RMContainerImpl.java:65)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.containerLaunchedOnNode(SchedulerApplicationAttempt.java:655)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.containerLaunchedOnNode(AbstractYarnScheduler.java:359)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNewContainerInfo(AbstractYarnScheduler.java:1010)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.nodeUpdate(AbstractYarnScheduler.java:1112)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1295)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1752)
> at 
> org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:205)
> at 
> org.apache.hadoop.yarn.sls.scheduler.SLSCapacityScheduler.handle(SLSCapacityScheduler.java:60)
> at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:745)
> 19/02/13 17:44:43 ERROR rmcontainer.RMContainerImpl: Invalid event LAUNCHED 
> on container container_1550059705491_0067_01_01
> {noformat}






[jira] [Commented] (YARN-8942) PriorityBasedRouterPolicy throws exception if all sub-cluster weights have negative value

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286506#comment-17286506
 ] 

Bilwa S T commented on YARN-8942:
-

[~brahmareddy] please cherry-pick this to branch 3.3.1.  Thanks

> PriorityBasedRouterPolicy throws exception if all sub-cluster weights have 
> negative value
> -
>
> Key: YARN-8942
> URL: https://issues.apache.org/jira/browse/YARN-8942
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Akshay Agarwal
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-8942.001.patch, YARN-8942.002.patch
>
>
> In *PriorityBasedRouterPolicy*, if all sub-cluster weights are *set to 
> negative values*, an exception is thrown when running a job.
> Ideally it should handle negative priorities as well, according to the 
> policy's home sub-cluster selection process.
>  *Exception Details:*
> {code:java}
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Unable 
> to insert the ApplicationId application_1540356760422_0015 into the 
> FederationStateStore
> at 
> org.apache.hadoop.yarn.server.router.RouterServerUtil.logAndThrowException(RouterServerUtil.java:56)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:418)
> at org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.submitApplication(RouterClientRMService.java:218)
> at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:282)
> at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:579)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Caused by: org.apache.hadoop.yarn.server.federation.store.exception.FederationStateStoreInvalidInputException: Missing SubCluster Id information. Please try again by specifying Subcluster Id information.
> at org.apache.hadoop.yarn.server.federation.store.utils.FederationMembershipStateStoreInputValidator.checkSubClusterId(FederationMembershipStateStoreInputValidator.java:247)
> at org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.checkApplicationHomeSubCluster(FederationApplicationHomeSubClusterStoreInputValidator.java:160)
> at org.apache.hadoop.yarn.server.federation.store.utils.FederationApplicationHomeSubClusterStoreInputValidator.validate(FederationApplicationHomeSubClusterStoreInputValidator.java:65)
> at org.apache.hadoop.yarn.server.federation.store.impl.ZookeeperFederationStateStore.addApplicationHomeSubCluster(ZookeeperFederationStateStore.java:159)
> at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy84.addApplicationHomeSubCluster(Unknown Source)
> at org.apache.hadoop.yarn.server.federation.utils.FederationStateStoreFacade.addApplicationHomeSubCluster(FederationStateStoreFacade.java:402)
> at org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.submitApplication(FederationClientInterceptor.java:413)
> ... 11 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (YARN-10359) Log container report only if list is not empty

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286504#comment-17286504
 ] 

Bilwa S T commented on YARN-10359:
--

[~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks

> Log container report only if list is not empty
> --
>
> Key: YARN-10359
> URL: https://issues.apache.org/jira/browse/YARN-10359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-10359.001.patch, YARN-10359.002.patch
>
>
> In NodeStatusUpdaterImpl, print the log only if the containerReports list is not empty.
> {code:java}
> if (containerReports != null) {
>   LOG.info("Registering with RM using containers :" + containerReports);
> }
> {code}
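
A minimal sketch of the guard the issue title asks for, assuming the intent is to skip the log line when the list is null or empty. The names `ContainerReportLogging` and `shouldLogReports` are hypothetical and are not taken from NodeStatusUpdaterImpl; a plain `List<String>` stands in for the real container report type, and `System.out` stands in for the logger.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ContainerReportLogging {
  // Hypothetical helper: true only when there is at least one report to log.
  static boolean shouldLogReports(List<?> containerReports) {
    return containerReports != null && !containerReports.isEmpty();
  }

  public static void main(String[] args) {
    List<String> none = Collections.emptyList();
    List<String> some = Arrays.asList("container_1", "container_2");
    for (List<String> reports : Arrays.asList(none, some)) {
      // Only the non-empty list produces a log line.
      if (shouldLogReports(reports)) {
        System.out.println("Registering with RM using containers :" + reports);
      }
    }
  }
}
```

Compared with the null-only check quoted above, the extra `isEmpty()` test is what would suppress the message for an empty list.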




-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10364) Absolute Resource [memory=0] is considered as Percentage config type

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286503#comment-17286503
 ] 

Bilwa S T commented on YARN-10364:
--

[~brahmareddy] please cherry-pick this to branch 3.3.1. Thanks

> Absolute Resource [memory=0] is considered as Percentage config type
> 
>
> Key: YARN-10364
> URL: https://issues.apache.org/jira/browse/YARN-10364
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Prabhu Joseph
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10364.001.patch, YARN-10364.002.patch, 
> YARN-10364.003.patch
>
>
> Absolute Resource [memory=0] is considered as Percentage config type. This 
> causes a failure while converting queues from Percentage to Absolute Resources 
> automatically. 
> *Repro:*
> 1. Queue A = 100% and child queues A.B = 0%, A.C = 100%
> 2. While converting the above to absolute resources automatically, capacity of 
> queue A = [memory=], A.B = [memory=0]
> This fails with the error below because A is considered as Absolute Resource whereas B is 
> considered as Percentage config type.
> {code}
> 2020-07-23 09:36:40,499 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices: 
> CapacityScheduler configuration validation failed:java.io.IOException: Failed 
> to re-init queues : Parent queue 'root.A' and child queue 'root.A.B' should 
> use either percentage based capacityconfiguration or absolute resource 
> together for label:
> {code}
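
One way to avoid the misclassification described above would be to decide the config type from the syntax of the configured value rather than from the parsed amount, so that `[memory=0]` still counts as an absolute resource. The sketch below works under that assumption only. `CapacityTypeSketch`, `ConfigType`, and `classify` are hypothetical names, and this is not the CapacityScheduler implementation or necessarily the fix in the attached patches.

```java
class CapacityTypeSketch {
  enum ConfigType { PERCENTAGE, ABSOLUTE_RESOURCE }

  // Assumption: absolute resources are written in square brackets, as in the
  // "[memory=0]" example from the issue; anything else is a percentage.
  static ConfigType classify(String configuredCapacity) {
    String value = configuredCapacity.trim();
    boolean bracketed = value.startsWith("[") && value.endsWith("]");
    return bracketed ? ConfigType.ABSOLUTE_RESOURCE : ConfigType.PERCENTAGE;
  }

  public static void main(String[] args) {
    for (String v : new String[] {"100", "0", "[memory=4096]", "[memory=0]"}) {
      System.out.println(v + " -> " + classify(v));
    }
  }
}
```

With a syntax-based check, a parent and child that are both written in bracket form are classified the same way even when the child's amount is zero.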






[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286446#comment-17286446
 ] 

Bilwa S T edited comment on YARN-10588 at 2/18/21, 12:27 PM:
-

[~Jim_Brennan] [~epayne]

Changing *DominantResourceCalculator#isInvalidDivisor* to 
*DominantResourceCalculator#isAllInvalidDivisor* would solve the problem. What do 
you think?
{quote} Currently it returns true if any resource is zero, while {{divide}} is 
only going to return zero if all of the countable ones are zero.
{quote}
 


was (Author: bilwast):
[~Jim_Brennan] [~epayne]

Changing *DominantResourceCalculator#isInvalidDivisor* to ** 
*DominantResourceCalculator#isAllInvalidDivisor* would solve problem. What do 
you think?
{quote} Currently it returns true if any resource is zero, while {{divide}} is 
only going to return zero if all of the countable ones are zero.
{quote}
 

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:xml}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286446#comment-17286446
 ] 

Bilwa S T edited comment on YARN-10588 at 2/18/21, 12:27 PM:
-

[~Jim_Brennan] [~epayne]

Changing *DominantResourceCalculator#isInvalidDivisor* to 
*DominantResourceCalculator#isAllInvalidDivisor* would solve the problem. What do 
you think?
{quote} Currently it returns true if any resource is zero, while {{divide}} is 
only going to return zero if all of the countable ones are zero.
{quote}
 


was (Author: bilwast):
[~Jim_Brennan] [~epayne]

Changing *DominantResourceCalculator#isInvalidDivisor* to ** 
*DominantResourceCalculator#isAllInvalidDivisor* would ** solve problem. What 
do you think?**
{quote} Currently it returns true if any resource is zero, while {{divide}} is 
only going to return zero if all of the countable ones are zero.
{quote}
 

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:xml}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286446#comment-17286446
 ] 

Bilwa S T commented on YARN-10588:
--

[~Jim_Brennan] [~epayne]

Changing *DominantResourceCalculator#isInvalidDivisor* to 
*DominantResourceCalculator#isAllInvalidDivisor* would solve the problem. What 
do you think?
{quote} Currently it returns true if any resource is zero, while {{divide}} is 
only going to return zero if all of the countable ones are zero.
{quote}
 

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:xml}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Assigned] (YARN-10634) The config parameter "mapreduce.job.num-opportunistic-maps-percent" is confusing when requesting Opportunistic containers in YARN job

2021-02-18 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10634:


Assignee: Bilwa S T

> The config parameter "mapreduce.job.num-opportunistic-maps-percent" is 
> confusing when requesting Opportunistic containers in YARN job
> -
>
> Key: YARN-10634
> URL: https://issues.apache.org/jira/browse/YARN-10634
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Minor
>
> Run the job below, passing the config 
> -Dmapreduce.job.num-opportunistic-maps-percent. This value actually represents the 
> number of containers to be launched as Opportunistic, not a percentage of the total 
> mappers requested. I think this configuration name should be modified 
> accordingly, and {color:#de350b}the same value gets printed in the AM logs{color}.
> Job Command: hadoop jar 
> HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  pi -{color:#de350b}Dmapreduce.job.num-opportunistic-maps-percent{color}="20" 
> 20 99
> In the AM logs this message is displayed. Should it be {color:#de350b}20, not 
> 20%{color}?
>  “2021-02-10 20:23:23,023 | INFO | main | {color:#de350b}20% of the 
> mappers{color} will be scheduled using OPPORTUNISTIC containers | 
> RMContainerAllocator.java:257”
> Job Command: hadoop jar 
> HDFS/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1-hw-ei-310001-SNAPSHOT.jar
>  pi 
> {color:#de350b}-Dmapreduce.job.num-opportunistic-maps-percent{color}="100" 20 
> 99
> In the AM logs this message is displayed. Should it be {color:#de350b}100, not 
> 100%{color}?
> 2021-02-10 20:28:16,016 | INFO  | main | {color:#de350b}100% of the 
> mapper{color}s will be scheduled using OPPORTUNISTIC containers | 
> RMContainerAllocator.java:257
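
The two readings of the configured value that the description contrasts can be made concrete with a little arithmetic. This is only an illustration, assuming that the first argument to `pi` (20) is the number of mappers; `OpportunisticMapsSketch` and its methods are hypothetical and do not reproduce the logic in RMContainerAllocator.

```java
class OpportunisticMapsSketch {
  // Reading 1: the value is a percentage of the requested mappers.
  static int asPercentage(int totalMappers, int configuredValue) {
    return totalMappers * configuredValue / 100;
  }

  // Reading 2: the value is an absolute container count, capped at the
  // number of mappers actually requested.
  static int asCount(int totalMappers, int configuredValue) {
    return Math.min(totalMappers, configuredValue);
  }

  public static void main(String[] args) {
    int mappers = 20;
    for (int value : new int[] {20, 100}) {
      System.out.println("value=" + value
          + " as percentage: " + asPercentage(mappers, value)
          + ", as count: " + asCount(mappers, value));
    }
  }
}
```

For 20 mappers the readings differ at a value of 20 (4 containers versus 20) and coincide at 100, which is why the first job command is the more telling of the two.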






[jira] [Commented] (YARN-8047) RMWebApp make external class pluggable

2021-02-18 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17286357#comment-17286357
 ] 

Bilwa S T commented on YARN-8047:
-

Hi [~brahma], can you please cherry-pick this Jira to 3.3.1? Thanks

> RMWebApp make external class pluggable
> --
>
> Key: YARN-8047
> URL: https://issues.apache.org/jira/browse/YARN-8047
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bibin Chundatt
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: YARN-8047-001.patch, YARN-8047-002.patch, 
> YARN-8047-003.patch, YARN-8047.004.patch, YARN-8047.005.patch, 
> YARN-8047.006.patch
>
>
> This Jira should make sure that web services and web pages of the scheduler 
> can be plugged into the ResourceManager:
> * RMWebApp should allow binding external classes
> * RMController should allow plugging in scheduler classes






[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager

2021-02-17 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17285697#comment-17285697
 ] 

Bilwa S T commented on YARN-10258:
--

Thank you [~gb.ana...@gmail.com] for your contribution. The patch LGTM. There are a 
few checkstyle issues; please fix them. Resubmitting the patch to trigger the build again.

> Add metrics for 'ApplicationsRunning' in NodeManager
> 
>
> Key: YARN-10258
> URL: https://issues.apache.org/jira/browse/YARN-10258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.3
>Reporter: ANANDA G B
>Assignee: ANANDA G B
>Priority: Minor
> Attachments: YARN-10258-001.patch, YARN-10258-002.patch
>
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.
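
As a rough illustration of what such a metric tracks, the sketch below keeps a gauge that goes up when an application starts on the node and down when it finishes. It uses only the JDK and does not use the Hadoop metrics2 API; `AppsRunningGaugeSketch` and its method names are hypothetical and are not taken from the attached patches.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AppsRunningGaugeSketch {
  // Current number of applications running on this (hypothetical) node.
  private final AtomicInteger applicationsRunning = new AtomicInteger();

  void applicationStarted() {
    applicationsRunning.incrementAndGet();
  }

  void applicationFinished() {
    // Never drop below zero, even if a finish event is reported twice.
    applicationsRunning.updateAndGet(v -> v > 0 ? v - 1 : 0);
  }

  int getApplicationsRunning() {
    return applicationsRunning.get();
  }

  public static void main(String[] args) {
    AppsRunningGaugeSketch gauge = new AppsRunningGaugeSketch();
    gauge.applicationStarted();
    gauge.applicationStarted();
    gauge.applicationFinished();
    System.out.println("ApplicationsRunning=" + gauge.getApplicationsRunning());
  }
}
```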






[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager

2021-02-17 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10258:
-
Attachment: YARN-10258-002.patch

> Add metrics for 'ApplicationsRunning' in NodeManager
> 
>
> Key: YARN-10258
> URL: https://issues.apache.org/jira/browse/YARN-10258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.3
>Reporter: ANANDA G B
>Assignee: ANANDA G B
>Priority: Minor
> Attachments: YARN-10258-001.patch, YARN-10258-002.patch
>
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.






[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager

2021-02-16 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10258:
-
Target Version/s:   (was: 3.1.3)

> Add metrics for 'ApplicationsRunning' in NodeManager
> 
>
> Key: YARN-10258
> URL: https://issues.apache.org/jira/browse/YARN-10258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.3
>Reporter: ANANDA G B
>Assignee: ANANDA G B
>Priority: Minor
> Attachments: YARN-10258-001.patch
>
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.






[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager

2021-02-16 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10258:
-
Fix Version/s: (was: 3.1.3)

> Add metrics for 'ApplicationsRunning' in NodeManager
> 
>
> Key: YARN-10258
> URL: https://issues.apache.org/jira/browse/YARN-10258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.3
>Reporter: ANANDA G B
>Assignee: ANANDA G B
>Priority: Minor
> Attachments: YARN-10258-001.patch
>
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.






[jira] [Issue Comment Deleted] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager

2021-02-16 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10258:
-
Comment: was deleted

(was: Thank you  [~gb.ana...@gmail.com] for working on this. Looks there are 
some checkstyle issues. other than that patch LGTM)

> Add metrics for 'ApplicationsRunning' in NodeManager
> 
>
> Key: YARN-10258
> URL: https://issues.apache.org/jira/browse/YARN-10258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.3
>Reporter: ANANDA G B
>Assignee: ANANDA G B
>Priority: Minor
> Fix For: 3.1.3
>
> Attachments: YARN-10258-001.patch
>
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.






[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager

2021-02-16 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17285691#comment-17285691
 ] 

Bilwa S T commented on YARN-10258:
--

Thank you [~gb.ana...@gmail.com] for working on this. It looks like there are some 
checkstyle issues. Other than that, the patch LGTM.

> Add metrics for 'ApplicationsRunning' in NodeManager
> 
>
> Key: YARN-10258
> URL: https://issues.apache.org/jira/browse/YARN-10258
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.1.3
>Reporter: ANANDA G B
>Assignee: ANANDA G B
>Priority: Minor
> Fix For: 3.1.3
>
> Attachments: YARN-10258-001.patch
>
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.






[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-02-12 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282988#comment-17282988
 ] 

Bilwa S T edited comment on YARN-10588 at 2/12/21, 8:23 AM:


[~epayne]

Modifying *DominantResourceCalculator#isInvalidDivisor* to match the logic of 
*DominantResourceCalculator#divide* amounts to returning true only if all 
resource values are *0*. We already have a method called 
*DominantResourceCalculator#isAllInvalidDivisor*, which returns true only if 
all resources are *zero*. I think we can just change isInvalidDivisor to 
isAllInvalidDivisor. Correct me if I am wrong.


was (Author: bilwast):
[~epayne]

Modifying *DominantResourceCalculator#isInvalidDivisor* to match logic of 
*DominantResourceCalculator#divide* is nothing but returning true only if all 
resource value is *0*. We already have a method called 
*DominantResourceCalculator#isAllInvalidDivisor* which will return true only if 
all resources are *zero*. I think we can just change isInvalidDivisor to 
isAllInvalidDivisor. 

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:xml}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-02-11 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282988#comment-17282988
 ] 

Bilwa S T commented on YARN-10588:
--

[~epayne]

Modifying *DominantResourceCalculator#isInvalidDivisor* to match the logic of 
*DominantResourceCalculator#divide* amounts to returning true only if all 
resource values are *0*. We already have a method called 
*DominantResourceCalculator#isAllInvalidDivisor*, which returns true only if 
all resources are *zero*. I think we can just change isInvalidDivisor to 
isAllInvalidDivisor. 
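
The difference between the two checks discussed in this comment can be sketched without any Hadoop classes. In the sketch below a resource is simply an array of amounts (memory, vcores, gpu); `DivisorCheckSketch`, `anyZero`, and `allZero` are hypothetical stand-ins for the behaviour the comment attributes to isInvalidDivisor and isAllInvalidDivisor, not the DominantResourceCalculator code itself.

```java
class DivisorCheckSketch {
  // Stand-in for the "any resource is zero" behaviour.
  static boolean anyZero(long[] resources) {
    for (long r : resources) {
      if (r == 0) {
        return true;
      }
    }
    return false;
  }

  // Stand-in for the "all resources are zero" behaviour.
  static boolean allZero(long[] resources) {
    for (long r : resources) {
      if (r != 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // memory (MB), vcores, gpu: a cluster with a gpu type configured but no gpus.
    long[] cluster = {8192, 16, 0};
    System.out.println("anyZero=" + anyZero(cluster) + ", allZero=" + allZero(cluster));
  }
}
```

For a cluster that has memory and vcores but zero gpus, the any-zero check rejects the divisor and the percentages are skipped, while the all-zero check lets the calculation proceed.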

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:xml}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Assigned] (YARN-9927) RM multi-thread event processing mechanism

2021-02-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-9927:
---

Assignee: Bilwa S T

> RM multi-thread event processing mechanism
> --
>
> Key: YARN-9927
> URL: https://issues.apache.org/jira/browse/YARN-9927
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0, 2.9.2
>Reporter: hcarrot
>Assignee: Bilwa S T
>Priority: Major
> Attachments: RM multi-thread event processing mechanism.pdf, 
> YARN-9927.001.patch
>
>
> Recently, we have observed serious event blocking in the RM event dispatcher 
> queue. After analyzing RM event monitoring data and the RM event processing 
> logic, we found that:
> 1) Environment: a cluster with thousands of nodes.
> 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler.
> 3) Meanwhile, RM event processing runs in single-thread mode, which results 
> in low headroom for the RM event scheduler and thus limits RM performance.
> So we propose an RM multi-thread event processing mechanism to improve RM 
> performance.
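
One common way to move from a single dispatcher thread to several while keeping per-node ordering is to shard events by node ID across a fixed set of single-threaded workers. The sketch below illustrates that general technique only; it is an assumption about what such a mechanism could look like, not the design from the attached PDF or patch, and `ShardedDispatcherSketch` and its methods are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShardedDispatcherSketch {
  private final ExecutorService[] workers;
  // Records, per node, the order in which its events were handled.
  private final Map<String, List<Integer>> handled = new ConcurrentHashMap<>();

  ShardedDispatcherSketch(int threads) {
    workers = new ExecutorService[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = Executors.newSingleThreadExecutor();
    }
  }

  // Events for the same node always go to the same single-threaded worker,
  // so they are handled in submission order; different nodes may run in parallel.
  void dispatch(String nodeId, int sequence) {
    int shard = Math.floorMod(nodeId.hashCode(), workers.length);
    workers[shard].execute(() -> handled
        .computeIfAbsent(nodeId, k -> new CopyOnWriteArrayList<>())
        .add(sequence));
  }

  // Stops accepting events, waits for queued ones, and returns the record.
  Map<String, List<Integer>> shutdownAndGet() {
    for (ExecutorService w : workers) {
      w.shutdown();
    }
    try {
      for (ExecutorService w : workers) {
        if (!w.awaitTermination(10, TimeUnit.SECONDS)) {
          throw new IllegalStateException("workers did not finish in time");
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return handled;
  }

  public static void main(String[] args) {
    ShardedDispatcherSketch dispatcher = new ShardedDispatcherSketch(4);
    for (int seq = 0; seq < 5; seq++) {
      dispatcher.dispatch("node-1", seq);
      dispatcher.dispatch("node-2", seq);
    }
    System.out.println(dispatcher.shutdownAndGet());
  }
}
```

Sharding by node keeps status updates for one node sequential, which matters when a single event type such as node status dominates the queue, at the cost of uneven load if a few nodes are much busier than the rest.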






[jira] [Commented] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-02-10 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282328#comment-17282328
 ] 

Bilwa S T commented on YARN-10588:
--

Hi [~epayne]

I added the change in FiCaSchedulerApp.java because the same issue can occur there, i.e. the cluster and 
queue resource usage will not be calculated if one of the resources is zero. I added 
an instanceof check because that method is applicable only to CapacityScheduler. 
Many test cases were failing once I removed the 
DominantResourceCalculator.isInvalidDivisor() check, as those test cases had configured 
FifoScheduler.

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:xml}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.






[jira] [Comment Edited] (YARN-10588) Percentage of queue and cluster is zero in WebUI

2021-02-10 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17282328#comment-17282328
 ] 

Bilwa S T edited comment on YARN-10588 at 2/10/21, 9:19 AM:


Thanks [~epayne] [~Jim_Brennan] for taking a look at this issue.
 
I added the change in FiCaSchedulerApp.java because the same issue can occur there, i.e. the cluster and 
queue resource usage will not be calculated if one of the resources is zero. I added 
an instanceof check because that method is applicable only to CapacityScheduler. 
Many test cases were failing once I removed the 
DominantResourceCalculator.isInvalidDivisor() check, as those test cases had configured 
FifoScheduler.


was (Author: bilwast):
Hi [~epayne]

I added change in FicaSchedulerApp.java as same issue can occur ie cluster and 
queue resource will not be calculated if one of the resource is zero. I added 
instanceOf check because that method is applicable only for capacityscheduler . 
Many testcases were failing once i removed 
DominantResourceCalculator.isInvalidDivisor() check as testcases had configured 
Fifoscheduler.  

> Percentage of queue and cluster is zero in WebUI 
> -
>
> Key: YARN-10588
> URL: https://issues.apache.org/jira/browse/YARN-10588
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10588.001.patch, YARN-10588.002.patch, 
> YARN-10588.003.patch
>
>
> Steps to reproduce:
> Configure the property below in resource-types.xml:
> {code:xml}
> <property>
>   <name>yarn.resource-types</name>
>   <value>yarn.io/gpu</value>
> </property>
> {code}
> Submit a job.
> In the UI, % Of Queue and % Of Cluster are zero for the submitted 
> application.
>  
> This is because SchedulerApplicationAttempt has the check below for 
> calculating queueUsagePerc and clusterUsagePerc:
> {code:java}
> if (!calc.isInvalidDivisor(cluster)) {
>   float queueCapacityPerc = queue.getQueueInfo(false, false)
>       .getCapacity();
>   queueUsagePerc = calc.divide(cluster, usedResourceClone,
>       Resources.multiply(cluster, queueCapacityPerc)) * 100;
>   if (Float.isNaN(queueUsagePerc) || Float.isInfinite(queueUsagePerc)) {
>     queueUsagePerc = 0.0f;
>   }
>   clusterUsagePerc =
>       calc.divide(cluster, usedResourceClone, cluster) * 100;
> }
> {code}
> calc.isInvalidDivisor(cluster) always returns true because the gpu resource is 0.





