[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.

2020-09-10 Thread Anuj (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193663#comment-17193663
 ] 

Anuj commented on YARN-10368:
-

[~Amithsha] We have modified the YARN code to ignore the log aggregation 
status while removing a completed app.

> Log aggregation reset to NOT_START after RM restart.
> 
>
> Key: YARN-10368
> URL: https://issues.apache.org/jira/browse/YARN-10368
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager, resourcemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Anuj
>Priority: Major
> Attachments: Screenshot 2020-07-27 at 2.35.15 PM.png
>
>
> After an RM restart, the log aggregation status of a recovered attempt is 
> not preserved; it is reset to NOT_START.
> From NOT_START it never moves to TIMED_OUT, so the RM never cleans up the app 
> from memory. As a result, the max-completed-app in-memory limit is reached 
> and the RM stops accepting new apps.
> https://issues.apache.org/jira/browse/YARN-7952



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.

2020-08-19 Thread Anuj (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17180448#comment-17180448
 ] 

Anuj commented on YARN-10368:
-

For now I have removed the log aggregation check during cleanup.







[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.

2020-07-27 Thread Anuj (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165558#comment-17165558
 ] 

Anuj commented on YARN-10368:
-

[~xgong] Can you please help me with this?







[jira] [Created] (YARN-10368) Log aggregation reset to NOT_START after RM restart.

2020-07-27 Thread Anuj (Jira)
Anuj created YARN-10368:
---

 Summary: Log aggregation reset to NOT_START after RM restart.
 Key: YARN-10368
 URL: https://issues.apache.org/jira/browse/YARN-10368
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager, yarn
Affects Versions: 3.2.1
Reporter: Anuj
 Attachments: Screenshot 2020-07-27 at 2.35.15 PM.png

After an RM restart, the log aggregation status of a recovered attempt is not 
preserved; it is reset to NOT_START.

From NOT_START it never moves to TIMED_OUT, so the RM never cleans up the app 
from memory. As a result, the max-completed-app in-memory limit is reached and 
the RM stops accepting new apps.

https://issues.apache.org/jira/browse/YARN-7952
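
The failure mode reported here can be modelled with a small sketch. This is 
not the ResourceManager code: the class, the evict method, the recoveredApps 
helper and the ignoreLogAggregation flag are hypothetical names, and only the 
NOT_START and TIMED_OUT status names are taken from the report. It assumes 
that a completed app is evicted only once its log aggregation has reached a 
terminal state, which is what the report implies.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Simplified model of the reported behaviour. NOT the actual RM code:
// every class, field, and method name here is hypothetical.
class CompletedAppEvictionSketch {

  // NOT_START and TIMED_OUT come from the issue text; the others are illustrative.
  enum LogAggregationStatus { NOT_START, RUNNING, SUCCEEDED, FAILED, TIMED_OUT }

  static final class CompletedApp {
    final String id;
    final LogAggregationStatus status;

    CompletedApp(String id, LogAggregationStatus status) {
      this.id = id;
      this.status = status;
    }
  }

  static boolean isTerminal(LogAggregationStatus s) {
    return s == LogAggregationStatus.SUCCEEDED
        || s == LogAggregationStatus.FAILED
        || s == LogAggregationStatus.TIMED_OUT;
  }

  // Builds n completed apps as they would look after recovery in the report:
  // every one of them has been reset to NOT_START.
  static Deque<CompletedApp> recoveredApps(int n) {
    Deque<CompletedApp> apps = new ArrayDeque<>();
    for (int i = 0; i < n; i++) {
      apps.add(new CompletedApp("app_" + i, LogAggregationStatus.NOT_START));
    }
    return apps;
  }

  // Evicts the oldest completed apps until at most maxCompleted remain.
  // With ignoreLogAggregation == false, apps whose log aggregation is not in a
  // terminal state are skipped and therefore stay in memory.
  // Returns the number of apps still retained.
  static int evict(Deque<CompletedApp> completed, int maxCompleted,
                   boolean ignoreLogAggregation) {
    Iterator<CompletedApp> it = completed.iterator();
    while (completed.size() > maxCompleted && it.hasNext()) {
      CompletedApp app = it.next();
      if (ignoreLogAggregation || isTerminal(app.status)) {
        it.remove();
      }
    }
    return completed.size();
  }

  public static void main(String[] args) {
    // Strict check: nothing is evicted, all 5 apps stay above the limit of 2.
    System.out.println(evict(recoveredApps(5), 2, false)); // prints 5
    // Check ignored: the 3 oldest apps are evicted.
    System.out.println(evict(recoveredApps(5), 2, true));  // prints 2
  }
}
```

In this model the strict variant never gets below the limit, because no 
recovered app ever leaves NOT_START. Passing true for ignoreLogAggregation 
corresponds to dropping the log aggregation check during cleanup.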






[jira] [Commented] (YARN-9088) Non-exclusive labels break QueueMetrics

2020-03-19 Thread Anuj (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062390#comment-17062390
 ] 

Anuj commented on YARN-9088:


We are facing a similar issue in our setup, in which the global view of 
pending and available resources gets messed up.

> Non-exclusive labels break QueueMetrics
> ---
>
> Key: YARN-9088
> URL: https://issues.apache.org/jira/browse/YARN-9088
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.8.5
>Reporter: Brandon Scheller
>Priority: Major
>  Labels: metrics, nodelabel
>
> QueueMetrics are broken (random/negative values) when non-exclusive labels 
> are being used and unlabeled containers run on labeled nodes.
> This is caused by the change in the patch here:
> https://issues.apache.org/jira/browse/YARN-6467
> It assumes that a container's label will be the same as the node's label that 
> it is running on.
> If you look within the patch, sometimes metrics are updated using the 
> request.getNodeLabelExpression(). And sometimes they are updated using 
> node.getPartition().
> This means that in the case where the node is labeled while the container 
> request isn't, these metrics only get updated when referring to the default 
> queue. This stops metrics from balancing out and results in incorrect and 
> negative values in QueueMetrics. 
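
The imbalance described in the quoted report can be illustrated with a toy 
counter. This is a sketch under assumptions, not the QueueMetrics code: the 
class and its methods are hypothetical, the partition name "x" is made up, and 
plain strings stand in for the values that request.getNodeLabelExpression() 
and node.getPartition() would return. It only shows why an increment and a 
decrement recorded under different keys never cancel out.

```java
import java.util.HashMap;
import java.util.Map;

// Toy per-partition counter. NOT the real QueueMetrics: all names are hypothetical.
class PartitionMetricsSketch {

  private final Map<String, Long> pendingByPartition = new HashMap<>();

  void incrPending(String partition, long n) {
    pendingByPartition.merge(partition, n, Long::sum);
  }

  void decrPending(String partition, long n) {
    pendingByPartition.merge(partition, -n, Long::sum);
  }

  long pending(String partition) {
    return pendingByPartition.getOrDefault(partition, 0L);
  }

  // One unlabeled request is recorded as pending and then allocated on a
  // labeled (non-exclusive) node. With consistentKey == false the decrement
  // uses the node's partition instead of the request's label.
  static PartitionMetricsSketch simulate(boolean consistentKey) {
    String requestLabel = "";    // stands in for the request's label expression
    String nodePartition = "x";  // stands in for the node's partition
    PartitionMetricsSketch metrics = new PartitionMetricsSketch();
    metrics.incrPending(requestLabel, 1);
    metrics.decrPending(consistentKey ? requestLabel : nodePartition, 1);
    return metrics;
  }

  public static void main(String[] args) {
    PartitionMetricsSketch mixed = simulate(false);
    System.out.println(mixed.pending(""));  // prints 1: never decremented
    System.out.println(mixed.pending("x")); // prints -1: decremented without an increment

    PartitionMetricsSketch consistent = simulate(true);
    System.out.println(consistent.pending(""));  // prints 0
    System.out.println(consistent.pending("x")); // prints 0
  }
}
```

Using one key for both sides of each update, whichever of the two it is, is 
what keeps the counters balanced in this model.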






[jira] [Created] (YARN-9932) Nodelabel support for Fair Scheduler

2019-10-23 Thread Anuj (Jira)
Anuj created YARN-9932:
--

 Summary: Nodelabel support for Fair Scheduler
 Key: YARN-9932
 URL: https://issues.apache.org/jira/browse/YARN-9932
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: fairscheduler, nodemanager, resourcemanager
Affects Versions: 3.2.1
Reporter: Anuj


Currently, node labels only work with the Capacity Scheduler.

We would like to have this working with Fair Scheduler.






[jira] [Commented] (YARN-2497) Fair scheduler should support strict node labels

2019-09-10 Thread Anuj (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926720#comment-16926720
 ] 

Anuj commented on YARN-2497:


Can we apply this patch to Hadoop 3.0?

> Fair scheduler should support strict node labels
> 
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Wangda Tan
>Assignee: Daniel Templeton
>Priority: Major
> Attachments: YARN-2497.001.patch, YARN-2497.002.patch, 
> YARN-2497.003.patch, YARN-2497.004.patch, YARN-2497.005.patch, 
> YARN-2497.006.patch, YARN-2497.007.patch, YARN-2497.008.patch, 
> YARN-2497.009.patch, YARN-2497.010.patch, YARN-2497.011.patch, 
> YARN-2497.branch-3.0.001.patch, YARN-2499.WIP01.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)




[jira] [Updated] (YARN-9241) Remove Scheduler specific if/else block and make it injectable in RMController

2019-01-27 Thread Anuj (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj updated YARN-9241:
---
Description: 
RmController contains a hardcoded if and else block for type of scheduler and 
decides which page to use for which scheduler.

[https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com/]

This if else block makes it hard to introduce a new scheduler and corresponding 
webpage without modifying the existing RMController class.

It would be great if we make it extendable.

  was:
RmController contains a hardcoded if and else block for type of scheduler and 
decides which page to use for which scheduler.

[https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com]

This if else block makes it hard to introduce a new scheduler and corresponding 
webpage with modifying the existing RMController class.

It would be great if we make it extendable.


> Remove Scheduler specific if/else block and make it injectable in RMController
> --
>
> Key: YARN-9241
> URL: https://issues.apache.org/jira/browse/YARN-9241
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.2.0
>Reporter: Anuj
>Priority: Minor
>
> RmController contains a hardcoded if and else block for type of scheduler and 
> decides which page to use for which scheduler.
> [https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com/]
> This if else block makes it hard to introduce a new scheduler and 
> corresponding webpage without modifying the existing RMController class.
> It would be great if we make it extendable.
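
One way to make the page selection extendable is a registry that is 
populated from outside the controller. The sketch below is only an 
illustration of that idea, not a proposed patch: the Scheduler interface, the 
scheduler classes, the page names and the registry itself are all 
hypothetical placeholders and do not correspond to existing Hadoop types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative registry replacing a hardcoded if/else on the scheduler type.
// All types and names below are hypothetical placeholders.
class SchedulerPageRegistrySketch {

  interface Scheduler {}
  static final class FifoLikeScheduler implements Scheduler {}
  static final class FairLikeScheduler implements Scheduler {}
  static final class CustomScheduler implements Scheduler {}

  // Registration order is preserved so that earlier entries win on overlap.
  private final Map<Class<? extends Scheduler>, String> pages = new LinkedHashMap<>();
  private final String defaultPage;

  SchedulerPageRegistrySketch(String defaultPage) {
    this.defaultPage = defaultPage;
  }

  // A new scheduler registers its page here; the lookup code stays untouched.
  SchedulerPageRegistrySketch register(Class<? extends Scheduler> type, String page) {
    pages.put(type, page);
    return this;
  }

  // Returns the page of the first registered type the scheduler is an
  // instance of, or the default page if none matches.
  String pageFor(Scheduler scheduler) {
    for (Map.Entry<Class<? extends Scheduler>, String> entry : pages.entrySet()) {
      if (entry.getKey().isInstance(scheduler)) {
        return entry.getValue();
      }
    }
    return defaultPage;
  }

  public static void main(String[] args) {
    SchedulerPageRegistrySketch registry = new SchedulerPageRegistrySketch("default-page")
        .register(FairLikeScheduler.class, "fair-page")
        .register(CustomScheduler.class, "custom-page");

    System.out.println(registry.pageFor(new CustomScheduler()));   // prints custom-page
    System.out.println(registry.pageFor(new FifoLikeScheduler())); // prints default-page
  }
}
```

With this shape, introducing a new scheduler means adding one register call 
at wiring time rather than editing the controller.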



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Updated] (YARN-9241) Remove Scheduler specific if/else block and make it injectable in RMController

2019-01-27 Thread Anuj (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj updated YARN-9241:
---
Description: 
RmController contains a hardcoded if and else block for type of scheduler and 
decides which page to use for which scheduler.

[https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com]

This if else block makes it hard to introduce a new scheduler and corresponding 
webpage with modifying the existing RMController class.

It would be great if we make it extendable.

  was:
RmController contains a hardcoded if and else block for type of scheduler and 
decides which page to use for which scheduler.

This if else block makes it hard to introduce a new scheduler and corresponding 
webpage with modifying the existing RMController class.

It would be great if we make it extendable.


> Remove Scheduler specific if/else block and make it injectable in RMController
> --
>
> Key: YARN-9241
> URL: https://issues.apache.org/jira/browse/YARN-9241
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.2.0
>Reporter: Anuj
>Priority: Minor
>
> RmController contains a hardcoded if and else block for type of scheduler and 
> decides which page to use for which scheduler.
> [https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com]
> This if else block makes it hard to introduce a new scheduler and 
> corresponding webpage with modifying the existing RMController class.
> It would be great if we make it extendable.






[jira] [Updated] (YARN-9241) Remove Scheduler specific if/else block and make it injectable in RMController

2019-01-27 Thread Anuj (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj updated YARN-9241:
---
Summary: Remove Scheduler specific if/else block and make it injectable in 
RMController  (was: Remove if else block from RmController.java)

> Remove Scheduler specific if/else block and make it injectable in RMController
> --
>
> Key: YARN-9241
> URL: https://issues.apache.org/jira/browse/YARN-9241
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.2.0
>Reporter: Anuj
>Priority: Minor
>
> RmController contains a hardcoded if and else block for type of scheduler and 
> decides which page to use for which scheduler.
> This if else block makes it hard to introduce a new scheduler and 
> corresponding webpage with modifying the existing RMController class.
> It would be great if we make it extendable.






[jira] [Commented] (YARN-9241) Remove Scheduler specific if/else block and make it injectable in RMController

2019-01-27 Thread Anuj (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753497#comment-16753497
 ] 

Anuj commented on YARN-9241:


Updated.



