[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.
[ https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193663#comment-17193663 ]

Anuj commented on YARN-10368:
-----------------------------

[~Amithsha] We have modified the YARN code to ignore the log aggregation status when removing a completed app.

> Log aggregation reset to NOT_START after RM restart.
> ----------------------------------------------------
>
> Key: YARN-10368
> URL: https://issues.apache.org/jira/browse/YARN-10368
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager, resourcemanager, yarn
> Affects Versions: 3.2.1
> Reporter: Anuj
> Priority: Major
> Attachments: Screenshot 2020-07-27 at 2.35.15 PM.png
>
> When an attempt is recovered after an RM restart, the log aggregation status is not preserved and is reset to NOT_START.
> From NOT_START it never moves to TIMED_OUT, so the completed RM app is never cleaned up from memory. As a result the max-completed-app in-memory limit is hit and the RM stops accepting new apps.
> https://issues.apache.org/jira/browse/YARN-7952

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.
[ https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180448#comment-17180448 ]

Anuj commented on YARN-10368:
-----------------------------

For now I have removed the log aggregation check during cleanup.
[jira] [Commented] (YARN-10368) Log aggregation reset to NOT_START after RM restart.
[ https://issues.apache.org/jira/browse/YARN-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165558#comment-17165558 ]

Anuj commented on YARN-10368:
-----------------------------

[~xgong] Can you please help me with this?
[jira] [Created] (YARN-10368) Log aggregation reset to NOT_START after RM restart.
Anuj created YARN-10368:
------------------------

Summary: Log aggregation reset to NOT_START after RM restart.
Key: YARN-10368
URL: https://issues.apache.org/jira/browse/YARN-10368
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager, resourcemanager, yarn
Affects Versions: 3.2.1
Reporter: Anuj
Attachments: Screenshot 2020-07-27 at 2.35.15 PM.png

When an attempt is recovered after an RM restart, the log aggregation status is not preserved and is reset to NOT_START.
From NOT_START it never moves to TIMED_OUT, so the completed RM app is never cleaned up from memory. As a result the max-completed-app in-memory limit is hit and the RM stops accepting new apps.
https://issues.apache.org/jira/browse/YARN-7952
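
The failure mode reported in YARN-10368 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the ResourceManager code: `LogAggStatus`, `CompletedAppStore`, and `LogAggregationDemo` are hypothetical names, the status names NOT_START and TIMED_OUT follow the wording of the issue, and the rule "a completed app is evicted only once its log aggregation status is terminal" is a simplification of the behaviour the description reports. The `ignoreLogAggregation` flag mimics the workaround mentioned in the comments.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy status enum (assumption); names follow the wording of the issue.
enum LogAggStatus {
    NOT_START, SUCCEEDED, TIMED_OUT;

    boolean isTerminal() {
        return this == SUCCEEDED || this == TIMED_OUT;
    }
}

// Hypothetical stand-in for the RM's in-memory list of completed apps.
class CompletedAppStore {
    private final int maxCompleted;
    private final boolean ignoreLogAggregation;
    // Insertion order is kept, so iteration visits the oldest app first.
    private final Map<String, LogAggStatus> completed = new LinkedHashMap<>();

    CompletedAppStore(int maxCompleted, boolean ignoreLogAggregation) {
        this.maxCompleted = maxCompleted;
        this.ignoreLogAggregation = ignoreLogAggregation;
    }

    void addCompleted(String appId, LogAggStatus status) {
        completed.put(appId, status);
        evictOldest();
    }

    // While over the limit, drop the oldest apps, but skip any app whose
    // log aggregation is not terminal unless the check is disabled.
    private void evictOldest() {
        Iterator<Map.Entry<String, LogAggStatus>> it = completed.entrySet().iterator();
        while (completed.size() > maxCompleted && it.hasNext()) {
            LogAggStatus status = it.next().getValue();
            if (ignoreLogAggregation || status.isTerminal()) {
                it.remove();
            }
        }
    }

    int size() {
        return completed.size();
    }
}

class LogAggregationDemo {
    // Adds `apps` completed apps that are all stuck in NOT_START.
    static int retainedAfter(int apps, int limit, boolean ignoreLogAggregation) {
        CompletedAppStore store = new CompletedAppStore(limit, ignoreLogAggregation);
        for (int i = 0; i < apps; i++) {
            store.addCompleted("app_" + i, LogAggStatus.NOT_START);
        }
        return store.size();
    }

    public static void main(String[] args) {
        System.out.println(retainedAfter(10, 3, false)); // prints 10: nothing is evicted
        System.out.println(retainedAfter(10, 3, true));  // prints 3: the limit is enforced
    }
}
```

In this model, apps stuck in NOT_START accumulate past the limit when the check is on, and the store stays within the limit once the check is skipped, at the cost of possibly dropping apps whose logs were never aggregated.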
[jira] [Commented] (YARN-9088) Non-exclusive labels break QueueMetrics
[ https://issues.apache.org/jira/browse/YARN-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062390#comment-17062390 ]

Anuj commented on YARN-9088:
----------------------------

We are facing a similar issue in our setup: the global view of pending and available resources gets messed up.

> Non-exclusive labels break QueueMetrics
> ---------------------------------------
>
> Key: YARN-9088
> URL: https://issues.apache.org/jira/browse/YARN-9088
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler, resourcemanager
> Affects Versions: 2.8.5
> Reporter: Brandon Scheller
> Priority: Major
> Labels: metrics, nodelabel
>
> QueueMetrics are broken (random/negative values) when non-exclusive labels are being used and unlabeled containers run on labeled nodes.
> This is caused by the change in the patch here:
> https://issues.apache.org/jira/browse/YARN-6467
> It assumes that a container's label will be the same as the node's label that it is running on.
> If you look within the patch, sometimes metrics are updated using the request.getNodeLabelExpression(). And sometimes they are updated using node.getPartition().
> This means that in the case where the node is labeled while the container request isn't, these metrics only get updated when referring to the default queue. This stops metrics from balancing out and results in incorrect and negative values in QueueMetrics.
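
The imbalance described in YARN-9088 can be sketched with a toy counter. This is an illustration under assumptions, not the QueueMetrics code: `DefaultPartitionMetrics` and `QueueMetricsDemo` are hypothetical names, "default queue" is read here as the default (empty) partition, and the choice of which event is keyed by the request label and which by the node partition is made up for the example. The issue only states that both keys are used.

```java
// Toy metric that is tracked for the default (empty) partition only.
class DefaultPartitionMetrics {
    static final String DEFAULT_PARTITION = "";

    private final boolean consistentKey;
    private int available;

    DefaultPartitionMetrics(int capacity, boolean consistentKey) {
        this.available = capacity;
        this.consistentKey = consistentKey;
    }

    // Updates are applied only when the key refers to the default partition.
    private void adjust(String partitionKey, int delta) {
        if (DEFAULT_PARTITION.equals(partitionKey)) {
            available += delta;
        }
    }

    // Assumption: allocation is keyed by the request's label expression.
    void onAllocate(String requestLabel, String nodePartition) {
        adjust(requestLabel, -1);
    }

    // Assumption: release is keyed by the node's partition unless
    // consistentKey is set, in which case the request label is used again.
    void onRelease(String requestLabel, String nodePartition) {
        adjust(consistentKey ? requestLabel : nodePartition, +1);
    }

    int available() {
        return available;
    }
}

class QueueMetricsDemo {
    // Runs allocate/release cycles for an unlabeled request on a node labeled "gpu".
    static int availableAfterCycles(int capacity, int cycles, boolean consistentKey) {
        DefaultPartitionMetrics metrics = new DefaultPartitionMetrics(capacity, consistentKey);
        for (int i = 0; i < cycles; i++) {
            metrics.onAllocate("", "gpu");
            metrics.onRelease("", "gpu");
        }
        return metrics.available();
    }

    public static void main(String[] args) {
        System.out.println(availableAfterCycles(4, 5, false)); // prints -1: releases never balance
        System.out.println(availableAfterCycles(4, 5, true));  // prints 4: both sides use one key
    }
}
```

With mixed keys, each cycle decrements the default-partition counter without ever restoring it, so the value drifts and eventually goes negative. Using the same key on both sides keeps the counter balanced.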
[jira] [Created] (YARN-9932) Nodelabel support for Fair Scheduler
Anuj created YARN-9932:
-----------------------

Summary: Nodelabel support for Fair Scheduler
Key: YARN-9932
URL: https://issues.apache.org/jira/browse/YARN-9932
Project: Hadoop YARN
Issue Type: New Feature
Components: fairscheduler, nodemanager, resourcemanager
Affects Versions: 3.2.1
Reporter: Anuj

Currently node labels only work with the Capacity Scheduler. We would like to have this working with the Fair Scheduler.
[jira] [Commented] (YARN-2497) Fair scheduler should support strict node labels
[ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926720#comment-16926720 ]

Anuj commented on YARN-2497:
----------------------------

Can we use this patch and apply it to Hadoop 3.0?

> Fair scheduler should support strict node labels
> ------------------------------------------------
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: fairscheduler
> Reporter: Wangda Tan
> Assignee: Daniel Templeton
> Priority: Major
> Attachments: YARN-2497.001.patch, YARN-2497.002.patch, YARN-2497.003.patch, YARN-2497.004.patch, YARN-2497.005.patch, YARN-2497.006.patch, YARN-2497.007.patch, YARN-2497.008.patch, YARN-2497.009.patch, YARN-2497.010.patch, YARN-2497.011.patch, YARN-2497.branch-3.0.001.patch, YARN-2499.WIP01.patch

--
This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (YARN-9241) Remove Scheduler specific if/else block and make it injectable in RMController
[ https://issues.apache.org/jira/browse/YARN-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anuj updated YARN-9241:
----------------------
Description:
RmController contains a hardcoded if and else block for type of scheduler and decides which page to use for which scheduler.
[https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com/]
This if else block makes it hard to introduce a new scheduler and corresponding webpage without modifying the existing RMController class.
It would be great if we make it extendable.

was:
RmController contains a hardcoded if and else block for type of scheduler and decides which page to use for which scheduler.
[https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com]
This if else block makes it hard to introduce a new scheduler and corresponding webpage with modifying the existing RMController class.
It would be great if we make it extendable.

> Remove Scheduler specific if/else block and make it injectable in RMController
> ------------------------------------------------------------------------------
>
> Key: YARN-9241
> URL: https://issues.apache.org/jira/browse/YARN-9241
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 3.2.0
> Reporter: Anuj
> Priority: Minor
>
> RmController contains a hardcoded if and else block for type of scheduler and decides which page to use for which scheduler.
> [https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com/]
> This if else block makes it hard to introduce a new scheduler and corresponding webpage without modifying the existing RMController class.
> It would be great if we make it extendable.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
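
One way to read the request in YARN-9241 is to replace the scheduler-specific if/else with a lookup that new schedulers can register into. The following is a minimal sketch under assumptions, not the RmController code and not a proposed patch: `SchedulerPage`, `SchedulerPageRegistry`, `SchedulerPageDemo`, the string keys, and the rendered page strings are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical page abstraction; real web pages are reduced to a string here.
interface SchedulerPage {
    String render();
}

// Hypothetical registry that a controller could consult instead of an if/else chain.
class SchedulerPageRegistry {
    private final Map<String, SchedulerPage> pages = new HashMap<>();
    private final SchedulerPage fallback;

    SchedulerPageRegistry(SchedulerPage fallback) {
        this.fallback = fallback;
    }

    // A new scheduler adds its page here; the lookup code does not change.
    void register(String schedulerName, SchedulerPage page) {
        pages.put(schedulerName, page);
    }

    // Returns the registered page, or the fallback for unknown schedulers.
    SchedulerPage pageFor(String schedulerName) {
        return pages.getOrDefault(schedulerName, fallback);
    }
}

class SchedulerPageDemo {
    static SchedulerPageRegistry buildRegistry() {
        SchedulerPageRegistry registry = new SchedulerPageRegistry(() -> "default scheduler page");
        registry.register("CapacityScheduler", () -> "capacity scheduler page");
        registry.register("FairScheduler", () -> "fair scheduler page");
        // A made-up third-party scheduler plugs in without touching the registry class.
        registry.register("MyScheduler", () -> "my scheduler page");
        return registry;
    }

    public static void main(String[] args) {
        SchedulerPageRegistry registry = buildRegistry();
        System.out.println(registry.pageFor("MyScheduler").render());      // prints "my scheduler page"
        System.out.println(registry.pageFor("UnknownScheduler").render()); // prints "default scheduler page"
    }
}
```

The design choice shown is that adding a scheduler becomes a registration call rather than an edit to the controller. How registration would be wired in practice (configuration, dependency injection, or something else) is left open here.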
[jira] [Updated] (YARN-9241) Remove Scheduler specific if/else block and make it injectable in RMController
[ https://issues.apache.org/jira/browse/YARN-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anuj updated YARN-9241:
----------------------
Description:
RmController contains a hardcoded if and else block for type of scheduler and decides which page to use for which scheduler.
[https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java|http://example.com]
This if else block makes it hard to introduce a new scheduler and corresponding webpage with modifying the existing RMController class.
It would be great if we make it extendable.

was:
RmController contains a hardcoded if and else block for type of scheduler and decides which page to use for which scheduler.
This if else block makes it hard to introduce a new scheduler and corresponding webpage with modifying the existing RMController class.
It would be great if we make it extendable.
[jira] [Updated] (YARN-9241) Remove Scheduler specific if/else block and make it injectable in RMController
[ https://issues.apache.org/jira/browse/YARN-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anuj updated YARN-9241:
----------------------
Summary: Remove Scheduler specific if/else block and make it injectable in RMController (was: Remove if else block from RmController.java)
[jira] [Commented] (YARN-9241) Remove Scheduler specific if/else block and make it injectable in RMController
[ https://issues.apache.org/jira/browse/YARN-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16753497#comment-16753497 ]

Anuj commented on YARN-9241:
----------------------------

Updated.