[jira] [Commented] (YARN-11115) Add configuration to disable AM preemption for capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-11115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541786#comment-17541786 ]

Junfan Zhang commented on YARN-11115:
-------------------------------------

Sorry for the late reply. Feel free to take it. [~groot] Looking forward to your patch.

> Add configuration to disable AM preemption for capacity scheduler
> -----------------------------------------------------------------
>
> Key: YARN-11115
> URL: https://issues.apache.org/jira/browse/YARN-11115
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Reporter: Yuan Luo
> Assignee: Ashutosh Gupta
> Priority: Major
>
> I think it's necessary to add configuration to disable AM preemption for
> capacity-scheduler, like fair-scheduler feature: YARN-9537.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
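No concrete property name is settled anywhere in this thread, so purely as a sketch, the requested switch could be a capacity-scheduler.xml entry along these lines (the property name below is hypothetical; the fair scheduler analogue from YARN-9537 is yarn.scheduler.fair.am.preemption):

```xml
<!-- Hypothetical capacity-scheduler.xml entry; the actual property name
     was not decided in this thread. -->
<property>
  <name>yarn.scheduler.capacity.skip-am-preemption</name>
  <value>true</value>
  <description>When true, ApplicationMaster containers are never selected
    as preemption candidates by the capacity scheduler.</description>
</property>
```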
[jira] [Commented] (YARN-11137) Improve log message in FederationClientInterceptor
[ https://issues.apache.org/jira/browse/YARN-11137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541759#comment-17541759 ]

Íñigo Goiri commented on YARN-11137:
------------------------------------

Thanks [~slfan1989] for the pull request. Merged PR 4336 to trunk.

> Improve log message in FederationClientInterceptor
> --------------------------------------------------
>
> Key: YARN-11137
> URL: https://issues.apache.org/jira/browse/YARN-11137
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: federation
> Affects Versions: 3.4.0
> Reporter: fanshilun
> Assignee: fanshilun
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: YARN-11137.01.patch, YARN-11137.02.patch
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> While reading the yarn-federation-router code, I found the following issue
> with the log methods in FederationClientInterceptor:
> The log methods are inconsistent; some use string concatenation and some
> use placeholders, as follows:
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor#getNewApplication
> {code:java}
> for (int i = 0; i < numSubmitRetries; ++i) {
>   SubClusterId subClusterId = getRandomActiveSubCluster(subClustersActive);
>   LOG.debug("getNewApplication try #{} on SubCluster {}", i, subClusterId);
>   ApplicationClientProtocol clientRMProxy =
>       getClientRMProxyForSubCluster(subClusterId);
>   ...
> }{code}
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor#submitApplication
> {code:java}
> for (int i = 0; i < numSubmitRetries; ++i) {
>   SubClusterId subClusterId = policyFacade.getHomeSubcluster(
>       request.getApplicationSubmissionContext(), blacklist);
>   LOG.info("submitApplication appId" + applicationId + " try #" + i
>       + " on SubCluster " + subClusterId);
>   ...
> } {code}
> I think the first way (placeholders) is better.
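The two styles quoted above differ in more than looks: with the placeholder style, the message string is only assembled when the log level is actually enabled, while concatenation always pays the formatting cost. A minimal self-contained sketch of the mechanism (a toy logger, not the actual SLF4J implementation):

```java
// Toy demonstration of placeholder-style logging: formatting is deferred
// until the level check passes. This mimics SLF4J behavior but is NOT the
// real SLF4J code.
public class PlaceholderLogDemo {
    static boolean debugEnabled = false;

    // Substitute each "{}" in the template with the next argument, in order.
    static String format(String template, Object... args) {
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break;                       // more args than placeholders
            }
            sb.append(template, from, at).append(arg);
            from = at + 2;
        }
        return sb.append(template.substring(from)).toString();
    }

    // Returns the rendered message, or null when the level is disabled --
    // in that case the arguments are never formatted at all.
    static String debug(String template, Object... args) {
        if (!debugEnabled) {
            return null;
        }
        return format(template, args);
    }

    public static void main(String[] args) {
        debugEnabled = true;
        // prints: getNewApplication try #1 on SubCluster sc-0
        System.out.println(
            debug("getNewApplication try #{} on SubCluster {}", 1, "sc-0"));
    }
}
```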
[jira] [Commented] (YARN-10087) ATS possible NPE on REST API when data is missing
[ https://issues.apache.org/jira/browse/YARN-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541682#comment-17541682 ]

Samrat Deb commented on YARN-10087:
-----------------------------------

[~tanu.ajmera], if you are not actively working on this bug, can I pick it up? As a newbie, it would help me dive deep into the code and understand how the ATS works.

> ATS possible NPE on REST API when data is missing
> -------------------------------------------------
>
> Key: YARN-10087
> URL: https://issues.apache.org/jira/browse/YARN-10087
> Project: Hadoop YARN
> Issue Type: Bug
> Components: ATSv2
> Reporter: Wilfred Spiegelenburg
> Assignee: Tanu Ajmera
> Priority: Major
> Labels: newbie
> Attachments: ats_stack.txt
>
> If the data stored by the ATS is not complete, REST calls to the ATS can
> return an NPE instead of results:
> {{{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException"}}}
> The issue shows up when the ATS was down for a short period and new
> applications were started in that time. This causes certain parts of the
> application data to be missing in the ATS store. In most cases this is not
> a problem and data will be returned, but when you start filtering data the
> filtering fails, throwing the NPE.
> In this case the request was for:
> {{http://:8188/ws/v1/applicationhistory/apps?user=hive'}}
> If certain pieces of data are missing, the ATS should not even consider
> returning that data, filtered or not. We should not display partial or
> incomplete data.
> In case of missing user information, ACL checks cannot be performed
> correctly and we could see more issues.
> A similar issue was fixed in YARN-7118 where the queue details were missing.
> It just _skips_ the app to prevent the NPE, but that is not the correct
> thing to do when the user is missing.
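As a sketch of the guard being discussed (all type and field names here are illustrative stand-ins, not the real ATS classes), filtering can tolerate incomplete records by excluding them explicitly instead of dereferencing a null field. This mirrors the YARN-7118 skip approach, which the reporter notes may not be sufficient when the missing field is the user:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: skip records with missing required fields rather
// than letting a user filter throw a NullPointerException.
public class AppFilterDemo {
    static class AppReport {          // stand-in for the real ATS report type
        final String id;
        final String user;            // may be null if the ATS missed events
        AppReport(String id, String user) { this.id = id; this.user = user; }
    }

    // Return only apps whose user matches; apps with a missing user are
    // excluded entirely, since ACL checks cannot be performed for them.
    static List<AppReport> filterByUser(List<AppReport> apps, String user) {
        return apps.stream()
            .filter(a -> a.user != null && a.user.equals(user))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<AppReport> apps = List.of(
            new AppReport("app_1", "hive"),
            new AppReport("app_2", null),    // incomplete record: no NPE
            new AppReport("app_3", "hive"));
        System.out.println(filterByUser(apps, "hive").size()); // prints 2
    }
}
```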
[jira] [Commented] (YARN-11072) Can not display hadoop-st.png when using reverse proxy and applying APPLICATION_WEB_PROXY_BASE
[ https://issues.apache.org/jira/browse/YARN-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541678#comment-17541678 ]

Samrat Deb commented on YARN-11072:
-----------------------------------

Hi [~bensonlin321], are you actively working on this issue?

> Can not display hadoop-st.png when using reverse proxy and applying
> APPLICATION_WEB_PROXY_BASE
> ------------------------------------------------------------------
>
> Key: YARN-11072
> URL: https://issues.apache.org/jira/browse/YARN-11072
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 3.4.0
> Reporter: BensonLin
> Priority: Trivial
> Labels: newbie, pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Scenario:
> When I use a reverse proxy and apply APPLICATION_WEB_PROXY_BASE to change
> the base path, the page cannot find hadoop-st.png.
> (file:
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/HeaderBlock.java)
> Issues:
> 1. Cannot display the image when you apply
> APPLICATION_WEB_PROXY_BASE=base_path/.
> The image path should be /base_path/static/hadoop-st.png.
> 2. Should not use a fixed path.
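The fix the report asks for amounts to prefixing the static image path with the configured proxy base instead of hard-coding it. A minimal sketch under that assumption (the method and parameter names are illustrative, not the actual HeaderBlock API):

```java
// Hypothetical sketch: build the logo URL relative to the proxy base
// instead of the fixed "/static/hadoop-st.png" path.
public class LogoPathDemo {
    static String logoPath(String proxyBase) {
        String base = (proxyBase == null) ? "" : proxyBase;
        // Normalize so there is exactly one '/' between base and the path.
        if (base.endsWith("/")) {
            base = base.substring(0, base.length() - 1);
        }
        return base + "/static/hadoop-st.png";
    }

    public static void main(String[] args) {
        System.out.println(logoPath(""));            // prints /static/hadoop-st.png
        System.out.println(logoPath("/base_path/")); // prints /base_path/static/hadoop-st.png
    }
}
```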
[jira] [Assigned] (YARN-11115) Add configuration to disable AM preemption for capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-11115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Gupta reassigned YARN-11115:
-------------------------------------

Assignee: Ashutosh Gupta (was: Junfan Zhang)

> Add configuration to disable AM preemption for capacity scheduler
> -----------------------------------------------------------------
>
> Key: YARN-11115
> URL: https://issues.apache.org/jira/browse/YARN-11115
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Reporter: Yuan Luo
> Assignee: Ashutosh Gupta
> Priority: Major
>
> I think it's necessary to add configuration to disable AM preemption for
> capacity-scheduler, like fair-scheduler feature: YARN-9537.
[jira] [Commented] (YARN-11115) Add configuration to disable AM preemption for capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-11115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541500#comment-17541500 ]

Ashutosh Gupta commented on YARN-11115:
---------------------------------------

Taking it up.

> Add configuration to disable AM preemption for capacity scheduler
> -----------------------------------------------------------------
>
> Key: YARN-11115
> URL: https://issues.apache.org/jira/browse/YARN-11115
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Reporter: Yuan Luo
> Assignee: Junfan Zhang
> Priority: Major
>
> I think it's necessary to add configuration to disable AM preemption for
> capacity-scheduler, like fair-scheduler feature: YARN-9537.
[jira] [Commented] (YARN-4858) start-yarn and stop-yarn scripts to support timeline and sharedcachemanager
[ https://issues.apache.org/jira/browse/YARN-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541333#comment-17541333 ]

Masatake Iwasaki commented on YARN-4858:
----------------------------------------

Updated the target version in preparation for the 2.10.2 release.

> start-yarn and stop-yarn scripts to support timeline and sharedcachemanager
> ---------------------------------------------------------------------------
>
> Key: YARN-4858
> URL: https://issues.apache.org/jira/browse/YARN-4858
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: scripts
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Labels: oct16-easy
> Attachments: YARN-4858-001.patch, YARN-4858-branch-2.001.patch
>
> The start-yarn and stop-yarn scripts don't have any (even commented out)
> support for the timeline server and sharedcachemanager.
> Proposed:
> * bash and cmd start-yarn scripts have commented-out start actions
> * stop-yarn scripts stop the servers.
[jira] [Updated] (YARN-4858) start-yarn and stop-yarn scripts to support timeline and sharedcachemanager
[ https://issues.apache.org/jira/browse/YARN-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-4858:
-----------------------------------

Target Version/s: 2.10.3 (was: 2.10.2)

> start-yarn and stop-yarn scripts to support timeline and sharedcachemanager
> ---------------------------------------------------------------------------
>
> Key: YARN-4858
> URL: https://issues.apache.org/jira/browse/YARN-4858
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: scripts
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Labels: oct16-easy
> Attachments: YARN-4858-001.patch, YARN-4858-branch-2.001.patch
>
> The start-yarn and stop-yarn scripts don't have any (even commented out)
> support for the timeline server and sharedcachemanager.
> Proposed:
> * bash and cmd start-yarn scripts have commented-out start actions
> * stop-yarn scripts stop the servers.
[jira] [Updated] (YARN-8118) Better utilize gracefully decommissioning node managers
[ https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-8118:
-----------------------------------

Target Version/s: 2.10.3 (was: 2.10.2)

> Better utilize gracefully decommissioning node managers
> -------------------------------------------------------
>
> Key: YARN-8118
> URL: https://issues.apache.org/jira/browse/YARN-8118
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Affects Versions: 2.8.2
> Environment: * Google Compute Engine (Dataproc)
> * Java 8
> * Hadoop 2.8.2 using client-mode graceful decommissioning
> Reporter: Karthik Palaniappan
> Assignee: Karthik Palaniappan
> Priority: Major
> Attachments: YARN-8118-branch-2.001.patch
>
> Proposal design doc with background + details (please comment directly on
> the doc):
> [https://docs.google.com/document/d/1hF2Bod_m7rPgSXlunbWGn1cYi3-L61KvQhPlY9Jk9Hk/edit#heading=h.ab4ufqsj47b7]
> tl;dr Right now, DECOMMISSIONING nodes must wait for in-progress
> applications to complete before shutting down, but they cannot run new
> containers from those in-progress applications. This is wasteful,
> particularly in environments where you are billed by resource usage
> (e.g. EC2).
> Proposal: YARN should schedule containers from in-progress applications on
> DECOMMISSIONING nodes, but should still avoid scheduling containers from
> new applications. That will make in-progress applications complete faster
> and let nodes decommission faster. Overall, this should be cheaper.
> I have a working patch without unit tests that's surprisingly just a few
> real lines of code (patch 001). If folks are happy with the proposal, I'll
> write unit tests and also write a patch targeted at trunk.
[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers
[ https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541330#comment-17541330 ]

Masatake Iwasaki commented on YARN-8118:
----------------------------------------

Updated the target version in preparation for the 2.10.2 release.

> Better utilize gracefully decommissioning node managers
> -------------------------------------------------------
>
> Key: YARN-8118
> URL: https://issues.apache.org/jira/browse/YARN-8118
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Affects Versions: 2.8.2
> Environment: * Google Compute Engine (Dataproc)
> * Java 8
> * Hadoop 2.8.2 using client-mode graceful decommissioning
> Reporter: Karthik Palaniappan
> Assignee: Karthik Palaniappan
> Priority: Major
> Attachments: YARN-8118-branch-2.001.patch
>
> Proposal design doc with background + details (please comment directly on
> the doc):
> [https://docs.google.com/document/d/1hF2Bod_m7rPgSXlunbWGn1cYi3-L61KvQhPlY9Jk9Hk/edit#heading=h.ab4ufqsj47b7]
> tl;dr Right now, DECOMMISSIONING nodes must wait for in-progress
> applications to complete before shutting down, but they cannot run new
> containers from those in-progress applications. This is wasteful,
> particularly in environments where you are billed by resource usage
> (e.g. EC2).
> Proposal: YARN should schedule containers from in-progress applications on
> DECOMMISSIONING nodes, but should still avoid scheduling containers from
> new applications. That will make in-progress applications complete faster
> and let nodes decommission faster. Overall, this should be cheaper.
> I have a working patch without unit tests that's surprisingly just a few
> real lines of code (patch 001). If folks are happy with the proposal, I'll
> write unit tests and also write a patch targeted at trunk.
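The proposal quoted above reduces to a single scheduling predicate: a DECOMMISSIONING node may still accept containers, but only for applications already running on the cluster. A sketch under that reading (enum and method names are illustrative, not the actual CapacityScheduler API):

```java
// Illustrative sketch of the YARN-8118 scheduling rule; names are
// hypothetical stand-ins for the real scheduler types.
public class DecommissionScheduleDemo {
    enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    // Decide whether a container request may be placed on this node.
    static boolean mayAllocate(NodeState state, boolean appAlreadyRunning) {
        switch (state) {
            case RUNNING:
                return true;                  // any application is fine
            case DECOMMISSIONING:
                return appAlreadyRunning;     // only in-progress apps
            default:
                return false;                 // node is already gone
        }
    }

    public static void main(String[] args) {
        // In-progress app on a draining node: allowed, finishes faster.
        System.out.println(mayAllocate(NodeState.DECOMMISSIONING, true));  // prints true
        // New app on a draining node: refused, so the node can still drain.
        System.out.println(mayAllocate(NodeState.DECOMMISSIONING, false)); // prints false
    }
}
```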
[jira] [Commented] (YARN-9770) Create a queue ordering policy which picks child queues with equal probability
[ https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541321#comment-17541321 ]

Masatake Iwasaki commented on YARN-9770:
----------------------------------------

Updated the target version in preparation for the 2.10.2 release.

> Create a queue ordering policy which picks child queues with equal probability
> ------------------------------------------------------------------------------
>
> Key: YARN-9770
> URL: https://issues.apache.org/jira/browse/YARN-9770
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Jonathan Hung
> Assignee: Jonathan Hung
> Priority: Major
> Attachments: YARN-9770.001.patch, YARN-9770.002.patch,
> YARN-9770.003.patch, activeUsers_overlay.png
>
> Ran some simulations with the default queue_utilization_ordering_policy:
> An underutilized queue which receives an application with many (thousands)
> resource requests will hog scheduler allocations for a long time (on the
> order of a minute). In the meantime apps are getting submitted to all other
> queues, which increases activeUsers in these queues, which drops user limit
> in these queues to small values if minimum-user-limit-percent is configured
> to small values (e.g. 10%).
> To avoid this issue, we assign to queues with equal probability, to avoid
> scenarios where queues don't get allocations for a long time.
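The core of the policy described above is a uniform random pick among child queues, so no queue waits behind a request-heavy neighbor. A minimal sketch with illustrative names (not the actual QueueOrderingPolicy interface):

```java
import java.util.List;
import java.util.Random;

// Illustrative sketch of an equal-probability child-queue pick; the real
// policy would implement the scheduler's ordering-policy interface.
public class RandomQueuePickDemo {
    // Every child queue is chosen with probability 1/N per scheduling pass,
    // regardless of how many outstanding requests each queue holds.
    static String pickQueue(List<String> childQueues, Random rng) {
        return childQueues.get(rng.nextInt(childQueues.size()));
    }

    public static void main(String[] args) {
        List<String> queues = List.of("root.a", "root.b", "root.c");
        String picked = pickQueue(queues, new Random());
        System.out.println(queues.contains(picked)); // prints true
    }
}
```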
[jira] [Commented] (YARN-9736) Recursively configure app ordering policies
[ https://issues.apache.org/jira/browse/YARN-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541322#comment-17541322 ]

Masatake Iwasaki commented on YARN-9736:
----------------------------------------

Updated the target version in preparation for the 2.10.2 release.

> Recursively configure app ordering policies
> -------------------------------------------
>
> Key: YARN-9736
> URL: https://issues.apache.org/jira/browse/YARN-9736
> Project: Hadoop YARN
> Issue Type: Task
> Reporter: Jonathan Hung
> Assignee: Jonathan Hung
> Priority: Major
> Attachments: YARN-9736.001.patch
>
> Currently the app ordering policy will find confs with the prefix
> {{.ordering-policy}}. For queues with the same ordering policy
> configuration it would be easier to have a queue inherit confs from its
> parent.
[jira] [Updated] (YARN-9736) Recursively configure app ordering policies
[ https://issues.apache.org/jira/browse/YARN-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-9736:
-----------------------------------

Target Version/s: 2.10.3 (was: 2.10.2)

> Recursively configure app ordering policies
> -------------------------------------------
>
> Key: YARN-9736
> URL: https://issues.apache.org/jira/browse/YARN-9736
> Project: Hadoop YARN
> Issue Type: Task
> Reporter: Jonathan Hung
> Assignee: Jonathan Hung
> Priority: Major
> Attachments: YARN-9736.001.patch
>
> Currently the app ordering policy will find confs with the prefix
> {{.ordering-policy}}. For queues with the same ordering policy
> configuration it would be easier to have a queue inherit confs from its
> parent.
[jira] [Updated] (YARN-9770) Create a queue ordering policy which picks child queues with equal probability
[ https://issues.apache.org/jira/browse/YARN-9770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-9770:
-----------------------------------

Target Version/s: 2.10.3 (was: 2.10.2)

> Create a queue ordering policy which picks child queues with equal probability
> ------------------------------------------------------------------------------
>
> Key: YARN-9770
> URL: https://issues.apache.org/jira/browse/YARN-9770
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Jonathan Hung
> Assignee: Jonathan Hung
> Priority: Major
> Attachments: YARN-9770.001.patch, YARN-9770.002.patch,
> YARN-9770.003.patch, activeUsers_overlay.png
>
> Ran some simulations with the default queue_utilization_ordering_policy:
> An underutilized queue which receives an application with many (thousands)
> resource requests will hog scheduler allocations for a long time (on the
> order of a minute). In the meantime apps are getting submitted to all other
> queues, which increases activeUsers in these queues, which drops user limit
> in these queues to small values if minimum-user-limit-percent is configured
> to small values (e.g. 10%).
> To avoid this issue, we assign to queues with equal probability, to avoid
> scenarios where queues don't get allocations for a long time.
[jira] [Updated] (YARN-9869) Create scheduling policy to auto-adjust queue elasticity based on cluster demand
[ https://issues.apache.org/jira/browse/YARN-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated YARN-9869:
-----------------------------------

Target Version/s: 2.10.3 (was: 2.10.2)

> Create scheduling policy to auto-adjust queue elasticity based on cluster
> demand
> -------------------------------------------------------------------------
>
> Key: YARN-9869
> URL: https://issues.apache.org/jira/browse/YARN-9869
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Jonathan Hung
> Priority: Major
>
> Currently LinkedIn has a policy to auto-adjust queue elasticity based on
> real-time queue demand. We've been running this policy in production for a
> long time and it has helped improve overall cluster utilization.
[jira] [Commented] (YARN-9869) Create scheduling policy to auto-adjust queue elasticity based on cluster demand
[ https://issues.apache.org/jira/browse/YARN-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541320#comment-17541320 ]

Masatake Iwasaki commented on YARN-9869:
----------------------------------------

Updated the target version in preparation for the 2.10.2 release.

> Create scheduling policy to auto-adjust queue elasticity based on cluster
> demand
> -------------------------------------------------------------------------
>
> Key: YARN-9869
> URL: https://issues.apache.org/jira/browse/YARN-9869
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Jonathan Hung
> Priority: Major
>
> Currently LinkedIn has a policy to auto-adjust queue elasticity based on
> real-time queue demand. We've been running this policy in production for a
> long time and it has helped improve overall cluster utilization.