[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516619#comment-17516619 ] VADAGA ANANYO RAO commented on YARN-10559: -- Hi [~bteke] , please feel free to take this task over. Thank you > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, > YARN-10559.0008.patch, YARN-10559.0009.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10878) TestNMSimulator imports com.google.common.base.Supplier;
[ https://issues.apache.org/jira/browse/YARN-10878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10878: - Attachment: YARN-10878.0001.patch > TestNMSimulator imports com.google.common.base.Supplier; > > > Key: YARN-10878 > URL: https://issues.apache.org/jira/browse/YARN-10878 > Project: Hadoop YARN > Issue Type: Bug > Components: buid >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: VADAGA ANANYO RAO >Priority: Major > Labels: pull-request-available > Attachments: YARN-10878.0001.patch > > Time Spent: 20m > Remaining Estimate: 0h > > TestNMSimulator imports com.google.common.base.Supplier; every build has the > source code patched to fix this, so its creating a false change in builds, > complicating other work, etc etc > the changed file should just be merged -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10878) TestNMSimulator imports com.google.common.base.Supplier;
[ https://issues.apache.org/jira/browse/YARN-10878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO reassigned YARN-10878: Assignee: VADAGA ANANYO RAO > TestNMSimulator imports com.google.common.base.Supplier; > > > Key: YARN-10878 > URL: https://issues.apache.org/jira/browse/YARN-10878 > Project: Hadoop YARN > Issue Type: Bug > Components: buid >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: VADAGA ANANYO RAO >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > TestNMSimulator imports com.google.common.base.Supplier; every build has the > source code patched to fix this, so its creating a false change in builds, > complicating other work, etc etc > the changed file should just be merged -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10628) Add node usage metrics in SLS
[ https://issues.apache.org/jira/browse/YARN-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10628: - Attachment: YARN-10628.0004.patch > Add node usage metrics in SLS > - > > Key: YARN-10628 > URL: https://issues.apache.org/jira/browse/YARN-10628 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler-load-simulator >Affects Versions: 3.3.1 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png, > YARN-10628.0001.patch, YARN-10628.0002.patch, YARN-10628.0003.patch, > YARN-10628.0004.patch > > Original Estimate: 336h > Remaining Estimate: 336h > > Given the work around container packing going on in YARN schedulers, it would > be beneficial to have charts showing the usage per node in SLS. This will > help to improve container packing algorithms for more efficient packings. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10663) Add runningApps stats in SLS
[ https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10663: - Attachment: YARN-10663.0002.patch > Add runningApps stats in SLS > > > Key: YARN-10663 > URL: https://issues.apache.org/jira/browse/YARN-10663 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: YARN-10663.0001.patch, YARN-10663.0002.patch > > > RMNodes in SLS don't keep a track of runningApps on each node. Due to this, > graceful decommissioning logic takes a hit as the nodes will decommission if > there are no running containers on the node but some shuffle data was present > on the node. > In this Jira, we will add runningApps functionality in SLS for improving > decommissioning logic of each node. This will help with autoscaling > simulations on SLS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319929#comment-17319929 ]

VADAGA ANANYO RAO commented on YARN-10559:
--

[~epayne], sorry for seeing your comment so late. We have tested this feature with following configs:

Queue Properties:
'yarn.scheduler.capacity..ordering-policy': 'fair'

Scheduler configurations:
'yarn.resourcemanager.scheduler.monitor.enable': 'true',
'yarn.resourcemanager.scheduler.monitor.policies' : 'org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy',
'yarn.resourcemanager.monitor.capacity.preemption.intra-queue-preemption.enabled': 'true'

Post this, we submit job1 from user1 to a leaf queue. When job1 completely uses up the queue capacity, we trigger job2 from user1 to the same leaf queue. We can observe preemption kicking in for job2 from job1.

I am not sure of the exact error you are facing. If you can provide some more details of the problems you are facing, I can try and help out with it.

Thank you :)

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacityscheduler
> Affects Versions: 3.1.4
> Reporter: VADAGA ANANYO RAO
> Assignee: VADAGA ANANYO RAO
> Priority: Major
> Attachments: FairOP_preemption-design_doc_v1.pdf,
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch,
> YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch,
> YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch,
> YARN-10559.0008.patch, YARN-10559.0009.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Usecase:
> Due to the way Capacity Scheduler preemption works, If a single user submits
> a large application to a queue (using 100% of resources), that job will not
> be preempted by future applications from the same user within the same queue.
> This implies that the later applications will be forced to wait for
> completion of the long running application. This prevents multiple long
> running, large, applications from running concurrently.
> Support fair sharing among apps while preempting applications from same queue.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
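For readers who want to reproduce the test setup described in the comment above, the four properties can be collected in one place. The following is a minimal sketch, not code from any attached patch: it uses a plain `Map` rather than Hadoop's configuration classes, and the class name, the `build` method, and the queue path `root.default` are all hypothetical. The queue segment of the ordering-policy key is elided in the original comment, so the sketch takes it as a parameter instead of guessing it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: gathers the properties quoted in the comment above. */
class IntraQueuePreemptionConfigSketch {

    /**
     * Builds the property map for one leaf queue.
     * queuePath is a hypothetical placeholder such as "root.default"; the
     * original comment does not name the queue.
     */
    static Map<String, String> build(String queuePath) {
        Map<String, String> props = new LinkedHashMap<>();
        // Queue property: fair ordering policy on the leaf queue.
        props.put("yarn.scheduler.capacity." + queuePath + ".ordering-policy", "fair");
        // Scheduler properties: enable the monitor and the preemption policy.
        props.put("yarn.resourcemanager.scheduler.monitor.enable", "true");
        props.put("yarn.resourcemanager.scheduler.monitor.policies",
            "org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity."
                + "ProportionalCapacityPreemptionPolicy");
        props.put("yarn.resourcemanager.monitor.capacity.preemption."
            + "intra-queue-preemption.enabled", "true");
        return props;
    }

    public static void main(String[] args) {
        // Print the properties in insertion order for a hypothetical queue.
        build("root.default").forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

In a real cluster these values would normally live in capacity-scheduler.xml and yarn-site.xml; the map form is only meant to make the full set easy to see at a glance.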
[jira] [Updated] (YARN-10628) Add node usage metrics in SLS
[ https://issues.apache.org/jira/browse/YARN-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10628: - Attachment: YARN-10628.0003.patch > Add node usage metrics in SLS > - > > Key: YARN-10628 > URL: https://issues.apache.org/jira/browse/YARN-10628 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler-load-simulator >Affects Versions: 3.3.1 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png, > YARN-10628.0001.patch, YARN-10628.0002.patch, YARN-10628.0003.patch > > Original Estimate: 336h > Remaining Estimate: 336h > > Given the work around container packing going on in YARN schedulers, it would > be beneficial to have charts showing the usage per node in SLS. This will > help to improve container packing algorithms for more efficient packings. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10628) Add node usage metrics in SLS
[ https://issues.apache.org/jira/browse/YARN-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10628: - Attachment: YARN-10628.0002.patch > Add node usage metrics in SLS > - > > Key: YARN-10628 > URL: https://issues.apache.org/jira/browse/YARN-10628 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler-load-simulator >Affects Versions: 3.3.1 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png, > YARN-10628.0001.patch, YARN-10628.0002.patch > > Original Estimate: 336h > Remaining Estimate: 336h > > Given the work around container packing going on in YARN schedulers, it would > be beneficial to have charts showing the usage per node in SLS. This will > help to improve container packing algorithms for more efficient packings. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10663) Add runningApps stats in SLS
[ https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293751#comment-17293751 ]

VADAGA ANANYO RAO commented on YARN-10663:
--

Recap of how the actual implementation handles running and finished apps on each node:
# Each RMNode has a list of runningApplications. Running apps are active apps which have run some container on that node.
# Each RMAppImpl maintains `ranNodes`, which holds all the nodes on which the app has run containers.
# When the app is at its `FinalTransition`, the app iterates over all the ranNodes and triggers a `RMNodeCleanupAppEvent` for each node.
# RMNodeImpl handles RMNodeCleanupAppEvent by moving the app from the `runningApplications` list to the `finishedApplications` list.

Based on this flow, I plan to:
# Add a `ranNodes` list in AMSimulator.
# Each time AMSimulator starts a container on a node (NMSimulator), we will:
## update the runningApps in the NMSimulator and,
## update the ranNodes in the AMSimulator
# When the app is finishing, for each node in the ranNodes list in AMSimulator, we will remove the app from the runningApps list of that node.

> Add runningApps stats in SLS
> 
>
> Key: YARN-10663
> URL: https://issues.apache.org/jira/browse/YARN-10663
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Reporter: VADAGA ANANYO RAO
> Assignee: VADAGA ANANYO RAO
> Priority: Major
> Attachments: YARN-10663.0001.patch
>
>
> RMNodes in SLS don't keep track of runningApps on each node. Due to this,
> graceful decommissioning logic takes a hit as the nodes will decommission if
> there are no running containers on the node but some shuffle data was present
> on the node.
> In this Jira, we will add runningApps functionality in SLS for improving
> decommissioning logic of each node. This will help with autoscaling
> simulations on SLS.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
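The bookkeeping planned in the comment above can be sketched in a few lines. This is a simplified model under stated assumptions, not the contents of YARN-10663.0001.patch: `NodeSim` and `AppSim` are hypothetical stand-ins for NMSimulator and AMSimulator, application IDs are plain strings, and no events or threads are involved.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative model of the planned runningApps / ranNodes bookkeeping. */
class RunningAppsSketch {

    /** Hypothetical stand-in for NMSimulator: apps that have run a container here. */
    static final class NodeSim {
        final String host;
        final Set<String> runningApps = new LinkedHashSet<>();

        NodeSim(String host) {
            this.host = host;
        }
    }

    /** Hypothetical stand-in for AMSimulator: remembers every node the app used. */
    static final class AppSim {
        final String appId;
        final Set<NodeSim> ranNodes = new LinkedHashSet<>();

        AppSim(String appId) {
            this.appId = appId;
        }

        /** Planned step 2: record the app on the node and the node on the app. */
        void startContainer(NodeSim node) {
            node.runningApps.add(appId);
            ranNodes.add(node);
        }

        /** Planned step 3: on finish, remove the app from every node it ran on. */
        void finish() {
            for (NodeSim node : ranNodes) {
                node.runningApps.remove(appId);
            }
            ranNodes.clear();
        }
    }

    public static void main(String[] args) {
        NodeSim n1 = new NodeSim("node1");
        NodeSim n2 = new NodeSim("node2");
        AppSim app = new AppSim("app_1");
        app.startContainer(n1);
        app.startContainer(n2);
        System.out.println(n1.runningApps + " " + n2.runningApps);
        app.finish();
        System.out.println(n1.runningApps + " " + n2.runningApps);
    }
}
```

With this in place, a node whose `runningApps` set is non-empty can be treated as still in use by the decommissioning logic even when it has no running containers, which is the case the issue description is concerned with.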
[jira] [Assigned] (YARN-9312) NPE while rendering SLS simulate page
[ https://issues.apache.org/jira/browse/YARN-9312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

VADAGA ANANYO RAO reassigned YARN-9312:
---

Assignee: VADAGA ANANYO RAO

> NPE while rendering SLS simulate page
> -
>
> Key: YARN-9312
> URL: https://issues.apache.org/jira/browse/YARN-9312
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin Chundatt
> Assignee: VADAGA ANANYO RAO
> Priority: Minor
>
> http://localhost:10001/simulate
> {code}
> java.lang.NullPointerException
>     at org.apache.hadoop.yarn.sls.web.SLSWebApp.printPageSimulate(SLSWebApp.java:240)
>     at org.apache.hadoop.yarn.sls.web.SLSWebApp.access$100(SLSWebApp.java:55)
>     at org.apache.hadoop.yarn.sls.web.SLSWebApp$1.handle(SLSWebApp.java:152)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>     at org.eclipse.jetty.server.Server.handle(Server.java:539)
>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>     at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>     at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10663) Add runningApps stats in SLS
[ https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10663: - Attachment: (was: YARN-10663.0001.patch) > Add runningApps stats in SLS > > > Key: YARN-10663 > URL: https://issues.apache.org/jira/browse/YARN-10663 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: YARN-10663.0001.patch > > > RMNodes in SLS don't keep a track of runningApps on each node. Due to this, > graceful decommissioning logic takes a hit as the nodes will decommission if > there are no running containers on the node but some shuffle data was present > on the node. > In this Jira, we will add runningApps functionality in SLS for improving > decommissioning logic of each node. This will help with autoscaling > simulations on SLS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10663) Add runningApps stats in SLS
[ https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10663: - Attachment: YARN-10663.0001.patch > Add runningApps stats in SLS > > > Key: YARN-10663 > URL: https://issues.apache.org/jira/browse/YARN-10663 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: YARN-10663.0001.patch > > > RMNodes in SLS don't keep a track of runningApps on each node. Due to this, > graceful decommissioning logic takes a hit as the nodes will decommission if > there are no running containers on the node but some shuffle data was present > on the node. > In this Jira, we will add runningApps functionality in SLS for improving > decommissioning logic of each node. This will help with autoscaling > simulations on SLS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10663) Add runningApps stats in SLS
VADAGA ANANYO RAO created YARN-10663: Summary: Add runningApps stats in SLS Key: YARN-10663 URL: https://issues.apache.org/jira/browse/YARN-10663 Project: Hadoop YARN Issue Type: Improvement Components: yarn Reporter: VADAGA ANANYO RAO Assignee: VADAGA ANANYO RAO RMNodes in SLS don't keep a track of runningApps on each node. Due to this, graceful decommissioning logic takes a hit as the nodes will decommission if there are no running containers on the node but some shuffle data was present on the node. In this Jira, we will add runningApps functionality in SLS for improving decommissioning logic of each node. This will help with autoscaling simulations on SLS. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10628) Add node usage metrics in SLS
[ https://issues.apache.org/jira/browse/YARN-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10628: - Attachment: YARN-10628.0001.patch > Add node usage metrics in SLS > - > > Key: YARN-10628 > URL: https://issues.apache.org/jira/browse/YARN-10628 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler-load-simulator >Affects Versions: 3.3.1 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png, > YARN-10628.0001.patch > > Original Estimate: 336h > Remaining Estimate: 336h > > Given the work around container packing going on in YARN schedulers, it would > be beneficial to have charts showing the usage per node in SLS. This will > help to improve container packing algorithms for more efficient packings. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10628) Add node usage metrics in SLS
VADAGA ANANYO RAO created YARN-10628: Summary: Add node usage metrics in SLS Key: YARN-10628 URL: https://issues.apache.org/jira/browse/YARN-10628 Project: Hadoop YARN Issue Type: Improvement Components: scheduler-load-simulator Affects Versions: 3.3.1 Reporter: VADAGA ANANYO RAO Assignee: VADAGA ANANYO RAO Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png Given the work around container packing going on in YARN schedulers, it would be beneficial to have charts showing the usage per node in SLS. This will help to improve container packing algorithms for more efficient packings. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached
[ https://issues.apache.org/jira/browse/YARN-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10617: - Attachment: YARN-10617.0001.patch > Fifo and Fair intra-queue preemption goes on indefinitely when apps are in > pending state due to max AM limit reached > > > Key: YARN-10617 > URL: https://issues.apache.org/jira/browse/YARN-10617 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Affects Versions: 3.1.1 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: YARN-10617.0001.patch > > > This case occurs when: > 1. an application gets submitted in a cluster running at max-AM limit. > 2. The new job requests AM resource. So it has 1 pending request. > 3. To fulfil this request, the preemption logic preempts 1 resource from a > running app. > 4. Because the cluster is at max-AM limit, the scheduler re-assigns the > preempted container back to the running app. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached
[ https://issues.apache.org/jira/browse/YARN-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10617: - Attachment: (was: YARN-10617.patch) > Fifo and Fair intra-queue preemption goes on indefinitely when apps are in > pending state due to max AM limit reached > > > Key: YARN-10617 > URL: https://issues.apache.org/jira/browse/YARN-10617 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Affects Versions: 3.1.1 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: YARN-10617.0001.patch > > > This case occurs when: > 1. an application gets submitted in a cluster running at max-AM limit. > 2. The new job requests AM resource. So it has 1 pending request. > 3. To fulfil this request, the preemption logic preempts 1 resource from a > running app. > 4. Because the cluster is at max-AM limit, the scheduler re-assigns the > preempted container back to the running app. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached
[ https://issues.apache.org/jira/browse/YARN-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281513#comment-17281513 ] VADAGA ANANYO RAO commented on YARN-10617: -- Hi [~leftnoteasy] [~sunilg] , could you please review this jira? The fix is in proportional capacity preemption logic. Basically, instead of considering all apps for preemption, we only consider apps which are schedulable by the scheduling logic we are using. cc: [~epayne] Thanks. > Fifo and Fair intra-queue preemption goes on indefinitely when apps are in > pending state due to max AM limit reached > > > Key: YARN-10617 > URL: https://issues.apache.org/jira/browse/YARN-10617 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Affects Versions: 3.1.1 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: YARN-10617.patch > > > This case occurs when: > 1. an application gets submitted in a cluster running at max-AM limit. > 2. The new job requests AM resource. So it has 1 pending request. > 3. To fulfil this request, the preemption logic preempts 1 resource from a > running app. > 4. Because the cluster is at max-AM limit, the scheduler re-assigns the > preempted container back to the running app. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
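The idea behind the fix described in the comment above, counting preemption demand only from apps the scheduler could actually place, can be illustrated with a toy model. This is a sketch under loud assumptions and not the code in YARN-10617.patch: the `App` class, its `schedulable` flag, and both demand methods are hypothetical, and the real logic lives in the proportional capacity preemption policy.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model: preemption demand with and without a schedulability filter. */
class SchedulableFilterSketch {

    /** Hypothetical app record; 'schedulable' is false when blocked by the max-AM limit. */
    static final class App {
        final String name;
        final int pendingContainers;
        final boolean schedulable;

        App(String name, int pendingContainers, boolean schedulable) {
            this.name = name;
            this.pendingContainers = pendingContainers;
            this.schedulable = schedulable;
        }
    }

    /** Behaviour described in the issue: every pending request counts as demand. */
    static int demandAllApps(List<App> apps) {
        int demand = 0;
        for (App app : apps) {
            demand += app.pendingContainers;
        }
        return demand;
    }

    /** Behaviour described in the comment: only schedulable apps contribute demand. */
    static int demandSchedulableOnly(List<App> apps) {
        int demand = 0;
        for (App app : apps) {
            if (app.schedulable) {
                demand += app.pendingContainers;
            }
        }
        return demand;
    }

    public static void main(String[] args) {
        List<App> apps = Arrays.asList(
            new App("running-app", 0, true),
            // New app waiting for its AM while the cluster is at the max-AM limit.
            new App("pending-app", 1, false));
        System.out.println("all apps: " + demandAllApps(apps));
        System.out.println("schedulable only: " + demandSchedulableOnly(apps));
    }
}
```

In the scenario from the issue description, the unfiltered count keeps asking for one container that the scheduler then hands straight back to the running app, while the filtered count asks for nothing, so the preempt and re-assign cycle never starts.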
[jira] [Updated] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached
[ https://issues.apache.org/jira/browse/YARN-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10617: - Attachment: YARN-10617.patch > Fifo and Fair intra-queue preemption goes on indefinitely when apps are in > pending state due to max AM limit reached > > > Key: YARN-10617 > URL: https://issues.apache.org/jira/browse/YARN-10617 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Affects Versions: 3.1.1 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: YARN-10617.patch > > > This case occurs when: > 1. an application gets submitted in a cluster running at max-AM limit. > 2. The new job requests AM resource. So it has 1 pending request. > 3. To fulfil this request, the preemption logic preempts 1 resource from a > running app. > 4. Because the cluster is at max-AM limit, the scheduler re-assigns the > preempted container back to the running app. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached
VADAGA ANANYO RAO created YARN-10617: Summary: Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached Key: YARN-10617 URL: https://issues.apache.org/jira/browse/YARN-10617 Project: Hadoop YARN Issue Type: Improvement Components: capacity scheduler Affects Versions: 3.1.1 Reporter: VADAGA ANANYO RAO Assignee: VADAGA ANANYO RAO This case occurs when: 1. an application gets submitted in a cluster running at max-AM limit. 2. The new job requests AM resource. So it has 1 pending request. 3. To fulfil this request, the preemption logic preempts 1 resource from a running app. 4. Because the cluster is at max-AM limit, the scheduler re-assigns the preempted container back to the running app. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: YARN-10559.0009.patch > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, > YARN-10559.0008.patch, YARN-10559.0009.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276904#comment-17276904 ]

VADAGA ANANYO RAO edited comment on YARN-10559 at 2/5/21, 2:29 PM:
---

Hi [~epayne], thank you for your earlier response. We wanted to get your suggestions on better handling multiple user cases. Currently, we are considering a formula to calculate FairShare per app like:
{code:java}
foreach user:
    if(tq.leafqueue.getUserLimit == 100)
        fairSharePerApp = total Queue Capacity / # of apps of that user;
    else
        fairSharePerApp = UL / # of apps of that user;
{code}
So, according to the above formula, Say we have a scenario with 2 users and UserLimit = 100%,

User1 (UL = 100%, fairSharePerUser = 50%)
* App1 (fairSharePerApp = 50%)

User2 (UL = 100%, fairSharePerUser = 50%)
* App2 (fairSharePerApp = 25%)
* App3 (fairSharePerApp = 25%)

Do you see any shortcomings in this formula or can suggest better ways of handling multiple user issue? I would really appreciate it :)

cc: [~sunilg] [~wangda]

was (Author: ananyo_rao):

Hi [~epayne], thank you for your earlier response. We wanted to get your suggestions on better handling multiple user cases. Currently, we are considering a formula to calculate FairShare per app like:
{code:java}
fairSharePerUser = total Queue Capacity / # of users
foreach user:
    if(tq.leafqueue.getUserLimit == 100)
        fairSharePerApp = fairSharePerUser / # of apps of that user;
    else
        fairSharePerApp = UL / # of apps of that user;
{code}
So, according to the above formula, Say we have a scenario with 2 users and UserLimit = 100%,

User1 (UL = 100%, fairSharePerUser = 50%)
* App1 (fairSharePerApp = 50%)

User2 (UL = 100%, fairSharePerUser = 50%)
* App2 (fairSharePerApp = 25%)
* App3 (fairSharePerApp = 25%)

Do you see any shortcomings in this formula or can suggest better ways of handling multiple user issue? I would really appreciate it :)

cc: [~sunilg] [~wangda]

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacityscheduler
> Affects Versions: 3.1.4
> Reporter: VADAGA ANANYO RAO
> Assignee: VADAGA ANANYO RAO
> Priority: Major
> Attachments: FairOP_preemption-design_doc_v1.pdf,
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch,
> YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch,
> YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch,
> YARN-10559.0008.patch
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Usecase:
> Due to the way Capacity Scheduler preemption works, If a single user submits
> a large application to a queue (using 100% of resources), that job will not
> be preempted by future applications from the same user within the same queue.
> This implies that the later applications will be forced to wait for
> completion of the long running application. This prevents multiple long
> running, large, applications from running concurrently.
> Support fair sharing among apps while preempting applications from same queue.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276904#comment-17276904 ]

VADAGA ANANYO RAO edited comment on YARN-10559 at 2/5/21, 2:29 PM:
---

Hi [~epayne], thank you for your earlier response. We would like your suggestions on how to better handle cases with multiple users. Currently, we are considering a formula like the following to calculate the fair share per app:
{code:java}
foreach user:
  if(tq.leafqueue.getUserLimit == 100)
    fairSharePerApp = total Queue Capacity / # of apps of that user;
  else
    fairSharePerApp = UL / # of apps of that user;
{code}
So, according to the above formula, say we have a scenario with 2 users and UserLimit = 100%:

User1 (UL = 100%, fairSharePerUser = 100%)
* App1 (fairSharePerApp = 100%)

User2 (UL = 100%, fairSharePerUser = 100%)
* App2 (fairSharePerApp = 50%)
* App3 (fairSharePerApp = 50%)

Do you see any shortcomings in this formula, or can you suggest better ways of handling the multiple-user issue? I would really appreciate it :)

cc: [~sunilg] [~wangda]

was (Author: ananyo_rao):
Hi [~epayne], thank you for your earlier response. We would like your suggestions on how to better handle cases with multiple users. Currently, we are considering a formula like the following to calculate the fair share per app:
{code:java}
foreach user:
  if(tq.leafqueue.getUserLimit == 100)
    fairSharePerApp = total Queue Capacity / # of apps of that user;
  else
    fairSharePerApp = UL / # of apps of that user;
{code}
So, according to the above formula, say we have a scenario with 2 users and UserLimit = 100%:

User1 (UL = 100%, fairSharePerUser = 50%)
* App1 (fairSharePerApp = 50%)

User2 (UL = 100%, fairSharePerUser = 50%)
* App2 (fairSharePerApp = 25%)
* App3 (fairSharePerApp = 25%)

Do you see any shortcomings in this formula, or can you suggest better ways of handling the multiple-user issue?
I would really appreciate it :) cc: [~sunilg] [~wangda] > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, > YARN-10559.0008.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
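The revised formula in the edited comment above can be checked with a small sketch. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not taken from the YARN-10559 patches, and shares are plain percentages rather than YARN Resource objects.

```java
// Sketch only: hypothetical names, not code from the YARN-10559 patches.
final class RevisedFairShareSketch {

    /**
     * Per-app fair share in percent of the queue, following the revised pseudocode:
     * with a 100% user limit the whole queue capacity is divided among the user's
     * apps, otherwise the user's limit (UL) is divided among them.
     */
    static double fairSharePerApp(double userLimitPercent,
                                  double queueCapacityPercent,
                                  int appsOfUser) {
        if (appsOfUser <= 0) {
            // Assumption for this sketch: every user has at least one app.
            throw new IllegalArgumentException("appsOfUser must be positive");
        }
        double pool = (userLimitPercent == 100.0) ? queueCapacityPercent : userLimitPercent;
        return pool / appsOfUser;
    }

    public static void main(String[] args) {
        // Scenario from the comment: 2 users, UserLimit = 100%, queue capacity = 100%.
        System.out.println("User1 App1: " + fairSharePerApp(100.0, 100.0, 1));
        System.out.println("User2 App2 and App3: " + fairSharePerApp(100.0, 100.0, 2));
    }
}
```

In this scenario the per-app shares of the two users add up to 200% of the queue (100 + 50 + 50), which may be one of the shortcomings the comment asks reviewers to look for.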
[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276904#comment-17276904 ]

VADAGA ANANYO RAO commented on YARN-10559:
--

Hi [~epayne], thank you for your earlier response. We would like your suggestions on how to better handle cases with multiple users. Currently, we are considering a formula like the following to calculate the fair share per app:
{code:java}
fairSharePerUser = total Queue Capacity / # of users
foreach user:
  if(tq.leafqueue.getUserLimit == 100)
    fairSharePerApp = fairSharePerUser / # of apps of that user;
  else
    fairSharePerApp = UL / # of apps of that user;
{code}
So, according to the above formula, say we have a scenario with 2 users and UserLimit = 100%:

User1 (UL = 100%, fairSharePerUser = 50%)
* App1 (fairSharePerApp = 50%)

User2 (UL = 100%, fairSharePerUser = 50%)
* App2 (fairSharePerApp = 25%)
* App3 (fairSharePerApp = 25%)

Do you see any shortcomings in this formula, or can you suggest better ways of handling the multiple-user issue? I would really appreciate it :)

cc: [~sunilg] [~wangda]

> Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, > YARN-10559.0008.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
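The first version of the formula, as originally posted in the comment above, splits the queue across users before splitting across apps. A minimal sketch of that two-step calculation follows; the class and method names are hypothetical, and percentages stand in for real resource values.

```java
// Sketch only: hypothetical names, not code from the YARN-10559 patches.
final class PerUserFairShareSketch {

    /** Step 1: fairSharePerUser = total queue capacity / number of users. */
    static double fairSharePerUser(double queueCapacityPercent, int users) {
        if (users <= 0) {
            throw new IllegalArgumentException("users must be positive");
        }
        return queueCapacityPercent / users;
    }

    /**
     * Step 2: with a 100% user limit the user's fair share is divided among the
     * user's apps, otherwise the user's limit (UL) is divided among them.
     */
    static double fairSharePerApp(double userLimitPercent,
                                  double fairSharePerUser,
                                  int appsOfUser) {
        if (appsOfUser <= 0) {
            throw new IllegalArgumentException("appsOfUser must be positive");
        }
        double pool = (userLimitPercent == 100.0) ? fairSharePerUser : userLimitPercent;
        return pool / appsOfUser;
    }

    public static void main(String[] args) {
        // Scenario from the comment: 2 users, UserLimit = 100%, queue capacity = 100%.
        double perUser = fairSharePerUser(100.0, 2);
        System.out.println("fairSharePerUser: " + perUser);
        System.out.println("User1 App1: " + fairSharePerApp(100.0, perUser, 1));
        System.out.println("User2 App2 and App3: " + fairSharePerApp(100.0, perUser, 2));
    }
}
```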
[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276425#comment-17276425 ] VADAGA ANANYO RAO commented on YARN-10559: -- [~epayne], thanks for catching this. This is a major bug in the code. I am already working on addressing multiple-user scenarios and should be able to provide a patch that fixes this in a couple of days. > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, > YARN-10559.0008.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10591) Allow USERLIMIT_FIRST as only valid configuration when FairOrdering is used
VADAGA ANANYO RAO created YARN-10591:

Summary: Allow USERLIMIT_FIRST as only valid configuration when FairOrdering is used
Key: YARN-10591
URL: https://issues.apache.org/jira/browse/YARN-10591
Project: Hadoop YARN
Issue Type: Improvement
Components: capacity scheduler
Reporter: VADAGA ANANYO RAO
Assignee: VADAGA ANANYO RAO

When FairOrderingPolicy is used in CapacityScheduler, we should only allow USERLIMIT_FIRST. The alternative to USERLIMIT_FIRST is PRIORITY_FIRST. However, application priorities create anti-patterns with fairness, so we should not allow PRIORITY_FIRST to be set. This Jira is to add a validation check to ensure that only USERLIMIT_FIRST is set if FairOrderingPolicy is being used.

cc: [~sunilg] [~leftnoteasy] [~epayne]

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
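The validation described in YARN-10591 above could look roughly like the following sketch. Everything here is an assumption made for illustration: the enum, the method names, and the use of the string "fair" to identify FairOrderingPolicy are hypothetical and do not come from the CapacityScheduler code base or from any patch on the issue.

```java
// Sketch only: hypothetical types and names, not CapacityScheduler code.
final class FairOrderingValidationSketch {

    /** The two intra-queue preemption orders named in the issue text. */
    enum IntraQueueOrderPolicy { USERLIMIT_FIRST, PRIORITY_FIRST }

    /** Returns true when the combination is acceptable under the proposed rule. */
    static boolean isValid(String orderingPolicy, IntraQueueOrderPolicy preemptionOrder) {
        // Assumption: the queue's ordering policy is identified by the string "fair".
        boolean fairOrdering = "fair".equalsIgnoreCase(orderingPolicy);
        return !fairOrdering || preemptionOrder == IntraQueueOrderPolicy.USERLIMIT_FIRST;
    }

    /** Fails fast on the combination the issue wants to reject. */
    static void validate(String orderingPolicy, IntraQueueOrderPolicy preemptionOrder) {
        if (!isValid(orderingPolicy, preemptionOrder)) {
            throw new IllegalArgumentException(
                "Only USERLIMIT_FIRST is allowed when the fair ordering policy is used, got "
                    + preemptionOrder);
        }
    }

    public static void main(String[] args) {
        validate("fair", IntraQueueOrderPolicy.USERLIMIT_FIRST); // accepted
        validate("fifo", IntraQueueOrderPolicy.PRIORITY_FIRST);  // accepted, not fair ordering
        System.out.println(isValid("fair", IntraQueueOrderPolicy.PRIORITY_FIRST));
    }
}
```

Rejecting the combination at configuration time, rather than silently falling back to USERLIMIT_FIRST, is a design choice of this sketch; the issue text only says that a validation check should be added.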
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: YARN-10559.0008.patch > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, > YARN-10559.0008.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: YARN-10559.0007.patch > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: YARN-10559.0006.patch > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: (was: YARN-10559.0006.patch) > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: YARN-10559.0006.patch > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: (was: YARN-10559.0006.patch) > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267394#comment-17267394 ] VADAGA ANANYO RAO commented on YARN-10559: -- Updated the latest patch to skip user headroom calculations when FairOrderingPolicy is used. This was missed in the previous patch. Skipping the user headroom calculation is required because 2 jobs from the same user may have an unfair resource distribution, so we may still want to consider a user for preemption even if the user has reached its max headroom. cc: [~sunilg] > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: YARN-10559.0006.patch > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch, YARN-10559.0006.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263252#comment-17263252 ] VADAGA ANANYO RAO edited comment on YARN-10559 at 1/12/21, 11:09 AM: - Just FYI, the UT failure is not related to the code changes in this JIRA. was (Author: ananyo_rao): Just FYI, the UT failure is not related to the code changes. > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263252#comment-17263252 ] VADAGA ANANYO RAO commented on YARN-10559: -- Just FYI, the UT failure is not related to the code changes. > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263135#comment-17263135 ]

VADAGA ANANYO RAO commented on YARN-10559:
--

Thank you [~sunilg] and [~wangda] for the comments. I have tried to address them in the new patch. The changes are as follows:
# Fixed the *checkstyle* warnings.
# Added a *check for 0 active apps* in the leaf queue. _Note:_ we don't have to check for 0 active apps per user, because users are temp objects initialised when apps are created, so each user has at least 1 corresponding active app.
# Ensured that *fair-share is a non-negative* value. However, fair-share can be 0, so no check is added for that. Fair-share can be 0 in the following situations _(a non-exhaustive list)_:
## User-limit is somehow 0 for a user
## queueReassignableResources is 0
## No running apps in the queue
# If a container needs to be skipped from preemption in *skipContainerBasedOnIntraQueuePolicy*, we continue the checks for the other containers of the app instead of breaking out of the app.
# Added additional *logs* for when we reduce actually_to_be_preempted of an app.

Thank you [~sunilg] for catching these points. Please let me know if any points still need to be addressed.
> Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
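Two of the guards listed in the comment above, the check for 0 active apps and the non-negative fair share, can be illustrated with a short sketch. It is not the patch code: the class, method, and parameter names are hypothetical, and a plain long (for example, memory in MB) stands in for a YARN resource.

```java
// Sketch only: hypothetical names, not code from YARN-10559.0005.patch.
final class FairShareGuardsSketch {

    /**
     * Divides the reassignable resources of a queue evenly across its active apps.
     * Returns 0 when there are no active apps (avoiding a division by zero) and
     * clamps negative results to 0. A result of 0 is allowed, as in the comment.
     */
    static long fairSharePerApp(long queueReassignable, int activeApps) {
        if (activeApps <= 0) {
            return 0L; // guard: no active apps in the leaf queue
        }
        long share = queueReassignable / activeApps;
        return Math.max(0L, share); // guard: fair share must be non-negative
    }

    public static void main(String[] args) {
        System.out.println(fairSharePerApp(1000L, 4)); // regular case
        System.out.println(fairSharePerApp(1000L, 0)); // no active apps
        System.out.println(fairSharePerApp(-50L, 2));  // negative input is clamped
    }
}
```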
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: YARN-10559.0005.patch > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] VADAGA ANANYO RAO updated YARN-10559: - Attachment: (was: YARN-10559.0005.patch) > Fair sharing intra-queue preemption support in Capacity Scheduler > - > > Key: YARN-10559 > URL: https://issues.apache.org/jira/browse/YARN-10559 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 3.1.4 >Reporter: VADAGA ANANYO RAO >Assignee: VADAGA ANANYO RAO >Priority: Major > Attachments: FairOP_preemption-design_doc_v1.pdf, > FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, > YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, > YARN-10559.0005.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > Usecase: > Due to the way Capacity Scheduler preemption works, If a single user submits > a large application to a queue (using 100% of resources), that job will not > be preempted by future applications from the same user within the same queue. > This implies that the later applications will be forced to wait for > completion of the long running application. This prevents multiple long > running, large, applications from running concurrently. > Support fair sharing among apps while preempting applications from same queue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

VADAGA ANANYO RAO updated YARN-10559:
    Attachment: YARN-10559.0005.patch
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

VADAGA ANANYO RAO updated YARN-10559:
    Attachment: YARN-10559.0004.patch
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

VADAGA ANANYO RAO updated YARN-10559:
    Attachment: YARN-10559.0003.patch
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

VADAGA ANANYO RAO updated YARN-10559:
    Attachment: YARN-10559.0002.patch
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

VADAGA ANANYO RAO updated YARN-10559:
    Attachment: (was: YARN-10559.0002.patch)
[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261080#comment-17261080 ]

VADAGA ANANYO RAO commented on YARN-10559:

[~sunilg] I have uploaded the latest patch and design doc. The fair-share calculation for preemption in this patch is:

{code:java}
// Pseudocode; all values are percentages of queue capacity. UL = user limit.
fairSharePerApp = totalQueueCapacity / numAppsInQueue;
idealFairSharePerAppWithUL = UL / numAppsOfUser;
// Cap the per-app share when the user's apps together would exceed the UL.
if (numAppsOfUser * fairSharePerApp > UL) {
  fairSharePerApp = idealFairSharePerAppWithUL;
}
{code}

Say we have 2 users and the following scenario 1, with UL = 100%:

{noformat}
User1 (UL: 100%)
 - App1 (used = 50%, pending = 100%, FS = 33%)
 - App2 (used = 50%, pending = 100%, FS = 33%)
User2 (UL: 100%)
 - App3 (used = 0%, pending = 100%, FS = 33%)
{noformat}

We will take ~17% of resources from each of App1 and App2 (50% used minus 33% fair share) to give to App3.

Now, say we have 2 users with UL = 50%:

{noformat}
User1 (UL: 50%)
 - App1 (used = 50%, pending = 100%, FS = 25%)
 - App2 (used = 50%, pending = 100%, FS = 25%)
User2 (UL: 50%)
 - App3 (used = 0%, pending = 100%, FS = 33%)
{noformat}

We will again take ~17% of resources from each of App1 and App2 to give to App3. Since App1 and App2 are each 25% above their fair share, we may be under-preempting from User1 in this case, but we ensure User2 is not heavily starved.

This logic seems to behave closest to the FairOrderingPolicy in CapacityScheduler. A comparison with the other algorithms considered is in design_doc_v2.

Once again, thank you [~sunilg] for guiding the design.

[~epayne] [~leftnoteasy] Please review the latest patch (patch-2). Thank you.
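The calculation described in the comment above can be checked against its two scenarios with a small standalone sketch. This is not code from the patch: the class and method names are hypothetical, all values are percentages of queue capacity, and the `deficit` helper plus the `Math.min(surplus, deficit)` bound are assumptions made here to reproduce the "~17% from each app" figures.

```java
// Illustrative sketch only; not taken from YARN-10559.0002.patch.
final class FairSharePreemptionSketch {

    /** Per-app fair share, following the rule quoted in the comment. */
    static double fairSharePerApp(double queueCapacity, int appsInQueue,
                                  double userLimit, int appsOfUser) {
        double fairShare = queueCapacity / appsInQueue;
        // If the user's apps together would exceed the user limit (UL),
        // fall back to an equal split of the UL among that user's apps.
        if (appsOfUser * fairShare > userLimit) {
            fairShare = userLimit / appsOfUser;
        }
        return fairShare;
    }

    /** Resources an app holds above its fair share (zero if at or below it). */
    static double surplus(double used, double fairShare) {
        return Math.max(0.0, used - fairShare);
    }

    /** Assumption: an app may claim up to its fair share, bounded by its pending demand. */
    static double deficit(double used, double pending, double fairShare) {
        return Math.max(0.0, Math.min(pending, fairShare - used));
    }

    public static void main(String[] args) {
        // Scenario 2 from the comment: UL = 50%, User1 runs two apps, User2 one.
        double fsUser1 = fairSharePerApp(100.0, 3, 50.0, 2);
        double fsUser2 = fairSharePerApp(100.0, 3, 50.0, 1);
        double totalSurplus = 2 * surplus(50.0, fsUser1);
        double app3Deficit = deficit(0.0, 100.0, fsUser2);
        // Assumption: never preempt more than the starved app can use.
        double toPreempt = Math.min(totalSurplus, app3Deficit);
        System.out.printf("FS(User1 apps)=%.2f FS(App3)=%.2f preempt=%.2f%n",
                fsUser1, fsUser2, toPreempt);
    }
}
```

Under these assumptions, both scenarios preempt about 33% in total, which matches the ~17% per app in the comment if the amount is split evenly between App1 and App2.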
[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

VADAGA ANANYO RAO updated YARN-10559:
    Attachment: YARN-10559.0002.patch
                FairOP_preemption-design_doc_v2.pdf
[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258682#comment-17258682 ]

VADAGA ANANYO RAO commented on YARN-10559:

[~sunilg] Thank you for helping with the design. Could you please review the uploaded patch?

In patch-0001, the fair share for preemption is calculated as:
*"FairShare per app = UserLimit of the user of the app / Number of apps for that user"*.

+ [~wangda]
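As a quick illustration of the patch-0001 formula quoted above, the following minimal sketch evaluates it for a couple of inputs. The class and method names are hypothetical and not from the patch; values are percentages of queue capacity.

```java
// Illustrative sketch only; not taken from YARN-10559.0001.patch.
final class UserLimitFairShareSketch {

    /** FairShare per app = UserLimit of the app's user / number of apps for that user. */
    static double fairSharePerApp(double userLimitPercent, int appsOfUser) {
        if (appsOfUser <= 0) {
            throw new IllegalArgumentException("appsOfUser must be positive");
        }
        return userLimitPercent / appsOfUser;
    }

    public static void main(String[] args) {
        System.out.println(fairSharePerApp(100.0, 2)); // prints 50.0
        System.out.println(fairSharePerApp(100.0, 1)); // prints 100.0
    }
}
```

Note that this formula looks only at the user limit and the user's own app count, so with UL = 100% a user with two apps and a user with one app get per-app shares of 50% and 100% respectively, regardless of how many apps the queue holds in total.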
[jira] [Created] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler
VADAGA ANANYO RAO created YARN-10559:

Summary: Fair sharing intra-queue preemption support in Capacity Scheduler
Key: YARN-10559
URL: https://issues.apache.org/jira/browse/YARN-10559
Project: Hadoop YARN
Issue Type: Improvement
Components: capacityscheduler
Affects Versions: 3.1.4
Reporter: VADAGA ANANYO RAO
Fix For: 3.1.4
Attachments: FairOP_preemption-design_doc_v1.pdf

Use case:
Due to the way Capacity Scheduler preemption works, if a single user submits a large application to a queue (using 100% of resources), that job will not be preempted by future applications from the same user within the same queue. This implies that the later applications will be forced to wait for completion of the long-running application. This prevents multiple long-running, large applications from running concurrently.

Support fair sharing among apps while preempting applications from the same queue.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org