[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2022-04-03 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516619#comment-17516619
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

Hi [~bteke], please feel free to take this task over.

Thank you

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, 
> YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, 
> YARN-10559.0008.patch, YARN-10559.0009.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Use case:
> Due to the way Capacity Scheduler preemption works, if a single user submits 
> a large application to a queue (using 100% of resources), that job will not 
> be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for 
> completion of the long-running application. This prevents multiple 
> long-running, large applications from running concurrently.
> Support fair sharing among apps while preempting applications from the same queue.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10878) TestNMSimulator imports com.google.common.base.Supplier;

2021-08-05 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10878:
-
Attachment: YARN-10878.0001.patch

> TestNMSimulator imports com.google.common.base.Supplier;
> 
>
> Key: YARN-10878
> URL: https://issues.apache.org/jira/browse/YARN-10878
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: buid
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: VADAGA ANANYO RAO
>Priority: Major
>  Labels: pull-request-available
> Attachments: YARN-10878.0001.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> TestNMSimulator imports com.google.common.base.Supplier; every build has the 
> source code patched to fix this, so it's creating a false change in builds, 
> complicating other work, etc.
> The changed file should just be merged.






[jira] [Assigned] (YARN-10878) TestNMSimulator imports com.google.common.base.Supplier;

2021-08-05 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO reassigned YARN-10878:


Assignee: VADAGA ANANYO RAO

> TestNMSimulator imports com.google.common.base.Supplier;
> 
>
> Key: YARN-10878
> URL: https://issues.apache.org/jira/browse/YARN-10878
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: buid
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: VADAGA ANANYO RAO
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> TestNMSimulator imports com.google.common.base.Supplier; every build has the 
> source code patched to fix this, so it's creating a false change in builds, 
> complicating other work, etc.
> The changed file should just be merged.






[jira] [Updated] (YARN-10628) Add node usage metrics in SLS

2021-07-05 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10628:
-
Attachment: YARN-10628.0004.patch

> Add node usage metrics in SLS
> -
>
> Key: YARN-10628
> URL: https://issues.apache.org/jira/browse/YARN-10628
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.3.1
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png, 
> YARN-10628.0001.patch, YARN-10628.0002.patch, YARN-10628.0003.patch, 
> YARN-10628.0004.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Given the work around container packing going on in YARN schedulers, it would 
> be beneficial to have charts showing the usage per node in SLS. This will 
> help to improve container packing algorithms for more efficient packings.






[jira] [Updated] (YARN-10663) Add runningApps stats in SLS

2021-07-04 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10663:
-
Attachment: YARN-10663.0002.patch

> Add runningApps stats in SLS
> 
>
> Key: YARN-10663
> URL: https://issues.apache.org/jira/browse/YARN-10663
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10663.0001.patch, YARN-10663.0002.patch
>
>
> RMNodes in SLS don't keep track of runningApps on each node. Because of this, 
> the graceful decommissioning logic suffers: a node will decommission when it 
> has no running containers, even if some shuffle data is still present on the 
> node.
> In this Jira, we will add runningApps functionality to SLS to improve the 
> decommissioning logic of each node. This will help with autoscaling 
> simulations on SLS.






[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-04-12 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319929#comment-17319929
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

[~epayne], sorry for seeing your comment so late. We have tested this feature 
with the following configs:

Queue properties:
'yarn.scheduler.capacity..ordering-policy': 'fair'

Scheduler configurations:
'yarn.resourcemanager.scheduler.monitor.enable': 'true',
'yarn.resourcemanager.scheduler.monitor.policies': 'org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy',
'yarn.resourcemanager.monitor.capacity.preemption.intra-queue-preemption.enabled': 'true'

After this, we submit job1 from user1 to a leaf queue. Once job1 has used up 
the entire queue capacity, we submit job2 from user1 to the same leaf queue. We 
can then observe preemption kicking in, with containers preempted from job1 to 
make room for job2.
I am not sure of the exact error you are facing. If you can provide some more 
details of the problems you are facing, I can try to help out with it.
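
For reference, the settings listed above could be collected and rendered as Hadoop-style <property> entries roughly as follows. This is only a sketch: the class and method names are hypothetical and not part of Hadoop, and the queue path ("root.default" in the example) is an assumption, since the comment does not name the queue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: it only assembles the property
// names and values quoted in the comment above.
class PreemptionConfigSketch {

  // Builds the per-queue ordering-policy key. The queue path is an
  // assumption supplied by the caller; the comment does not give one.
  static String orderingPolicyKey(String queuePath) {
    return "yarn.scheduler.capacity." + queuePath + ".ordering-policy";
  }

  // Scheduler-level settings as listed in the comment.
  static Map<String, String> schedulerSettings() {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("yarn.resourcemanager.scheduler.monitor.enable", "true");
    conf.put("yarn.resourcemanager.scheduler.monitor.policies",
        "org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity."
            + "ProportionalCapacityPreemptionPolicy");
    conf.put("yarn.resourcemanager.monitor.capacity.preemption."
        + "intra-queue-preemption.enabled", "true");
    return conf;
  }

  // Renders one setting as a Hadoop-style <property> element.
  static String toPropertyXml(String name, String value) {
    return "<property>\n  <name>" + name + "</name>\n  <value>" + value
        + "</value>\n</property>";
  }

  public static void main(String[] args) {
    Map<String, String> conf = schedulerSettings();
    // "root.default" is only an example queue path.
    conf.put(orderingPolicyKey("root.default"), "fair");
    conf.forEach((k, v) -> System.out.println(toPropertyXml(k, v)));
  }
}
```

The queue-level key would normally go into the Capacity Scheduler configuration and the other three into the ResourceManager configuration; the sketch prints them together only for brevity.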

Thank you :)

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, 
> YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, 
> YARN-10559.0008.patch, YARN-10559.0009.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Use case:
> Due to the way Capacity Scheduler preemption works, if a single user submits 
> a large application to a queue (using 100% of resources), that job will not 
> be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for 
> completion of the long-running application. This prevents multiple 
> long-running, large applications from running concurrently.
> Support fair sharing among apps while preempting applications from the same queue.






[jira] [Updated] (YARN-10628) Add node usage metrics in SLS

2021-03-08 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10628:
-
Attachment: YARN-10628.0003.patch

> Add node usage metrics in SLS
> -
>
> Key: YARN-10628
> URL: https://issues.apache.org/jira/browse/YARN-10628
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.3.1
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png, 
> YARN-10628.0001.patch, YARN-10628.0002.patch, YARN-10628.0003.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Given the work around container packing going on in YARN schedulers, it would 
> be beneficial to have charts showing the usage per node in SLS. This will 
> help to improve container packing algorithms for more efficient packings.






[jira] [Updated] (YARN-10628) Add node usage metrics in SLS

2021-03-07 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10628:
-
Attachment: YARN-10628.0002.patch

> Add node usage metrics in SLS
> -
>
> Key: YARN-10628
> URL: https://issues.apache.org/jira/browse/YARN-10628
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.3.1
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png, 
> YARN-10628.0001.patch, YARN-10628.0002.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Given the work around container packing going on in YARN schedulers, it would 
> be beneficial to have charts showing the usage per node in SLS. This will 
> help to improve container packing algorithms for more efficient packings.






[jira] [Commented] (YARN-10663) Add runningApps stats in SLS

2021-03-02 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293751#comment-17293751
 ] 

VADAGA ANANYO RAO commented on YARN-10663:
--

Recap of how the actual implementation handles running and finished apps on each node:
 # Each RMNode has a list of runningApplications. Running apps are active apps 
that have run some container on that node.
 # Each RMAppImpl maintains `ranNodes`, which holds all the nodes on which 
the app has run containers.
 # When the app is at its `FinalTransition`, the app iterates over all the 
ranNodes and triggers an `RMNodeCleanupAppEvent` for each node.
 # RMNodeImpl handles RMNodeCleanupAppEvent by moving the app from the 
`runningApplications` list to the `finishedApplications` list.

Based on this flow, I plan to:
 # Add a `ranNodes` list in AMSimulator.
 # Each time AMSimulator starts a container on a node (NMSimulator), we will:
 ## update the runningApps in the NMSimulator and,
 ## update the ranNodes in the AMSimulator
 # When the app is finishing, for each node in ranNodes list in AMSimulator, we 
will remove the app from the runningApps list of that node.
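
The planned bookkeeping can be sketched in plain Java as below. NodeSketch and AppSketch are simplified, hypothetical stand-ins for NMSimulator and AMSimulator, not the real SLS classes, and the method names are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for NMSimulator: tracks which apps are running here.
class NodeSketch {
  final String host;
  final Set<String> runningApps = new HashSet<>();

  NodeSketch(String host) {
    this.host = host;
  }
}

// Hypothetical stand-in for AMSimulator: tracks which nodes it has run on.
class AppSketch {
  final String appId;
  final Set<NodeSketch> ranNodes = new HashSet<>();

  AppSketch(String appId) {
    this.appId = appId;
  }

  // Step 2 of the plan: update runningApps on the node and ranNodes on the app.
  void startContainerOn(NodeSketch node) {
    node.runningApps.add(appId);
    ranNodes.add(node);
  }

  // Step 3 of the plan: when finishing, remove the app from every node it ran on.
  void finish() {
    for (NodeSketch node : ranNodes) {
      node.runningApps.remove(appId);
    }
  }
}

class RunningAppsDemo {
  public static void main(String[] args) {
    NodeSketch n1 = new NodeSketch("node1");
    AppSketch app = new AppSketch("app_1");
    app.startContainerOn(n1);
    System.out.println(n1.runningApps); // contains app_1 while the app runs
    app.finish();
    System.out.println(n1.runningApps); // empty once the app has finished
  }
}
```

Using sets means that starting several containers of the same app on one node registers the app only once, which matches the "app has run some container on that node" wording in the recap.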

> Add runningApps stats in SLS
> 
>
> Key: YARN-10663
> URL: https://issues.apache.org/jira/browse/YARN-10663
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10663.0001.patch
>
>
> RMNodes in SLS don't keep track of runningApps on each node. Because of this, 
> the graceful decommissioning logic suffers: a node will decommission when it 
> has no running containers, even if some shuffle data is still present on the 
> node.
> In this Jira, we will add runningApps functionality to SLS to improve the 
> decommissioning logic of each node. This will help with autoscaling 
> simulations on SLS.






[jira] [Assigned] (YARN-9312) NPE while rendering SLS simulate page

2021-03-02 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO reassigned YARN-9312:
---

Assignee: VADAGA ANANYO RAO

> NPE while rendering SLS simulate page
> -
>
> Key: YARN-9312
> URL: https://issues.apache.org/jira/browse/YARN-9312
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Assignee: VADAGA ANANYO RAO
>Priority: Minor
>
> http://localhost:10001/simulate
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.sls.web.SLSWebApp.printPageSimulate(SLSWebApp.java:240)
> at 
> org.apache.hadoop.yarn.sls.web.SLSWebApp.access$100(SLSWebApp.java:55)
> at 
> org.apache.hadoop.yarn.sls.web.SLSWebApp$1.handle(SLSWebApp.java:152)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:539)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
> at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
> at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> at java.lang.Thread.run(Thread.java:745)
> {code}






[jira] [Updated] (YARN-10663) Add runningApps stats in SLS

2021-03-02 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10663:
-
Attachment: (was: YARN-10663.0001.patch)

> Add runningApps stats in SLS
> 
>
> Key: YARN-10663
> URL: https://issues.apache.org/jira/browse/YARN-10663
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10663.0001.patch
>
>
> RMNodes in SLS don't keep track of runningApps on each node. Because of this, 
> the graceful decommissioning logic suffers: a node will decommission when it 
> has no running containers, even if some shuffle data is still present on the 
> node.
> In this Jira, we will add runningApps functionality to SLS to improve the 
> decommissioning logic of each node. This will help with autoscaling 
> simulations on SLS.






[jira] [Updated] (YARN-10663) Add runningApps stats in SLS

2021-03-02 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10663:
-
Attachment: YARN-10663.0001.patch

> Add runningApps stats in SLS
> 
>
> Key: YARN-10663
> URL: https://issues.apache.org/jira/browse/YARN-10663
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10663.0001.patch
>
>
> RMNodes in SLS don't keep track of runningApps on each node. Because of this, 
> the graceful decommissioning logic suffers: a node will decommission when it 
> has no running containers, even if some shuffle data is still present on the 
> node.
> In this Jira, we will add runningApps functionality to SLS to improve the 
> decommissioning logic of each node. This will help with autoscaling 
> simulations on SLS.






[jira] [Updated] (YARN-10663) Add runningApps stats in SLS

2021-03-02 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10663:
-
Attachment: YARN-10663.0001.patch

> Add runningApps stats in SLS
> 
>
> Key: YARN-10663
> URL: https://issues.apache.org/jira/browse/YARN-10663
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10663.0001.patch
>
>
> RMNodes in SLS don't keep track of runningApps on each node. Because of this, 
> the graceful decommissioning logic suffers: a node will decommission when it 
> has no running containers, even if some shuffle data is still present on the 
> node.
> In this Jira, we will add runningApps functionality to SLS to improve the 
> decommissioning logic of each node. This will help with autoscaling 
> simulations on SLS.






[jira] [Created] (YARN-10663) Add runningApps stats in SLS

2021-03-02 Thread VADAGA ANANYO RAO (Jira)
VADAGA ANANYO RAO created YARN-10663:


 Summary: Add runningApps stats in SLS
 Key: YARN-10663
 URL: https://issues.apache.org/jira/browse/YARN-10663
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: VADAGA ANANYO RAO
Assignee: VADAGA ANANYO RAO


RMNodes in SLS don't keep track of runningApps on each node. Because of this, 
the graceful decommissioning logic suffers: a node will decommission when it 
has no running containers, even if some shuffle data is still present on the 
node.

In this Jira, we will add runningApps functionality to SLS to improve the 
decommissioning logic of each node. This will help with autoscaling simulations 
on SLS.






[jira] [Updated] (YARN-10628) Add node usage metrics in SLS

2021-02-15 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10628:
-
Attachment: YARN-10628.0001.patch

> Add node usage metrics in SLS
> -
>
> Key: YARN-10628
> URL: https://issues.apache.org/jira/browse/YARN-10628
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler-load-simulator
>Affects Versions: 3.3.1
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png, 
> YARN-10628.0001.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Given the work around container packing going on in YARN schedulers, it would 
> be beneficial to have charts showing the usage per node in SLS. This will 
> help to improve container packing algorithms for more efficient packings.






[jira] [Created] (YARN-10628) Add node usage metrics in SLS

2021-02-15 Thread VADAGA ANANYO RAO (Jira)
VADAGA ANANYO RAO created YARN-10628:


 Summary: Add node usage metrics in SLS
 Key: YARN-10628
 URL: https://issues.apache.org/jira/browse/YARN-10628
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler-load-simulator
Affects Versions: 3.3.1
Reporter: VADAGA ANANYO RAO
Assignee: VADAGA ANANYO RAO
 Attachments: Nodes_memory_usage.png, Nodes_vcores_usage.png

Given the work around container packing going on in YARN schedulers, it would 
be beneficial to have charts showing the usage per node in SLS. This will help 
to improve container packing algorithms for more efficient packings.






[jira] [Updated] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached

2021-02-14 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10617:
-
Attachment: YARN-10617.0001.patch

> Fifo and Fair intra-queue preemption goes on indefinitely when apps are in 
> pending state due to max AM limit reached
> 
>
> Key: YARN-10617
> URL: https://issues.apache.org/jira/browse/YARN-10617
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.1.1
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10617.0001.patch
>
>
> This case occurs when:
> 1. an application gets submitted in a cluster running at max-AM limit.
> 2. The new job requests AM resource. So it has 1 pending request.
> 3. To fulfil this request, the preemption logic preempts 1 resource from a 
> running app.
> 4. Because the cluster is at max-AM limit, the scheduler re-assigns the 
> preempted container back to the running app.






[jira] [Updated] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached

2021-02-14 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10617:
-
Attachment: (was: YARN-10617.patch)

> Fifo and Fair intra-queue preemption goes on indefinitely when apps are in 
> pending state due to max AM limit reached
> 
>
> Key: YARN-10617
> URL: https://issues.apache.org/jira/browse/YARN-10617
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.1.1
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10617.0001.patch
>
>
> This case occurs when:
> 1. an application gets submitted in a cluster running at max-AM limit.
> 2. The new job requests AM resource. So it has 1 pending request.
> 3. To fulfil this request, the preemption logic preempts 1 resource from a 
> running app.
> 4. Because the cluster is at max-AM limit, the scheduler re-assigns the 
> preempted container back to the running app.






[jira] [Commented] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached

2021-02-08 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17281513#comment-17281513
 ] 

VADAGA ANANYO RAO commented on YARN-10617:
--

Hi [~leftnoteasy] [~sunilg] , could you please review this jira? The fix is in 
proportional capacity preemption logic.

Basically, instead of considering all apps for preemption, we only consider 
apps which are schedulable by the scheduling logic we are using.
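
As a rough illustration of that idea (not the code in the attached patch), the pending demand that drives preemption could be computed only from apps the scheduler can actually place. The class names and the `schedulable` flag below are hypothetical; in this sketch an app waiting on the max-AM limit would simply carry `schedulable = false`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical view of a pending app; not a YARN class.
class PendingAppSketch {
  final String id;
  final int pendingContainers;
  // Assumed flag: false when the app cannot be scheduled right now,
  // for example because the queue is already at its max-AM limit.
  final boolean schedulable;

  PendingAppSketch(String id, int pendingContainers, boolean schedulable) {
    this.id = id;
    this.pendingContainers = pendingContainers;
    this.schedulable = schedulable;
  }
}

class PreemptionDemandSketch {
  // Sums pending requests only over apps that are currently schedulable,
  // so blocked apps never trigger preemption on their behalf.
  static int demandFromSchedulableApps(List<PendingAppSketch> apps) {
    int demand = 0;
    for (PendingAppSketch app : apps) {
      if (app.schedulable) {
        demand += app.pendingContainers;
      }
    }
    return demand;
  }

  public static void main(String[] args) {
    List<PendingAppSketch> apps = new ArrayList<>();
    apps.add(new PendingAppSketch("blockedByMaxAm", 1, false));
    // Demand is 0, so nothing is preempted for the blocked app.
    System.out.println(demandFromSchedulableApps(apps));
  }
}
```

With the blocked app filtered out, the preempt-then-reassign cycle described in the issue has nothing to start from.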

cc: [~epayne]

Thanks.

> Fifo and Fair intra-queue preemption goes on indefinitely when apps are in 
> pending state due to max AM limit reached
> 
>
> Key: YARN-10617
> URL: https://issues.apache.org/jira/browse/YARN-10617
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.1.1
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10617.patch
>
>
> This case occurs when:
> 1. an application gets submitted in a cluster running at max-AM limit.
> 2. The new job requests AM resource. So it has 1 pending request.
> 3. To fulfil this request, the preemption logic preempts 1 resource from a 
> running app.
> 4. Because the cluster is at max-AM limit, the scheduler re-assigns the 
> preempted container back to the running app.






[jira] [Updated] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached

2021-02-08 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10617:
-
Attachment: YARN-10617.patch

> Fifo and Fair intra-queue preemption goes on indefinitely when apps are in 
> pending state due to max AM limit reached
> 
>
> Key: YARN-10617
> URL: https://issues.apache.org/jira/browse/YARN-10617
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.1.1
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: YARN-10617.patch
>
>
> This case occurs when:
> 1. an application gets submitted in a cluster running at max-AM limit.
> 2. The new job requests AM resource. So it has 1 pending request.
> 3. To fulfil this request, the preemption logic preempts 1 resource from a 
> running app.
> 4. Because the cluster is at max-AM limit, the scheduler re-assigns the 
> preempted container back to the running app.






[jira] [Created] (YARN-10617) Fifo and Fair intra-queue preemption goes on indefinitely when apps are in pending state due to max AM limit reached

2021-02-08 Thread VADAGA ANANYO RAO (Jira)
VADAGA ANANYO RAO created YARN-10617:


 Summary: Fifo and Fair intra-queue preemption goes on indefinitely 
when apps are in pending state due to max AM limit reached
 Key: YARN-10617
 URL: https://issues.apache.org/jira/browse/YARN-10617
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacity scheduler
Affects Versions: 3.1.1
Reporter: VADAGA ANANYO RAO
Assignee: VADAGA ANANYO RAO


This case occurs when:
1. an application gets submitted in a cluster running at max-AM limit.
2. The new job requests AM resource. So it has 1 pending request.
3. To fulfil this request, the preemption logic preempts 1 resource from a 
running app.
4. Because the cluster is at max-AM limit, the scheduler re-assigns the 
preempted container back to the running app.






[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-02-05 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0009.patch

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, 
> YARN-10559.0005.patch, YARN-10559.0006.patch, YARN-10559.0007.patch, 
> YARN-10559.0008.patch, YARN-10559.0009.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Use case:
> Due to the way Capacity Scheduler preemption works, if a single user submits 
> a large application to a queue (using 100% of resources), that job will not 
> be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for 
> completion of the long-running application. This prevents multiple 
> long-running, large applications from running concurrently.
> Support fair sharing among apps while preempting applications from the same queue.






[jira] [Comment Edited] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-02-05 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276904#comment-17276904
 ] 

VADAGA ANANYO RAO edited comment on YARN-10559 at 2/5/21, 2:29 PM:
---

Hi [~epayne], thank you for your earlier response. We would like your 
suggestions on how to better handle cases with multiple users. Currently, we are 
considering the following formula to calculate the fair share per app:
{code:java}
foreach user:
  if (tq.leafqueue.getUserLimit == 100)
    fairSharePerApp = total Queue Capacity / # of apps of that user;
  else
    fairSharePerApp = UL / # of apps of that user;
{code}
So, according to the above formula, say we have a scenario with 2 users and 
UserLimit = 100%:

User1 (UL = 100%, fairSharePerUser = 50%)
 * App1 (fairSharePerApp = 50%)

User2 (UL = 100%, fairSharePerUser = 50%)
 * App2 (fairSharePerApp = 25%)
 * App3 (fairSharePerApp = 25%)

Do you see any shortcomings in this formula, or can you suggest better ways of 
handling the multiple-user case? I would really appreciate it :)
cc: [~sunilg] [~wangda]






[jira] [Comment Edited] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-02-05 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276904#comment-17276904
 ] 

VADAGA ANANYO RAO edited comment on YARN-10559 at 2/5/21, 2:29 PM:
---

Hi [~epayne], thank you for your earlier response. We would like your 
suggestions on how to better handle cases with multiple users. Currently, we are 
considering the following formula to calculate the fair share per app:
{code:java}
foreach user:
  if (tq.leafqueue.getUserLimit == 100)
    fairSharePerApp = total Queue Capacity / # of apps of that user;
  else
    fairSharePerApp = UL / # of apps of that user;
{code}
So, according to the above formula, say we have a scenario with 2 users and 
UserLimit = 100%:

User1 (UL = 100%, fairSharePerUser = 100%)
 * App1 (fairSharePerApp = 100%)

User2 (UL = 100%, fairSharePerUser = 100%)
 * App2 (fairSharePerApp = 50%)
 * App3 (fairSharePerApp = 50%)

Do you see any shortcomings in this formula, or can you suggest better ways of 
handling the multiple-user case? I would really appreciate it :)
cc: [~sunilg] [~wangda]
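
For readers who want to try the numbers, the formula above can be sketched as runnable Java. This is only an illustration, not code from any attached patch: the class name, method name, and parameters are assumptions, all values are percentages of the queue, and the `== 100` test is written as `>= 100.0` to avoid an exact floating-point comparison.

```java
final class FairSharePerAppSketch {
  private FairSharePerAppSketch() { }

  /**
   * Per-app fair share in percent of the queue, following the formula in the
   * comment above. All names here are hypothetical.
   */
  static double fairSharePerApp(double userLimitPct, double queueCapacityPct,
      int appsOfUser) {
    if (appsOfUser <= 0) {
      throw new IllegalArgumentException("a user must have at least one app");
    }
    // UL of 100% means the user limit does not constrain the user, so the
    // whole queue capacity is divided among that user's apps.
    double base = userLimitPct >= 100.0 ? queueCapacityPct : userLimitPct;
    return base / appsOfUser;
  }

  public static void main(String[] args) {
    // User1 has one app, User2 has two apps, UL = 100% for both.
    System.out.println("User1 per app: " + fairSharePerApp(100.0, 100.0, 1));
    System.out.println("User2 per app: " + fairSharePerApp(100.0, 100.0, 2));
  }
}
```

With the two-user scenario above this gives 100% for App1 and 50% each for App2 and App3, matching the listed values.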






[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-02-01 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276904#comment-17276904
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

Hi [~epayne], thank you for your earlier response. We would like your 
suggestions on how to better handle cases with multiple users. Currently, we are 
considering the following formula to calculate the fair share per app:
{code:java}
fairSharePerUser = total Queue Capacity / # of users
foreach user:
  if (tq.leafqueue.getUserLimit == 100)
    fairSharePerApp = fairSharePerUser / # of apps of that user;
  else
    fairSharePerApp = UL / # of apps of that user;
{code}
So, according to the above formula, say we have a scenario with 2 users and 
UserLimit = 100%:

User1 (UL = 100%, fairSharePerUser = 50%)
 * App1 (fairSharePerApp = 50%)

User2 (UL = 100%, fairSharePerUser = 50%)
 * App2 (fairSharePerApp = 25%)
 * App3 (fairSharePerApp = 25%)

Do you see any shortcomings in this formula, or can you suggest better ways of 
handling the multiple-user case? I would really appreciate it :)
cc: [~sunilg] [~wangda]
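
As a rough check of the arithmetic, the formula above (which first splits the queue evenly across users) could be written as the following Java sketch. The class, method, and parameter names are assumptions for illustration and do not come from the patch; values are percentages of the queue, and `== 100` is expressed as `>= 100.0`.

```java
final class PerUserFairShareSketch {
  private PerUserFairShareSketch() { }

  /** Even split of the queue capacity across users (hypothetical helper). */
  static double fairSharePerUser(double queueCapacityPct, int users) {
    if (users <= 0) {
      throw new IllegalArgumentException("at least one user is required");
    }
    return queueCapacityPct / users;
  }

  /** Per-app share: the user's share (or UL) split across that user's apps. */
  static double fairSharePerApp(double userLimitPct, double queueCapacityPct,
      int users, int appsOfUser) {
    if (appsOfUser <= 0) {
      throw new IllegalArgumentException("a user must have at least one app");
    }
    double base = userLimitPct >= 100.0
        ? fairSharePerUser(queueCapacityPct, users)
        : userLimitPct;
    return base / appsOfUser;
  }

  public static void main(String[] args) {
    // Two users, UL = 100%: User1 runs one app, User2 runs two.
    System.out.println("User1 per app: " + fairSharePerApp(100.0, 100.0, 2, 1));
    System.out.println("User2 per app: " + fairSharePerApp(100.0, 100.0, 2, 2));
  }
}
```

For the scenario above this yields 50% for App1 and 25% each for App2 and App3.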




[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-02-01 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276425#comment-17276425
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

[~epayne], thanks for catching this. This is a major bug in the code. I am 
already working on handling multiple-user scenarios and should have a patch 
that fixes this in a couple of days.




[jira] [Created] (YARN-10591) Allow USERLIMIT_FIRST as only valid configuration when FairOrdering is used

2021-01-21 Thread VADAGA ANANYO RAO (Jira)
VADAGA ANANYO RAO created YARN-10591:


 Summary: Allow USERLIMIT_FIRST as only valid configuration when 
FairOrdering is used
 Key: YARN-10591
 URL: https://issues.apache.org/jira/browse/YARN-10591
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacity scheduler
Reporter: VADAGA ANANYO RAO
Assignee: VADAGA ANANYO RAO


When FairOrderingPolicy is used in CapacityScheduler, we should allow only 
USERLIMIT_FIRST.

The alternative to USERLIMIT_FIRST is PRIORITY_FIRST. However, application 
priorities work against fairness, so we should not allow PRIORITY_FIRST to be 
set.

This Jira adds a validation check to ensure that only USERLIMIT_FIRST is set 
when FairOrderingPolicy is in use.

cc: [~sunilg] [~leftnoteasy] [~epayne]
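
A validation check of the kind described above might look roughly like the Java sketch below. It is an assumption-laden illustration only: the enum, the method, and the boolean flag are hypothetical stand-ins and are not the CapacityScheduler configuration API.

```java
final class FairOrderingValidationSketch {
  private FairOrderingValidationSketch() { }

  /** Hypothetical stand-in for the two preemption orders named above. */
  enum PreemptionOrder { USERLIMIT_FIRST, PRIORITY_FIRST }

  /**
   * Rejects any order other than USERLIMIT_FIRST when the fair ordering
   * policy is in use; other ordering policies are left unrestricted.
   */
  static void validate(boolean fairOrderingPolicyInUse, PreemptionOrder order) {
    if (fairOrderingPolicyInUse && order != PreemptionOrder.USERLIMIT_FIRST) {
      throw new IllegalArgumentException(
          "Only USERLIMIT_FIRST is allowed with FairOrderingPolicy, got " + order);
    }
  }

  public static void main(String[] args) {
    validate(true, PreemptionOrder.USERLIMIT_FIRST);  // accepted
    validate(false, PreemptionOrder.PRIORITY_FIRST);  // accepted, not fair ordering
    try {
      validate(true, PreemptionOrder.PRIORITY_FIRST); // rejected
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing fast at configuration time is one possible design; whether the real check should throw or fall back to USERLIMIT_FIRST with a warning is left open here.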

 






[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-20 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0008.patch




[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-19 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0007.patch




[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-18 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0006.patch




[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-18 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: (was: YARN-10559.0006.patch)




[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-18 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0006.patch




[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-18 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: (was: YARN-10559.0006.patch)




[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-18 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267394#comment-17267394
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

Updated the latest patch to skip user headroom calculations when 
FairOrderingPolicy is used. This was missed in the previous patch.

Skipping the user headroom calculation is required because two jobs from the 
same user may have an unfair resource distribution, so we may still want to 
consider that user for preemption even if the user has reached its maximum 
headroom.

cc: [~sunilg]
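
The effect of skipping the headroom check can be illustrated with a small Java sketch. Everything in it is an assumption made for illustration (the method name, the parameters, and the idea that headroom is simply "used below limit"); it is not the logic from the patch.

```java
final class HeadroomCheckSketch {
  private HeadroomCheckSketch() { }

  /**
   * Decides whether a user's apps are still considered during intra-queue
   * preemption. With fair ordering the headroom check is bypassed, because
   * two apps of the same user may hold unequal shares even when the user as
   * a whole is already at its limit.
   */
  static boolean considerUser(boolean fairOrderingPolicyInUse,
      long userUsed, long userLimit) {
    if (fairOrderingPolicyInUse) {
      return true; // headroom calculation skipped
    }
    return userUsed < userLimit; // hypothetical headroom test
  }

  public static void main(String[] args) {
    // A user exactly at its limit: ignored without fair ordering, kept with it.
    System.out.println(considerUser(false, 100L, 100L));
    System.out.println(considerUser(true, 100L, 100L));
  }
}
```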




[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-18 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0006.patch




[jira] [Comment Edited] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-12 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263252#comment-17263252
 ] 

VADAGA ANANYO RAO edited comment on YARN-10559 at 1/12/21, 11:09 AM:
-

Just FYI, the UT failure is not related to the code changes in this JIRA.


was (Author: ananyo_rao):
Just FYI, the UT failure is not related to the code changes.




[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-12 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263252#comment-17263252
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

Just FYI, the UT failure is not related to the code changes.




[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-11 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263135#comment-17263135
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

Thank you [~sunilg] and [~wangda] for the comments.

I have tried to address the comments in the new patch.

The changes are as follows:
 # Fixed the *checkstyle* warnings.
 # Added a *check for 0 active apps* in the leaf queue. _Note:_ we do not 
need to check for 0 active apps per user, because users are temporary objects 
initialised when apps are created, so each user has at least one 
corresponding active app.
 # Ensured that *fair-share is a non-negative* value. Fair-share can still be 
0, so no check was added for that. Fair-share can be 0 in the following 
situations _(a non-exhaustive list)_:
 ## User-limit is somehow 0 for a user
 ## queueReassignableResources is 0
 ## No running apps in the queue
 # If a container needs to be skipped from preemption in 
*skipContainerBasedOnIntraQueuePolicy*, we continue checking the other 
containers of the app instead of breaking out of the app.
 # Added additional *logs* for when we reduce actually_to_be_preempted of an 
app.

Thank you [~sunilg] for catching these points. Please let me know if any other 
points need to be addressed.
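
Two of the changes listed above (the zero-app and non-negative guards on fair-share, and continuing past a skipped container rather than breaking) can be sketched in Java as follows. This is a simplified illustration under assumed names and types; it does not reproduce the patch, and resources are reduced to a single `long` for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class PreemptionLoopSketch {
  private PreemptionLoopSketch() { }

  /** Fair share per app, guarded against zero active apps and negatives. */
  static long fairShare(long reassignable, int activeApps) {
    if (activeApps == 0) {
      return 0L; // no running apps in the queue
    }
    return Math.max(0L, reassignable / activeApps);
  }

  /**
   * Walks all containers of an app. A skipped container does not end the
   * loop; the remaining containers are still examined.
   */
  static <T> List<T> selectForPreemption(List<T> containers, Predicate<T> skip) {
    List<T> selected = new ArrayList<>();
    for (T container : containers) {
      if (skip.test(container)) {
        continue; // previously a break would have dropped later containers
      }
      selected.add(container);
    }
    return selected;
  }

  public static void main(String[] args) {
    System.out.println(fairShare(100L, 4));
    List<String> containers = new ArrayList<>();
    containers.add("c1");
    containers.add("am");
    containers.add("c2");
    System.out.println(selectForPreemption(containers, "am"::equals));
  }
}
```

Here the container named "am" is skipped, while "c2", which comes after it, is still selected.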

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch, 
> YARN-10559.0005.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Usecase:
> Due to the way Capacity Scheduler preemption works, If a single user submits 
> a large application to a queue (using 100% of resources), that job will not 
> be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for 
> completion of the long running application. This prevents multiple long 
> running, large, applications from running concurrently.
> Support fair sharing among apps while preempting applications from same queue.






[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-11 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0005.patch







[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-11 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: (was: YARN-10559.0005.patch)







[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-11 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0005.patch







[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-10 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0004.patch

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Fix For: 3.1.4
>
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch, YARN-10559.0003.patch, YARN-10559.0004.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Usecase:
> Due to the way Capacity Scheduler preemption works, If a single user submits 
> a large application to a queue (using 100% of resources), that job will not 
> be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for 
> completion of the long running application. This prevents multiple long 
> running, large, applications from running concurrently.
> Support fair sharing among apps while preempting applications from same queue.






[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-10 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0003.patch

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Fix For: 3.1.4
>
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch, YARN-10559.0003.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Usecase:
> Due to the way Capacity Scheduler preemption works, If a single user submits 
> a large application to a queue (using 100% of resources), that job will not 
> be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for 
> completion of the long running application. This prevents multiple long 
> running, large, applications from running concurrently.
> Support fair sharing among apps while preempting applications from same queue.






[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-08 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0002.patch

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Assignee: VADAGA ANANYO RAO
>Priority: Major
> Fix For: 3.1.4
>
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> FairOP_preemption-design_doc_v2.pdf, YARN-10559.0001.patch, 
> YARN-10559.0002.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Usecase:
> Due to the way Capacity Scheduler preemption works, If a single user submits 
> a large application to a queue (using 100% of resources), that job will not 
> be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for 
> completion of the long running application. This prevents multiple long 
> running, large, applications from running concurrently.
> Support fair sharing among apps while preempting applications from same queue.






[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-08 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: (was: YARN-10559.0002.patch)







[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-07 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17261080#comment-17261080
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

[~sunilg] I have updated the latest patch and design doc. The FairShare 
calculation for preemption in this patch is:
{code:java}
// UL = user limit; all quantities are shares of the queue
fairSharePerApp = totalQueueCapacity / numAppsInQueue;
idealFairSharePerAppWithUL = UL / numAppsOfUser;
if (numAppsOfUser * fairSharePerApp > UL) {
   fairSharePerApp = idealFairSharePerAppWithUL;
}
{code}
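
The calculation above might be written as a small self-contained Java method along these lines. This is a sketch under stated assumptions rather than the code in the patch: the class and parameter names are made up for illustration, shares are plain percentages instead of YARN Resource objects, and the zero-app guard and non-negative clamp are additions for safety.

```java
// Hypothetical sketch of the fair-share rule from the pseudocode; not the patch itself.
final class FairShareSketch {

    /**
     * @param queueCapacity total queue capacity, as a percentage
     * @param appsInQueue   number of apps in the queue
     * @param userLimit     user limit (UL) of the app's user, as a percentage
     * @param appsOfUser    number of apps belonging to that user
     */
    static double fairSharePerApp(double queueCapacity, int appsInQueue,
                                  double userLimit, int appsOfUser) {
        // Assumption: with no apps there is nothing to share.
        if (appsInQueue <= 0 || appsOfUser <= 0) {
            return 0.0;
        }
        double fairShare = queueCapacity / appsInQueue;
        double idealFairShareWithUL = userLimit / appsOfUser;
        // If the user's apps together would exceed the user limit, cap each app.
        if (appsOfUser * fairShare > userLimit) {
            fairShare = idealFairShareWithUL;
        }
        return Math.max(0.0, fairShare);
    }

    public static void main(String[] args) {
        // Three apps in the queue, user with two apps, UL = 50%.
        System.out.println(fairSharePerApp(100.0, 3, 50.0, 2));
        // Same queue, user with one app, UL = 50%.
        System.out.println(fairSharePerApp(100.0, 3, 50.0, 1));
    }
}
```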
Say we have 2 users and the following scenario with UL = 100%:
{noformat}
User1: (UL: 100%)
-App1 (used = 50%, pending = 100%, FS = 33%)
-App2 (used = 50%, pending = 100%, FS = 33%)
User2: (UL: 100%)
-App3 (used = 0%, pending = 100%, FS = 33%)
{noformat}
We will take ~17% of resources from each of app1 and app2 to give to app3.

Now, say we have 2 users with UL = 50%:
{noformat}
User1: (UL: 50%)
-App1 (used = 50%, pending = 100%, FS = 25%)
-App2 (used = 50%, pending = 100%, FS = 25%)
User2: (UL: 50%)
-App3 (used = 0%, pending = 100%, FS = 33%)
{noformat}
We will take ~17% of resources from each of app1 and app2 to give to app3. We 
may be under-preempting from user1 in this case, but we ensure user2 is not 
heavily starved.
This logic seems to function closest to the FairOrderingPolicy in 
CapacityScheduler.
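
As a sanity check, the figures in the two scenarios can be re-derived by hand from the calculation given earlier in this comment. The even split of the preempted amount between app1 and app2 is an assumption read off the ~17% figure; the comment does not state it explicitly.

```latex
\begin{align*}
\text{Equal share per app:}\quad & 100\% / 3 \approx 33.3\% \\[4pt]
\text{UL} = 100\%,\ \text{User1:}\quad & 2 \times 33.3\% = 66.7\% \le 100\%
  \;\Rightarrow\; FS_{app1} = FS_{app2} \approx 33.3\% \\
\text{Excess per app:}\quad & 50\% - 33.3\% \approx 16.7\% \;(\approx 17\%) \\[4pt]
\text{UL} = 50\%,\ \text{User1:}\quad & 2 \times 33.3\% = 66.7\% > 50\%
  \;\Rightarrow\; FS_{app1} = FS_{app2} = 50\% / 2 = 25\% \\
\text{UL} = 50\%,\ \text{User2:}\quad & 1 \times 33.3\% \le 50\%
  \;\Rightarrow\; FS_{app3} \approx 33.3\% \\
\text{Taken from each of app1, app2:}\quad & 33.3\% / 2 \approx 16.7\% \\
\text{Left with app1, app2:}\quad & 50\% - 16.7\% \approx 33.3\% > 25\%
\end{align*}
```

In the UL = 50% case app1 and app2 each end up holding about 33% against a fair share of 25%, which is the under-preemption from user1 mentioned above, while app3 still reaches its 33% share.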

A comparison with the other algorithms considered is in design_doc_v2.
Once again, thank you [~sunilg] for your guidance on the design.
[~epayne] [~leftnoteasy] Please review the latest patch (patch-0002).
Thank you.







[jira] [Updated] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-07 Thread VADAGA ANANYO RAO (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VADAGA ANANYO RAO updated YARN-10559:
-
Attachment: YARN-10559.0002.patch
FairOP_preemption-design_doc_v2.pdf







[jira] [Commented] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-04 Thread VADAGA ANANYO RAO (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258682#comment-17258682
 ] 

VADAGA ANANYO RAO commented on YARN-10559:
--

[~sunilg] Thank you for helping with the design. Could you please review the 
uploaded patch? In patch-0001, the fair-share for preemption is calculated as: 
*"FairShare per app = UserLimit of the user of the app / Number of apps for 
that user"*.
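
The patch-0001 formula can be sketched in a few lines of Java. This is an illustrative sketch only: the class and method names are hypothetical, the user limit is treated as a plain percentage of the queue, and the zero-app guard is an added assumption.

```java
// Hypothetical sketch of the patch-0001 rule: UserLimit / number of apps of that user.
final class UserLimitFairShareSketch {

    static double fairSharePerApp(double userLimit, int appsOfUser) {
        // Assumption: a user with no apps gets no share.
        if (appsOfUser <= 0) {
            return 0.0;
        }
        return Math.max(0.0, userLimit / appsOfUser);
    }

    public static void main(String[] args) {
        // A user with two apps and UL = 100%.
        System.out.println(fairSharePerApp(100.0, 2)); // prints 50.0
        // A second user with one app and UL = 100%.
        System.out.println(fairSharePerApp(100.0, 1)); // prints 100.0
    }
}
```

With UL = 100% for both users, the per-app shares in this example add up to 200% of the queue. The 2021-01-07 comment on this issue describes a revised calculation that starts from an equal queue-wide share per app.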

+ [~wangda]

> Fair sharing intra-queue preemption support in Capacity Scheduler
> -
>
> Key: YARN-10559
> URL: https://issues.apache.org/jira/browse/YARN-10559
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.1.4
>Reporter: VADAGA ANANYO RAO
>Priority: Major
> Fix For: 3.1.4
>
> Attachments: FairOP_preemption-design_doc_v1.pdf, 
> YARN-10559.0001.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Usecase:
> Due to the way Capacity Scheduler preemption works, If a single user submits 
> a large application to a queue (using 100% of resources), that job will not 
> be preempted by future applications from the same user within the same queue. 
> This implies that the later applications will be forced to wait for 
> completion of the long running application. This prevents multiple long 
> running, large, applications from running concurrently.
> Support fair sharing among apps while preempting applications from same queue.






[jira] [Created] (YARN-10559) Fair sharing intra-queue preemption support in Capacity Scheduler

2021-01-04 Thread VADAGA ANANYO RAO (Jira)
VADAGA ANANYO RAO created YARN-10559:


 Summary: Fair sharing intra-queue preemption support in Capacity 
Scheduler
 Key: YARN-10559
 URL: https://issues.apache.org/jira/browse/YARN-10559
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 3.1.4
Reporter: VADAGA ANANYO RAO
 Fix For: 3.1.4
 Attachments: FairOP_preemption-design_doc_v1.pdf

Usecase:

Due to the way Capacity Scheduler preemption works, if a single user submits a 
large application to a queue (using 100% of resources), that job will not be 
preempted by future applications from the same user within the same queue. This 
implies that the later applications will be forced to wait for completion of 
the long-running application. This prevents multiple long-running, large 
applications from running concurrently.

Support fair sharing among apps while preempting applications from the same queue.


