[jira] [Commented] (YUNIKORN-99) Enhanced FIFO scheduling for batch workloads

2020-06-15 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136306#comment-17136306
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-99:
---

Updated the design document with all issues fixed, attached as a PDF: 
[^FIFO Scheduling for batch workloads.pdf] 

> Enhanced FIFO scheduling for batch workloads
> 
>
> Key: YUNIKORN-99
> URL: https://issues.apache.org/jira/browse/YUNIKORN-99
> Project: Apache YuniKorn
>  Issue Type: New Feature
>  Components: core - scheduler
>Reporter: Weiwei Yang
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.9
>
> Attachments: FIFO Scheduling for batch workloads.pdf
>
>
> An enhanced version of FIFO scheduling for batch workloads



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@yunikorn.apache.org
For additional commands, e-mail: issues-h...@yunikorn.apache.org



[jira] [Commented] (YUNIKORN-99) Enhanced FIFO scheduling for batch workloads

2020-06-09 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129974#comment-17129974
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-99:
---

PR created for the documentation.







[jira] [Commented] (YUNIKORN-99) Enhanced FIFO scheduling for batch workloads

2020-05-22 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114130#comment-17114130
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-99:
---

Code changes committed; leaving the issue open to add documentation.







[jira] [Commented] (YUNIKORN-99) Enhanced FIFO scheduling for batch workloads

2020-05-20 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112208#comment-17112208
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-99:
---

First part of the documentation, which will need to be added when done:
*Application states*
* New: a new app that is still incomplete. The app transitions into the 
Accepted state once it is complete (i.e. an ask has been added).
* Accepted: the application is ready and part of the scheduling cycle. On 
allocation of the first ask the app moves into the Starting state.
* Starting: the app has exactly one running container/pod. The application 
transitions to Running after more allocations are added to the app, or if no 
more pods are expected (timeout, or just one container/pod for the app).
* Running: the normal state for the application. Containers/pods can start and 
stop and are scheduled when requested.
* Waiting: an app that has no pending asks or running containers/pods waits 
for new asks to be added and then moves back into Running.
* Completed: the app has signalled it is done and is not expected to return.
* Killed: removed by the shim at the request of an admin or user.
* Rejected: the application was rejected when it was added to the scheduler. 
This only happens when a shim tries to add an app, which would be created in 
the New state, and the scheduler rejects the creation.
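
The states listed above can be read as a small transition table. The sketch 
below is only an illustration of that reading, not the scheduler's code: the 
names AppState, TRANSITIONS and can_transition are hypothetical, and allowing 
Completed and Killed from every non-final state is an assumption, since the 
list does not say which states they can be reached from.

```python
from enum import Enum


class AppState(Enum):
    NEW = "New"
    ACCEPTED = "Accepted"
    STARTING = "Starting"
    RUNNING = "Running"
    WAITING = "Waiting"
    COMPLETED = "Completed"
    KILLED = "Killed"
    REJECTED = "Rejected"


# Transitions that the state descriptions mention explicitly.
TRANSITIONS = {
    AppState.NEW: {AppState.ACCEPTED, AppState.REJECTED},
    AppState.ACCEPTED: {AppState.STARTING},
    AppState.STARTING: {AppState.RUNNING},
    AppState.RUNNING: {AppState.WAITING},
    AppState.WAITING: {AppState.RUNNING},
}

# States an app is not expected to leave again.
FINAL_STATES = {AppState.COMPLETED, AppState.KILLED, AppState.REJECTED}


def can_transition(src: AppState, dst: AppState) -> bool:
    """Return True if moving from src to dst fits the descriptions above."""
    if src in FINAL_STATES:
        return False
    if dst in (AppState.COMPLETED, AppState.KILLED):
        # Assumption: an app can be completed or killed from any non-final state.
        return True
    return dst in TRANSITIONS.get(src, set())
```

For example, can_transition(AppState.NEW, AppState.RUNNING) is False because a 
new app has to pass through Accepted and Starting first.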







[jira] [Commented] (YUNIKORN-99) Enhanced FIFO scheduling for batch workloads

2020-05-13 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YUNIKORN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106362#comment-17106362
 ] 

Wilfred Spiegelenburg commented on YUNIKORN-99:
---

PR opened with the base changes for the state-aware scheduling of applications. 

Two new states were added to the app: waiting and starting. An app now moves 
from the new state to the accepted state when the first ask is added to the 
application. As soon as that ask is allocated the app moves from accepted to 
starting. From starting the app moves to running. An app can stay in the 
starting state for a maximum of 5 minutes, or it moves earlier if more 
allocations are added to the application. The rest of the time the app will 
spend in the running state.
It can leave the running state and move to waiting if there are no outstanding 
asks and no allocations for that app. That means the app is not done but there 
is nothing scheduled for the app. If a new ask gets added the app moves back to 
running and gets scheduled as normal.
Applications can be killed or marked as completed by the RM if it can determine 
the state. The scheduler does not move the app to those states itself.
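
As a rough illustration of the two rules described above (leaving the starting 
state, and switching between running and waiting), a minimal sketch could look 
like the following. The function names and parameters are hypothetical, and 
treating exactly 5 minutes as "timed out" is an assumption about the boundary.

```python
from datetime import datetime, timedelta

# Maximum time an app may stay in the starting state, as stated above.
STARTING_TIMEOUT = timedelta(minutes=5)


def should_leave_starting(entered_starting: datetime,
                          allocation_count: int,
                          now: datetime) -> bool:
    """Decide whether an app in the starting state should move to running."""
    # More than one allocation: the app moves before the timeout expires.
    if allocation_count > 1:
        return True
    # Otherwise the app moves once the timeout has been reached.
    return now - entered_starting >= STARTING_TIMEOUT


def running_or_waiting(pending_asks: int, allocations: int) -> str:
    """Pick the state for an app that is past the starting state."""
    # Nothing outstanding and nothing allocated: not done, but nothing scheduled.
    if pending_asks == 0 and allocations == 0:
        return "waiting"
    return "running"
```

With a single allocation the app stays in starting until the timeout, while a 
second allocation moves it to running right away.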

The state-aware policy for applications in a queue leverages the new states to 
sort the applications. The logic for sorting applications in a queue is as 
follows:
- only apps with pending resources are scheduled
- apps are sorted based on submission time, oldest app first
- all running applications are candidates
- a maximum of one app in the starting state will be added to the list of 
running apps
- if the queue contains no (0) starting apps, with or without pending 
resources, the oldest app in the accepted state will be added

The queue will then use that list of apps to schedule.
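
The selection rules above can be sketched as follows. This is one possible 
reading, not the committed implementation: the App class, its fields and the 
function state_aware_candidates are hypothetical, pending resources are 
reduced to a single number, and returning the final list ordered by submission 
time is an assumption.

```python
from dataclasses import dataclass


@dataclass
class App:
    app_id: str
    state: str          # "accepted", "starting", "running", ...
    submit_time: float  # submission time, e.g. seconds since the epoch
    pending: int        # pending resources, simplified to one number


def state_aware_candidates(apps):
    """Return the apps a queue would try to schedule, oldest first."""
    # Checked on all apps: a starting app blocks accepted apps
    # with or without pending resources.
    has_starting = any(a.state == "starting" for a in apps)

    # Only apps with pending resources are scheduled, oldest app first.
    pending = sorted((a for a in apps if a.pending > 0),
                     key=lambda a: a.submit_time)

    # All running applications are candidates.
    candidates = [a for a in pending if a.state == "running"]

    if has_starting:
        # At most one starting app is added to the running apps.
        extra = next((a for a in pending if a.state == "starting"), None)
    else:
        # No starting app in the queue: add the oldest accepted app.
        extra = next((a for a in pending if a.state == "accepted"), None)
    if extra is not None:
        candidates.append(extra)

    candidates.sort(key=lambda a: a.submit_time)
    return candidates
```

With one running app and two accepted apps, only the older accepted app joins 
the running app in the list. The second accepted app has to wait until the 
first one has left the starting state.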

On recovery, apps that have existing allocations are considered to be in a 
running state.



