[ 
https://issues.apache.org/jira/browse/YUNIKORN-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Condit updated YUNIKORN-2099:
-----------------------------------
    Labels: release-notes  (was: )

> [Umbrella] State initialisation simplification (phase 2)
> --------------------------------------------------------
>
>                 Key: YUNIKORN-2099
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2099
>             Project: Apache YuniKorn
>          Issue Type: Improvement
>          Components: core - scheduler, shim - kubernetes
>            Reporter: Craig Condit
>            Assignee: Craig Condit
>            Priority: Major
>              Labels: release-notes
>             Fix For: 1.5.0
>
>
> On startup, all state of the cluster is rebuilt. This is currently called
> recovery, but the name is misleading: nothing is recovered, the current
> state is simply loaded. State initialisation is a better term to use.
> The current recovery code ties the loading of applications and tasks (pods)
> to node loading. This makes the recovery code complex and therefore fragile.
> In the worst case, it could lead to a pod not being recovered correctly.
> Recovery should be a step-by-step process with clearly bounded steps:
>  * load nodes
>  ** register nodes with the core
>  * load pods
>  ** create applications in core
>  ** register running pods as allocations with the core
>  ** register pending pods as asks with the core
>  * process changes for nodes and pods
>  * start scheduling
> No nodes, applications or asks on existing apps should be declined. Even if
> the queue does not exist, a running application must be added and handled.
> The current rejection of an application that cannot be placed in the queue
> is incorrect behaviour.
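
The step order described above can be illustrated with a minimal Go sketch.
It is not YuniKorn code: the Core and Pod types, their fields, and the
Initialise method are hypothetical stand-ins invented for this example. It
only shows the phases running strictly in sequence, and an application being
accepted even when its queue is not known.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a Kubernetes pod.
type Pod struct {
	Name  string
	AppID string
	Queue string
	Node  string // empty while the pod is still pending
}

// Core is a hypothetical in-memory stand-in for the scheduler core.
type Core struct {
	Queues      map[string]bool   // queues that exist in the configuration
	Nodes       map[string]bool   // registered nodes
	Apps        map[string]string // application ID -> requested queue
	Allocations []string          // running pods
	Asks        []string          // pending pods
	Scheduling  bool
}

// NewCore creates an empty core that knows the given queues.
func NewCore(queues ...string) *Core {
	c := &Core{
		Queues: map[string]bool{},
		Nodes:  map[string]bool{},
		Apps:   map[string]string{},
	}
	for _, q := range queues {
		c.Queues[q] = true
	}
	return c
}

// Initialise runs the phases in order: nodes, then pods, then scheduling.
func (c *Core) Initialise(nodes []string, pods []Pod) {
	// Step 1: load nodes and register them with the core.
	for _, n := range nodes {
		c.Nodes[n] = true
	}
	// Step 2: load pods.
	for _, p := range pods {
		// Create the application. It is never rejected, even when
		// c.Queues does not contain the requested queue.
		if _, ok := c.Apps[p.AppID]; !ok {
			c.Apps[p.AppID] = p.Queue
		}
		if p.Node != "" {
			// Running pod: register as an allocation.
			c.Allocations = append(c.Allocations, p.Name)
		} else {
			// Pending pod: register as an ask.
			c.Asks = append(c.Asks, p.Name)
		}
	}
	// Step 3: processing of node and pod changes that arrived while
	// loading is omitted from this sketch.
	// Step 4: start scheduling only after all state has been loaded.
	c.Scheduling = true
}

func main() {
	core := NewCore("root.default")
	core.Initialise(
		[]string{"node-1", "node-2"},
		[]Pod{
			{Name: "pod-a", AppID: "app-1", Queue: "root.default", Node: "node-1"},
			{Name: "pod-b", AppID: "app-1", Queue: "root.default"},
			{Name: "pod-c", AppID: "app-2", Queue: "root.missing", Node: "node-2"},
		},
	)
	fmt.Printf("nodes=%d apps=%d allocations=%d asks=%d scheduling=%t\n",
		len(core.Nodes), len(core.Apps), len(core.Allocations),
		len(core.Asks), core.Scheduling)
}
```

In this sketch app-2 asks for a queue that does not exist and is still
added. What the real core does with such an application afterwards is not
specified in the description above, so the sketch leaves it open.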



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

