[ https://issues.apache.org/jira/browse/YUNIKORN-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437361#comment-17437361 ]
Manikandan R commented on YUNIKORN-462:
---------------------------------------

While we are making progress on the other sub-tasks, I had an offline discussion with [~wilfreds], based on earlier comments, to bring closure on this. The discussion was mostly around: "It is good to merge the code, i.e. moving the assumepod/forgotpod methods into the add allocations/release allocations code flow as discussed earlier. But should we do this synchronously?"

[~wwei] Can you please take a look at this?

Summary of the offline discussion (copy-pasted):

If the core is updated, the checks in the core will fail before we even call out to the shim to check anything. For instance, if we make an allocation in the core, the node is updated. The next allocation will thus see a node with fewer resources. If we call out to the shim to make any checks, it means that the allocation should fit even after the shim is updated.

The delete will trigger the updates needed in the shim cache when it gets processed. The allocation will do the same. That should be guaranteed by the shim code.

The sync code was introduced when there were two layers in the core: a cache and the scheduler. The core could do things while the scheduler cache was not updated. That could cause strange issues. Now we have just one layer in the scheduler. The core now always sees the right info. If an allocation is made, all objects in the core are consistent. That is what the decision is based on.

Only affinity-type predicates rely on the shim data. I thus think that the chance that a delete or new allocation affects the outcome of the checks is minimal. When the delete or allocation is processed, the shim should be up to where the core is.
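To make the argument above concrete, here is a minimal sketch in Go. It is not YuniKorn code: the types `core`, `shim`, `nodeView` and the functions `tryAllocate` and `sync` are hypothetical, and a node is reduced to a single "free" number. It assumes the core updates its own node synchronously on allocation while the shim cache applies the same change later.

```go
package main

import "fmt"

// nodeView is a hypothetical, simplified node: only free capacity is tracked.
type nodeView struct{ free int }

// core models the single scheduler layer; it updates its node immediately.
type core struct {
	node    nodeView
	pending []int // allocation sizes the shim has not applied yet
}

// shim models the cache used by predicates; it may lag behind the core.
type shim struct{ node nodeView }

// tryAllocate checks the core first. A request that fails in the core
// never reaches the shim-side check.
func (c *core) tryAllocate(size int, s *shim) bool {
	if size > c.node.free {
		return false
	}
	// Shim-side check: its free value is equal to or larger than the core's
	// while it lags, so it cannot reject what the core accepted on capacity.
	if size > s.node.free {
		return false
	}
	c.node.free -= size
	c.pending = append(c.pending, size)
	return true
}

// sync applies the queued allocations to the shim cache.
func (c *core) sync(s *shim) {
	for _, size := range c.pending {
		s.node.free -= size
	}
	c.pending = nil
}

func main() {
	c := &core{node: nodeView{free: 10}}
	s := &shim{node: nodeView{free: 10}}

	fmt.Println(c.tryAllocate(6, s))      // true
	fmt.Println(c.tryAllocate(6, s))      // false: the core already sees 4 free
	fmt.Println(c.node.free, s.node.free) // 4 10: the shim is still behind
	c.sync(s)
	fmt.Println(c.node.free, s.node.free) // 4 4: the shim has caught up
}
```

The second request is rejected by the core alone, even though the stale shim cache would still have accepted it. Affinity-style predicates, which depend on shim data, are outside what this sketch models.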
During the time that the shim could be behind, the core is more restrictive than the shim in its checks.

> Streamline core to shim update on allocation change
> ---------------------------------------------------
>
>                 Key: YUNIKORN-462
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-462
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: core - scheduler, shim - kubernetes
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Manikandan R
>            Priority: Major
>
> Currently in the scheduler we have two updates that get sent to the shim when
> an allocation is added or released:
> * event to shim RM event handler to allocate
> * reconciler plugin to update the shim caches
> Before YUNIKORN-317 one update was made in the cache, the other in the
> scheduler. Now they are both in the scheduler in quick succession. The cache
> update in the shim is needed to make sure that the predicates are seeing the
> correct info. The event does the real bind etc. of the allocation on the node.
> We should be able to fold the two calls into one call. However, this requires
> changes on both sides and might even impact the SI, as it will likely become a
> synced event call.