[ https://issues.apache.org/jira/browse/YUNIKORN-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weiwei Yang resolved YUNIKORN-574.
----------------------------------
Fix Version/s: 0.10
Resolution: Fixed
> Wait for placeholder cleanup
> ----------------------------
>
> Key: YUNIKORN-574
> URL: https://issues.apache.org/jira/browse/YUNIKORN-574
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: core - scheduler
> Reporter: Wilfred Spiegelenburg
> Assignee: Kinga Marton
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.10
>
>
> When we clean up the application in {{timeoutPlaceholderProcessing()}} we
> have two cases.
> * In the first case we clean up all lingering placeholder allocations on the
> running app.
> * The second case is the failure of the app, which cleans up lingering asks
> (no response needed from the shim) and all placeholders, after which we fail
> the app.
> The cleanup of the placeholders in both these cases is instigated by the
> core, and we need to wait for the cleanup to happen on the shim side before
> we proceed. It is not like the removal of the app signalled by the RM: this
> comes as an unexpected request for the shim, not when the app is deleted on
> the shim side.
> For case 1 we do not have a problem. The placeholders are terminated and the
> app runs as per normal and is not moved to Completed until all is finished.
> We do NOT have an issue in the states leading to Completed, as we have
> already handled it there (see below).
> For the failure case we immediately unlink the queue as we move into the
> FAILED state, because the move calls {{moveTerminatedApp()}} via the
> callback. That causes an issue: we should be waiting for the shim to respond
> back to the core with the confirmation of the removal.
> This might require a new state to do this in two steps: trigger the cleanup
> and move to the Failing state, then, when all is cleaned up, move to Failed.
> BTW: introducing a new state for Failing should also include the rename of
> Waiting to Completing, as that is in line with what the state does and lines
> up between the two final states.
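
The two-step failure handling proposed in the description could be sketched
roughly as follows. This is a minimal illustration only, not the YuniKorn core
code: the App type, its fields, and the methods TimeoutPlaceholders and
ConfirmRelease are hypothetical names chosen for this example, and the queue
unlink is reduced to a boolean flag. The point it shows is that the app stays
in Failing, still linked to its queue, until the shim has confirmed every
placeholder release.

```go
package main

import "fmt"

// State is a hypothetical application state. The names follow the issue
// text and are not the actual YuniKorn core types.
type State int

const (
	Running State = iota
	Failing       // cleanup triggered, waiting for the shim to confirm
	Failed        // every placeholder release has been confirmed
)

// App is an illustrative stand-in for a scheduler application.
type App struct {
	state           State
	pendingReleases map[string]bool // placeholder IDs awaiting shim confirmation
	queueLinked     bool            // stands in for the link to the queue
}

// NewApp creates a running app with the given lingering placeholders.
func NewApp(placeholders ...string) *App {
	pending := make(map[string]bool, len(placeholders))
	for _, id := range placeholders {
		pending[id] = true
	}
	return &App{state: Running, pendingReleases: pending, queueLinked: true}
}

// TimeoutPlaceholders is step one: trigger the cleanup and move to Failing.
// The queue link is kept until the shim has confirmed all releases.
func (a *App) TimeoutPlaceholders() error {
	if a.state != Running {
		return fmt.Errorf("cannot start failure handling in state %d", a.state)
	}
	a.state = Failing
	if len(a.pendingReleases) == 0 {
		a.finishFailure() // nothing to wait for
	}
	return nil
}

// ConfirmRelease is step two: called once per release the shim confirms.
func (a *App) ConfirmRelease(id string) error {
	if a.state != Failing {
		return fmt.Errorf("unexpected release confirmation in state %d", a.state)
	}
	if !a.pendingReleases[id] {
		return fmt.Errorf("unknown placeholder %q", id)
	}
	delete(a.pendingReleases, id)
	if len(a.pendingReleases) == 0 {
		a.finishFailure()
	}
	return nil
}

// finishFailure moves to the final state and only then unlinks the queue.
func (a *App) finishFailure() {
	a.state = Failed
	a.queueLinked = false
}

func main() {
	app := NewApp("ph-1", "ph-2")
	_ = app.TimeoutPlaceholders()
	fmt.Println(app.state == Failing, app.queueLinked) // true true
	_ = app.ConfirmRelease("ph-1")
	_ = app.ConfirmRelease("ph-2")
	fmt.Println(app.state == Failed, app.queueLinked) // true false
}
```

Keeping the unlink inside finishFailure means there is a single place where the
app leaves the queue, whether or not any confirmations were outstanding.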