Weiwei Yang created YUNIKORN-677:
------------------------------------

             Summary: Potential resource leak when complete and allocate pod 
happens simultaneously
                 Key: YUNIKORN-677
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-677
             Project: Apache YuniKorn
          Issue Type: Bug
            Reporter: Weiwei Yang


Let's say we have an app with 1 pod that needs scheduling. The shim submits 
the app to the core and starts scheduling the pod. On the shim side, this is a 
task in the Scheduling state. We then have a race if the following things 
happen simultaneously:
# The user deletes the pod. This triggers a CompleteTask event on the shim 
side, and the shim sends a ReleaseAllocationAskRequest to the core.
# Before handling the ReleaseAllocationAskRequest from the shim, the core makes 
an allocation for the given pod and sends an Allocation to the shim.

The core then generates an allocation on a node, receives the release 
request, and deletes the pending ask. The shim side receives the new 
allocation, but because the pod has already been deleted, the shim ignores it. 
In this case, the allocation is left over in the core, causing a resource leak.
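
The sequence above can be sketched as a small, deterministic model. This is only an illustration of the ordering problem, not YuniKorn code: the Core and Shim classes, their methods, and the names "pod-1" and "node-1" are all hypothetical, and the model assumes the ask record stays in place until the release request removes it.

```python
# Hypothetical model of the race. Class and method names are illustrative
# and are NOT the actual YuniKorn core or shim APIs.
from dataclasses import dataclass, field


@dataclass
class Core:
    pending_asks: set = field(default_factory=set)
    allocations: dict = field(default_factory=dict)  # task id -> node name

    def submit_ask(self, task_id):
        self.pending_asks.add(task_id)

    def allocate(self, task_id, node):
        # Record an allocation on a node. Assumption: the ask record is kept
        # until a release request removes it.
        self.allocations[task_id] = node
        return task_id, node

    def release_ask(self, task_id):
        # Removes only the pending ask; an existing allocation is untouched.
        self.pending_asks.discard(task_id)


@dataclass
class Shim:
    live_pods: set = field(default_factory=set)

    def on_allocation(self, task_id, node):
        # Returns True if the allocation is accepted. An allocation for a
        # pod that no longer exists is ignored, and nothing is sent back.
        return task_id in self.live_pods


def simulate_race():
    core = Core()
    shim = Shim(live_pods={"pod-1"})
    core.submit_ask("pod-1")

    shim.live_pods.remove("pod-1")            # user deletes the pod; release request in flight
    alloc = core.allocate("pod-1", "node-1")  # core allocates before handling the release
    core.release_ask("pod-1")                 # release arrives late, removes only the ask
    accepted = shim.on_allocation(*alloc)     # shim ignores the allocation
    return core, accepted
```

After simulate_race() returns, the core holds no pending ask for "pod-1" but still records an allocation for it on "node-1", while the shim has rejected that allocation. Nothing in this model ever releases that entry, which corresponds to the leak described above.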



