[ 
https://issues.apache.org/jira/browse/YUNIKORN-2370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YUNIKORN-2370:
-----------------------------------
    Summary: Proper event handling for failed headroom checks  (was: Handle events when headroom checks fail on a per-request basis)

> Proper event handling for failed headroom checks
> ------------------------------------------------
>
>                 Key: YUNIKORN-2370
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2370
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: core - scheduler
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>
> Currently, we have this code inside {{Application.tryAllocate()}} (some lines 
> removed for clarity):
> {noformat}
> func (sa *Application) tryAllocate(headRoom *resources.Resource, allowPreemption bool, preemptionDelay time.Duration, preemptAttemptsRemaining *int, nodeIterator func() NodeIterator, fullNodeIterator func() NodeIterator, getNodeFn func(string) *Node) *Allocation {
>       ...
>       userHeadroom := ugm.GetUserManager().Headroom(sa.queuePath, sa.ApplicationID, sa.user)
>       // get all the requests from the app sorted in order
>       for _, request := range sa.sortedRequests {
>               ...
>               if !userHeadroom.FitInMaxUndef(request.GetAllocatedResource()) {
>                       continue
>               }
>               // resource must fit in headroom otherwise skip the request (unless preemption could help)
>               if !headRoom.FitInMaxUndef(request.GetAllocatedResource()) {
>                       // attempt preemption
>                       if allowPreemption && *preemptAttemptsRemaining > 0 {
>                               ...
>                       }
>                       sa.appEvents.sendAppDoesNotFitEvent(request, headRoom)  // <--- event
>                       continue
>               }
> {noformat}
> There are issues with this approach:
> 1. We say "the application doesn't fit" while it's really the request that 
> doesn't fit.
> 2. If there's no quota at all, then a request gets its own event, but the 
> rest don't.
> Suggested approach:
> 1. Have a per-request event.
> 2. When an event is sent (e.g. failed user headroom) for a given request, 
> remember it and don't send it again.
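
The suggested approach could be sketched roughly as follows. This is only an illustration of "send a per-request event once and remember it", not the actual YuniKorn change: requestEventTracker, headroomKind, sendOnce and the emit callback are hypothetical names that do not exist in the code base, and the sketch assumes the caller already holds whatever lock protects the application.

```go
package main

import "fmt"

// headroomKind says which headroom check a request failed.
// Hypothetical type, used only for this sketch.
type headroomKind int

const (
	userHeadroom headroomKind = iota
	queueHeadroom
)

// String makes the kind readable in event messages.
func (k headroomKind) String() string {
	if k == userHeadroom {
		return "user"
	}
	return "queue"
}

// eventKey identifies one (request, failed check) pair.
type eventKey struct {
	allocationKey string
	kind          headroomKind
}

// requestEventTracker remembers which per-request events were already sent.
// Not safe for concurrent use; the caller is assumed to hold a lock.
type requestEventTracker struct {
	sent map[eventKey]bool
	emit func(msg string) // stand-in for the real event system
}

func newRequestEventTracker(emit func(msg string)) *requestEventTracker {
	return &requestEventTracker{
		sent: make(map[eventKey]bool),
		emit: emit,
	}
}

// sendOnce emits an event the first time a given request fails a given
// check and is a no-op afterwards. It reports whether an event was emitted.
func (t *requestEventTracker) sendOnce(allocationKey string, kind headroomKind) bool {
	key := eventKey{allocationKey: allocationKey, kind: kind}
	if t.sent[key] {
		return false
	}
	t.sent[key] = true
	t.emit(fmt.Sprintf("request %s does not fit in %s headroom", allocationKey, kind))
	return true
}
```

In the loop quoted above, both failed checks (user headroom and queue headroom) would call something like sendOnce with the request's key before the continue, so every request gets its own event and repeated scheduling cycles do not resend it.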



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
