[ 
https://issues.apache.org/jira/browse/FLINK-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16396761#comment-16396761
 ] 

ASF GitHub Bot commented on FLINK-8934:
---------------------------------------

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5687#discussion_r174060373
  
    --- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/jobmaster/slotpool/SlotPoolTest.java
 ---
    @@ -505,17 +513,21 @@ public void 
testFulfillingSlotRequestsWithUnusedOfferedSlots() throws Exception
                        } catch (ExecutionException ee) {
                                // expected
                                
assertTrue(ExceptionUtils.stripExecutionException(ee) instanceof 
FlinkException);
    -
                        }
     
    -                   final SlotOffer slotOffer = new SlotOffer(allocationId, 
0, ResourceProfile.UNKNOWN);
    +                   assertEquals(allocationId1, 
canceledSlotRequests.take());
    +
    +                   final SlotOffer slotOffer = new 
SlotOffer(allocationId1, 0, ResourceProfile.UNKNOWN);
     
                        
slotPoolGateway.registerTaskManager(taskManagerLocation.getResourceID()).get();
     
                        
assertTrue(slotPoolGateway.offerSlot(taskManagerLocation, taskManagerGateway, 
slotOffer).get());
     
                        // the slot offer should fulfill the second slot request
    -                   assertEquals(allocationId, 
slotFuture2.get().getAllocationId());
    +                   assertEquals(allocationId1, 
slotFuture2.get().getAllocationId());
    +
    +                   // check that the second slot request has been canceled
    --- End diff --
    
    Let's see if i understood the scenario here correctly:
    
    We request the allocation of 2 slots. We cancel the first allocation 
request, but a TaskManager has already offered a slot to fulfill it.
    We now have one pending allocation request and one offered slot, so we 
re-use the slot for the second request, which we can do since both requests 
were for the same job with the same resource requirements.
    
    I think we can improve the wording a bit though, as this comment here says 
that the second request has been canceled, when just above it was fulfilled. I 
guess it should say that the slot _acquisition_ (i.e. the retrieval of slots 
from the RM) has been canceled. (also applies to the canceledSlotRequests 
variable)


> Cancel slot requests for otherwisely fulfilled requests
> -------------------------------------------------------
>
>                 Key: FLINK-8934
>                 URL: https://issues.apache.org/jira/browse/FLINK-8934
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Critical
>              Labels: flip-6
>             Fix For: 1.5.0, 1.6.0
>
>
> If a slot request is fulfilled with a different allocation id, then we should 
> cancel the other slot request at the {{ResourceManager}}. Otherwise we might 
> have some stale slot requests which first need to time out before slots are 
> available again.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to