[jira] [Commented] (FLINK-6434) There may be allocatedSlots leak in SlotPool

Till Rohrmann (JIRA) Fri, 05 May 2017 02:03:21 -0700

    [ https://issues.apache.org/jira/browse/FLINK-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997961#comment-15997961 ]


Till Rohrmann commented on FLINK-6434:
--------------------------------------

Thanks for reporting the issue [~tiemsn]. This sounds like a bug and should be 
fixed.

I think we could solve it the following way: We generate the {{AllocationID}} 
in {{ProviderAndOwner#allocateSlot}} and pass it to 
{{SlotPoolGateway#allocateSlot}}. On the returned future we register an 
exception handler which will call {{SlotPoolGateway#failAllocation}} with the 
generated {{AllocationID}}. That way we should be able to deal with timeouts on 
the {{Execution}} side. What do you think?
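
A minimal sketch of this idea follows, written against plain {{java.util.concurrent.CompletableFuture}} rather than Flink's real classes. Everything in it is an assumption for illustration: {{SlotLeakSketch}}, the simplified {{SlotPool}} with its {{offerSlot}} method, the use of {{UUID}} in place of {{AllocationID}}, and {{String}} in place of a slot. It only shows the shape of the fix: the caller generates the ID, so it can tell the pool which pending request to drop when its future fails.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;

final class SlotLeakSketch {

    /** Hypothetical, single-threaded stand-in for the slot pool (not Flink's SlotPool). */
    static final class SlotPool {
        final Map<UUID, CompletableFuture<String>> pendingRequests = new LinkedHashMap<>();
        final Map<UUID, String> allocatedSlots = new LinkedHashMap<>();

        /** Registers a pending request under the caller-supplied ID. */
        CompletableFuture<String> allocateSlot(UUID allocationId) {
            CompletableFuture<String> pending = new CompletableFuture<>();
            pendingRequests.put(allocationId, pending);
            // Hand out a dependent future so that a caller-side timeout does not
            // touch the pool's own bookkeeping, much like a remote call.
            return pending.thenApply(slot -> slot);
        }

        /** Drops all state the pool holds for the given allocation. */
        void failAllocation(UUID allocationId, Throwable cause) {
            CompletableFuture<String> pending = pendingRequests.remove(allocationId);
            if (pending != null) {
                pending.completeExceptionally(cause);
            }
            allocatedSlots.remove(allocationId);
        }

        /** A newly registered slot fulfils the oldest pending request, if there is one. */
        boolean offerSlot(String slot) {
            Iterator<Map.Entry<UUID, CompletableFuture<String>>> it =
                    pendingRequests.entrySet().iterator();
            if (!it.hasNext()) {
                return false;
            }
            Map.Entry<UUID, CompletableFuture<String>> oldest = it.next();
            it.remove();
            allocatedSlots.put(oldest.getKey(), slot);
            oldest.getValue().complete(slot);
            return true;
        }
    }

    /** Caller side: generate the ID first, then tie failure handling to that ID. */
    static CompletableFuture<String> requestSlot(SlotPool pool) {
        UUID allocationId = UUID.randomUUID();
        CompletableFuture<String> future = pool.allocateSlot(allocationId);
        // If the request fails on the caller side (for example, a timeout),
        // tell the pool to forget the pending request with this ID.
        future.whenComplete((slot, failure) -> {
            if (failure != null) {
                pool.failAllocation(allocationId, failure);
            }
        });
        return future;
    }

    public static void main(String[] args) {
        SlotPool pool = new SlotPool();
        CompletableFuture<String> request = requestSlot(pool);
        // Simulate the timeout observed on the Execution side.
        request.completeExceptionally(new TimeoutException("slot request timed out"));
        System.out.println("pending requests: " + pool.pendingRequests.size());
        System.out.println("late slot accepted: " + pool.offerSlot("slot-1"));
        System.out.println("allocated slots: " + pool.allocatedSlots.size());
    }
}
```

In this sketch, a slot that registers after the timeout finds no pending request to fulfil, so nothing ends up in {{allocatedSlots}}. Without the handler in {{requestSlot}}, the stale request would still be pending and the late slot would be bound to it.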

> There may be allocatedSlots leak in SlotPool
> --------------------------------------------
>
>                 Key: FLINK-6434
>                 URL: https://issues.apache.org/jira/browse/FLINK-6434
>             Project: Flink
>          Issue Type: Bug
>          Components: Cluster Management
>            Reporter: shuai.xu
>            Assignee: shuai.xu
>              Labels: flip-6
>
> If the allocateSlot() call from Execution to SlotPool times out, the job 
> begins to fail over, but the pending request remains in the SlotPool. If a 
> new slot then registers with the SlotPool, it may fulfill the outdated 
> pending request and be added to allocatedSlots, where it will never be used 
> and never be recycled.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
