[ 
https://issues.apache.org/jira/browse/FLINK-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703114#comment-14703114
 ] 

Ufuk Celebi commented on FLINK-2540:
------------------------------------

I'm debugging this. From what I can tell so far, it is not a bug in the local 
buffer pool. The stack trace shows that the call is a blocking request, which 
is why it keeps looping: it waits until a buffer becomes available.

The actual problem seems to be that the buffers are not consumed/recycled at 
some later point, so the blocking request never gets one. I'm trying to figure 
out where that happens now.
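
To illustrate why a blocking request can look like an infinite loop, here is a 
minimal sketch. It is not Flink's LocalBufferPool; the class SketchBufferPool, 
its methods, and the timeout parameter are made up for this example. The fields 
requested and poolSize only loosely mirror numberOfRequestedMemorySegments and 
currentPoolSize from the issue description.

```java
import java.util.ArrayDeque;

// Hypothetical, simplified pool for illustration only (not Flink's code).
class SketchBufferPool {
    private final ArrayDeque<byte[]> available = new ArrayDeque<>();
    private final int poolSize;   // upper bound on buffers handed out
    private int requested = 0;    // buffers created so far

    SketchBufferPool(int poolSize) {
        this.poolSize = poolSize;
    }

    // Blocking request. A real blocking call would wait indefinitely; the
    // timeout is an assumption added here so the sketch can terminate.
    // Returns null if no buffer was recycled before the timeout elapsed.
    synchronized byte[] requestBufferBlocking(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (available.isEmpty()) {
            if (requested < poolSize) {
                // Pool not exhausted yet: hand out a fresh buffer.
                requested++;
                return new byte[32];
            }
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return null; // nobody recycled a buffer in time
            }
            try {
                // Pool exhausted: wait for recycle() to notify us.
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return available.poll();
    }

    // Returns a buffer to the pool and wakes up waiting requesters.
    synchronized void recycle(byte[] buffer) {
        available.add(buffer);
        notifyAll();
    }
}
```

With a pool of size 2, a third request only returns a buffer once one of the 
first two has been passed to recycle(). If no consumer ever recycles, the 
requester stays in the while loop, which matches the symptom in the report.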

> LocalBufferPool.requestBuffer gets into infinite loop
> -----------------------------------------------------
>
>                 Key: FLINK-2540
>                 URL: https://issues.apache.org/jira/browse/FLINK-2540
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>            Reporter: Gabor Gevay
>            Assignee: Ufuk Celebi
>            Priority: Blocker
>             Fix For: 0.10, 0.9.1
>
>
> I'm trying to run a complicated computation that looks like this: [1].
> One of the DataSource->Filter->Map chains finishes fine, but the other one 
> freezes. Debugging shows that it is spinning in the while loop in 
> LocalBufferPool.requestBuffer.
> askToRecycle is false. Both numberOfRequestedMemorySegments and 
> currentPoolSize are 128, so it never enters that if branch either.
> This is a stack trace: [2]
> And here is the code, if you would like to run it: [3]. Unfortunately, I 
> can't make it more minimal, because if I remove some operators, the problem 
> disappears. The class to start is malom.Solver. (On first run, it calculates 
> some lookup tables for a few minutes, and puts them into /tmp/movegen)
> [1] http://compalg.inf.elte.hu/~ggevay/flink/plan.txt
> [2] http://compalg.inf.elte.hu/~ggevay/flink/stacktrace.txt
> [3] https://github.com/ggevay/flink/tree/deadlock-malom



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
