[ https://issues.apache.org/jira/browse/FLINK-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703114#comment-14703114 ]
Ufuk Celebi commented on FLINK-2540:
------------------------------------

Debugging this. From what I can tell so far, it is not a bug in the local buffer pool. The stack trace shows that the call is a blocking request, which is why there is an infinite loop. The problem is that the buffers do not seem to be consumed/recycled at some later point. I'm trying to figure out where that happens now.

> LocalBufferPool.requestBuffer gets into infinite loop
> -----------------------------------------------------
>
>                 Key: FLINK-2540
>                 URL: https://issues.apache.org/jira/browse/FLINK-2540
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>            Reporter: Gabor Gevay
>            Assignee: Ufuk Celebi
>            Priority: Blocker
>             Fix For: 0.10, 0.9.1
>
>
> I'm trying to run a complicated computation that looks like this: [1].
> One of the DataSource->Filter->Map chains finishes fine, but the other one
> freezes. Debugging shows that it is spinning in the while loop in
> LocalBufferPool.requestBuffer.
> askToRecycle is false. Both numberOfRequestedMemorySegments and
> currentPoolSize are 128, so it never goes into that if either.
> This is a stack trace: [2]
> And here is the code, if you would like to run it: [3]. Unfortunately, I
> can't make it more minimal, because if I remove some operators, the problem
> disappears. The class to start is malom.Solver. (On first run, it calculates
> some lookup tables for a few minutes, and puts them into /tmp/movegen)
> [1] http://compalg.inf.elte.hu/~ggevay/flink/plan.txt
> [2] http://compalg.inf.elte.hu/~ggevay/flink/stacktrace.txt
> [3] https://github.com/ggevay/flink/tree/deadlock-malom

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
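
For readers unfamiliar with the pattern discussed in the comment, here is a rough sketch of why a blocking buffer request keeps looping when the pool is exhausted and nothing is recycled. This is a toy model only, not Flink's LocalBufferPool: the class ToyBufferPool, its methods, the buffer size, and the timeout parameter are all hypothetical and exist just to keep the example terminating. The counters only loosely mirror numberOfRequestedMemorySegments and currentPoolSize from the report.

```java
import java.util.ArrayDeque;

/** Toy pool for illustration only; NOT Flink's LocalBufferPool. All names are hypothetical. */
class ToyBufferPool {
    private final ArrayDeque<byte[]> available = new ArrayDeque<>();
    private final int poolSize;   // loosely analogous to currentPoolSize
    private int requested = 0;    // loosely analogous to numberOfRequestedMemorySegments

    ToyBufferPool(int poolSize) {
        this.poolSize = poolSize;
    }

    /**
     * Blocking request: loops until a buffer is available.
     * Returns null if the (hypothetical) timeout elapses or the thread is interrupted.
     */
    synchronized byte[] requestBlocking(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (available.isEmpty()) {
            if (requested < poolSize) {
                // The pool may still hand out a fresh buffer.
                requested++;
                return new byte[32];
            }
            // Pool exhausted: the only way out is a recycle() from a consumer.
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return null;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return available.poll();
    }

    /** Returns a buffer to the pool and wakes up any blocked requester. */
    synchronized void recycle(byte[] buffer) {
        available.add(buffer);
        notifyAll();
    }
}

class ToyBufferPoolDemo {
    public static void main(String[] args) {
        ToyBufferPool pool = new ToyBufferPool(1);
        byte[] first = pool.requestBlocking(100);
        // Nothing has been recycled, so this request gives up after the timeout.
        System.out.println("second request timed out: " + (pool.requestBlocking(100) == null));
        pool.recycle(first);
        // After a recycle, the same buffer is handed out again.
        System.out.println("got recycled buffer: " + (pool.requestBlocking(100) == first));
    }
}
```

Without the timeout, the second request in the demo would wait forever. That matches the reported symptom: requested equals the pool size, so no fresh buffer is created, and the loop can only exit once some consumer recycles a buffer.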