[ https://issues.apache.org/jira/browse/FLINK-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen closed FLINK-4543.
-------------------------------

> Race Deadlock in SpilledSubpartitionViewTest
> --------------------------------------------
>
>                 Key: FLINK-4543
>                 URL: https://issues.apache.org/jira/browse/FLINK-4543
>             Project: Flink
>          Issue Type: Improvement
>          Components: Network
>    Affects Versions: 1.1.2
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.2.0
>
>
> The test deadlocked (Java-level deadlock) with the following stack traces:
> {code}
> Found one Java-level deadlock:
> =============================
> "pool-1-thread-2":
>   waiting to lock monitor 0x00007fec2c006168 (object 0x00000000ef661c20, a java.lang.Object),
>   which is held by "IOManager reader thread #1"
> "IOManager reader thread #1":
>   waiting to lock monitor 0x00007fec2c005ea8 (object 0x00000000ef62c8a8, a java.lang.Object),
>   which is held by "pool-1-thread-2"
> Java stack information for the threads listed above:
> ===================================================
> "pool-1-thread-2":
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.notifyError(SpilledSubpartitionViewAsyncIO.java:309)
>         - waiting to lock <0x00000000ef661c20> (a java.lang.Object)
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.onAvailableBuffer(SpilledSubpartitionViewAsyncIO.java:261)
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.access$300(SpilledSubpartitionViewAsyncIO.java:42)
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$BufferProviderCallback.onEvent(SpilledSubpartitionViewAsyncIO.java:380)
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$BufferProviderCallback.onEvent(SpilledSubpartitionViewAsyncIO.java:366)
>         at org.apache.flink.runtime.io.network.util.TestPooledBufferProvider$PooledBufferProviderRecycler.recycle(TestPooledBufferProvider.java:135)
>         - locked <0x00000000ef62c8a8> (a java.lang.Object)
>         at org.apache.flink.runtime.io.network.buffer.Buffer.recycle(Buffer.java:118)
>         - locked <0x00000000ef9597c0> (a java.lang.Object)
>         at org.apache.flink.runtime.io.network.util.TestConsumerCallback$RecyclingCallback.onBuffer(TestConsumerCallback.java:72)
>         at org.apache.flink.runtime.io.network.util.TestSubpartitionConsumer.call(TestSubpartitionConsumer.java:87)
>         at org.apache.flink.runtime.io.network.util.TestSubpartitionConsumer.call(TestSubpartitionConsumer.java:39)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> "IOManager reader thread #1":
>         at org.apache.flink.runtime.io.network.util.TestPooledBufferProvider$PooledBufferProviderRecycler.recycle(TestPooledBufferProvider.java:126)
>         - waiting to lock <0x00000000ef62c8a8> (a java.lang.Object)
>         at org.apache.flink.runtime.io.network.buffer.Buffer.recycle(Buffer.java:118)
>         - locked <0x00000000efa016f0> (a java.lang.Object)
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.returnBufferFromIOThread(SpilledSubpartitionViewAsyncIO.java:275)
>         - locked <0x00000000ef661c20> (a java.lang.Object)
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO.access$100(SpilledSubpartitionViewAsyncIO.java:42)
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$IOThreadCallback.requestSuccessful(SpilledSubpartitionViewAsyncIO.java:343)
>         at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionViewAsyncIO$IOThreadCallback.requestSuccessful(SpilledSubpartitionViewAsyncIO.java:333)
>         at org.apache.flink.runtime.io.disk.iomanager.AsynchronousFileIOChannel.handleProcessedBuffer(AsynchronousFileIOChannel.java:199)
>         at org.apache.flink.runtime.io.disk.iomanager.BufferReadRequest.requestDone(AsynchronousFileIOChannel.java:435)
>         at org.apache.flink.runtime.io.disk.iomanager.IOManagerAsync$ReaderThread.run(IOManagerAsync.java:408)
> Found 1 deadlock.
> {code}
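
The trace shows a lock-order inversion. "pool-1-thread-2" holds the monitor taken in PooledBufferProviderRecycler.recycle (<0x00000000ef62c8a8>) and then needs the one taken in SpilledSubpartitionViewAsyncIO (<0x00000000ef661c20>), while "IOManager reader thread #1" holds the latter in returnBufferFromIOThread and, by recycling a buffer, needs the former. A minimal sketch of that pattern follows. It is not Flink code: the class, the two ReentrantLock fields, and the method names are hypothetical stand-ins, and tryLock with a timeout replaces the synchronized blocks so that the demonstration terminates instead of hanging.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderInversionSketch {

    // Hypothetical stand-ins for the two monitors named in the trace.
    static final ReentrantLock viewLock = new ReentrantLock(); // plays <0x00000000ef661c20>
    static final ReentrantLock poolLock = new ReentrantLock(); // plays <0x00000000ef62c8a8>

    // Takes 'first', waits until the other thread also holds its first lock,
    // then tries 'second' with a timeout. Returns true if 'second' was obtained.
    static boolean acquireInOrder(ReentrantLock first, ReentrantLock second,
                                  CountDownLatch bothHoldFirst, CountDownLatch bothTried)
            throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            boolean gotSecond = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                second.unlock();
            }
            // Keep holding 'first' until the other thread has finished its attempt,
            // so neither attempt can succeed because of an early release.
            bothTried.countDown();
            bothTried.await();
            return gotSecond;
        } finally {
            first.unlock();
        }
    }

    // Runs the two opposite acquisition orders concurrently.
    // Returns true if neither thread could obtain its second lock.
    static boolean inversionBlocksBothThreads() throws Exception {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Same order as "pool-1-thread-2": pool lock first, then view lock.
            Callable<Boolean> consumerTask =
                    () -> acquireInOrder(poolLock, viewLock, bothHoldFirst, bothTried);
            // Same order as "IOManager reader thread #1": view lock first, then pool lock.
            Callable<Boolean> ioThreadTask =
                    () -> acquireInOrder(viewLock, poolLock, bothHoldFirst, bothTried);
            Future<Boolean> consumer = pool.submit(consumerTask);
            Future<Boolean> ioThread = pool.submit(ioThreadTask);
            return !consumer.get() && !ioThread.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("both threads blocked: " + inversionBlocksBothThreads());
    }
}
```

With plain synchronized blocks the two threads would wait on each other forever, which is what the dump reports. The usual ways out of this pattern are to acquire the two locks in one agreed order everywhere, or to avoid calling into foreign code (here, a buffer recycler or a callback) while holding one's own lock.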



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
