[ https://issues.apache.org/jira/browse/FLINK-17182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087003#comment-17087003 ]
Yun Gao commented on FLINK-17182:
---------------------------------

This should be caused by the following: when the unfulfilled buffer requirement is satisfied by recycled exclusive buffers, the "waiting on buffer pool" status is not cleared, so the input channel cannot reuse a buffer that it has itself returned to the buffer pool.

A simulated case would be:

Suppose initially both the input channel and the buffer pool have no buffers, which is the case in this test. We also temporarily ignore the initial credit, which does not affect the issue.
# inputChannel.onSenderBacklog(2) is called, which makes the input channel start waiting on the buffer pool.
# One buffer is recycled to the buffer pool and assigned to the input channel.
# One exclusive buffer is recycled and returned to the input channel. With the current implementation the input channel does not stop waiting, even though its available buffers already equal the required number.
# One more exclusive buffer is recycled and returned to the input channel. The input channel now holds one buffer more than required, so it returns one buffer to the buffer pool.
# inputChannel.onSenderBacklog(3) is called, so one more buffer is required. However, since the channel is still marked as waiting on the buffer pool, it cannot request the available buffer from the buffer pool. There is also no other chance to assign this buffer to the input channel.

Thus, one buffer is left in the buffer pool and cannot be assigned to the input channel.
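The five steps above can be replayed with a small counter-based model. This is only a sketch under stated assumptions: the class `BacklogSimulation`, its fields, and the flag `clearWaitingOnExclusiveRecycle` are hypothetical names invented for illustration, not Flink's RemoteInputChannel or buffer pool code, and the "fixed" branch is just one plausible way to clear the waiting status, not necessarily the actual patch. The model also assumes that a buffer the channel returns to the pool is not handed back to that same channel.

```java
// Toy model of the scenario described above. All names are hypothetical;
// this is not Flink's RemoteInputChannel or buffer pool implementation.
class BacklogSimulation {
    private int poolAvailable = 0;      // buffers sitting in the buffer pool
    private int channelAvailable = 0;   // buffers held by the input channel
    private int required = 0;           // buffers the channel currently needs
    private boolean waitingOnPool = false;
    private final boolean clearWaitingOnExclusiveRecycle; // assumed fix toggle

    BacklogSimulation(boolean clearWaitingOnExclusiveRecycle) {
        this.clearWaitingOnExclusiveRecycle = clearWaitingOnExclusiveRecycle;
    }

    // The sender announces its backlog; initial credit is ignored here.
    void onSenderBacklog(int backlog) {
        required = backlog;
        while (channelAvailable < required && !waitingOnPool) {
            if (poolAvailable > 0) {
                poolAvailable--;
                channelAvailable++;
            } else {
                waitingOnPool = true; // pool is empty, start waiting
            }
        }
    }

    // Some other owner recycles a buffer into the pool.
    void recycleToPool() {
        if (waitingOnPool) {
            channelAvailable++; // handed straight to the waiting channel
            if (channelAvailable >= required) {
                waitingOnPool = false;
            }
        } else {
            poolAvailable++;
        }
    }

    // One of the channel's exclusive buffers comes back to the channel.
    void recycleExclusive() {
        channelAvailable++;
        if (channelAvailable > required) {
            // Surplus goes back to the pool; assumed not to be re-offered
            // to the channel that returned it.
            channelAvailable--;
            poolAvailable++;
        }
        if (clearWaitingOnExclusiveRecycle && channelAvailable >= required) {
            waitingOnPool = false; // hypothetical fix: stop waiting once fulfilled
        }
    }

    // Replays steps 1 to 5 and returns {channelAvailable, poolAvailable}.
    static int[] run(boolean clearWaitingOnExclusiveRecycle) {
        BacklogSimulation s = new BacklogSimulation(clearWaitingOnExclusiveRecycle);
        s.onSenderBacklog(2);   // step 1
        s.recycleToPool();      // step 2
        s.recycleExclusive();   // step 3
        s.recycleExclusive();   // step 4
        s.onSenderBacklog(3);   // step 5
        return new int[] {s.channelAvailable, s.poolAvailable};
    }

    public static void main(String[] args) {
        int[] stale = run(false);
        int[] cleared = run(true);
        System.out.println("flag not cleared: channel=" + stale[0] + ", pool=" + stale[1]);
        System.out.println("flag cleared:     channel=" + cleared[0] + ", pool=" + cleared[1]);
    }
}
```

In this model, leaving the flag set ends with two buffers in the channel and one stranded in the pool, while clearing the flag in step 3 lets step 5 pick up the pooled buffer, so the channel reaches the three it requires.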

> RemoteInputChannelTest.testConcurrentOnSenderBacklogAndRecycle fail on azure
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-17182
>                 URL: https://issues.apache.org/jira/browse/FLINK-17182
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>            Reporter: Dawid Wysakowicz
>            Assignee: Yun Gao
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.11.0
>
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=7546&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=d2c1c472-9d7b-5913-b8e4-461f3092fb7a
> {code}
> [ERROR] Tests run: 21, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.943 s <<< FAILURE! - in org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannelTest
> [ERROR] testConcurrentOnSenderBacklogAndRecycle(org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannelTest)  Time elapsed: 0.011 s  <<< FAILURE!
> java.lang.AssertionError: There should be 248 buffers available in channel. expected:<248> but was:<238>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:834)
> 	at org.junit.Assert.assertEquals(Assert.java:645)
> 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannelTest.testConcurrentOnSenderBacklogAndRecycle(RemoteInputChannelTest.java:869)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> 	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)