[ 
https://issues.apache.org/jira/browse/CASSANDRA-16677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376305#comment-17376305
 ] 

Berenguer Blasi commented on CASSANDRA-16677:
---------------------------------------------

The problem comes from trying to flush against an already closed connecttion. 
Adding some stacks for future reference:

{noformat}
[junit-timeout] INFO  [Messaging-EventLoop-3-7] 2021-07-06 12:17:23,858 
InboundConnectionInitiator.java:464 - 
/127.0.0.1:7012(/127.0.0.1:49576)->/127.0.0.1:7012-LARGE_MESSAGES-6cbecdc7 
messaging connection established, version = 12, framing = LZ4, encryption = 
encrypted(factory=openssl;protocol=TLSv1.3;cipher=TLS_AES_128_GCM_SHA256)
[junit-timeout] INFO  [main] 2021-07-06 12:17:23,865 
InboundConnectionInitiator.java:127 - Listening on address: (/127.0.0.1:17012), 
nic: lo, encryption: encrypted(openssl)
[junit-timeout] INFO  [main] 2021-07-06 12:17:23,865 
InboundConnectionInitiator.java:127 - Listening on address: (/127.0.0.1:7012), 
nic: lo, encryption: optionally encrypted(openssl)
[junit-timeout] WARN  
[Messaging-OUT-/127.0.0.1:7012->/127.0.0.1:7012-LARGE_MESSAGES] 2021-07-06 
12:17:23,868 OutboundConnection.java:488 - 
/127.0.0.1:7012->/127.0.0.1:7012-LARGE_MESSAGES-[no-channel] dropping message 
of type _TEST_1 due to error
[junit-timeout] org.apache.cassandra.net.AsyncChannelOutputPlus$FlushException: 
This output stream is in an unsafe state after an asynchronous flush failed
[junit-timeout]         at 
org.apache.cassandra.net.AsyncChannelOutputPlus.propagateFailedFlush(AsyncChannelOutputPlus.java:201)
[junit-timeout]         at 
org.apache.cassandra.net.AsyncChannelOutputPlus.waitUntilFlushed(AsyncChannelOutputPlus.java:158)
[junit-timeout]         at 
org.apache.cassandra.net.AsyncChannelOutputPlus.flush(AsyncChannelOutputPlus.java:230)
[junit-timeout]         at 
org.apache.cassandra.net.AsyncChannelOutputPlus.close(AsyncChannelOutputPlus.java:248)
[junit-timeout]         at 
org.apache.cassandra.net.AsyncMessageOutputPlus.close(AsyncMessageOutputPlus.java:110)
[junit-timeout]         at 
org.apache.cassandra.net.OutboundConnection$LargeMessageDelivery.doRun(OutboundConnection.java:986)
[junit-timeout]         at 
org.apache.cassandra.net.OutboundConnection$Delivery.run(OutboundConnection.java:687)
[junit-timeout]         at 
org.apache.cassandra.net.OutboundConnection$LargeMessageDelivery.run(OutboundConnection.java:955)
[junit-timeout]         at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[junit-timeout]         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[junit-timeout]         at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[junit-timeout]         at java.lang.Thread.run(Thread.java:748)
[junit-timeout] Caused by: io.netty.handler.ssl.SslClosedEngineException: 
SSLEngine closed already
[junit-timeout]         at 
io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:854)
[junit-timeout]         at 
io.netty.handler.ssl.SslHandler.wrapAndFlush(SslHandler.java:811)
[junit-timeout]         at 
io.netty.handler.ssl.SslHandler.flush(SslHandler.java:792)
[junit-timeout]         at 
io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:750)
[junit-timeout]         at 
io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:742)
[junit-timeout]         at 
io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:728)
[junit-timeout]         at 
io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:125)
[junit-timeout]         at 
io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:750)
[junit-timeout]         at 
io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:765)
[junit-timeout]         at 
io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1071)
[junit-timeout]         at 
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
[junit-timeout]         at 
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
[junit-timeout]         at 
io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)
[junit-timeout]         at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
[junit-timeout]         at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
[junit-timeout]         ... 2 common frames omitted
{noformat}

{noformat}
[junit-timeout] org.apache.cassandra.net.AsyncChannelOutputPlus$FlushException: 
The channel this output stream was writing to has been closed
[junit-timeout]         at 
org.apache.cassandra.net.AsyncChannelOutputPlus.propagateFailedFlush(AsyncChannelOutputPlus.java:200)
[junit-timeout]         at 
org.apache.cassandra.net.AsyncChannelOutputPlus.waitUntilFlushed(AsyncChannelOutputPlus.java:158)
[junit-timeout]         at 
org.apache.cassandra.net.AsyncChannelOutputPlus.flush(AsyncChannelOutputPlus.java:230)
[junit-timeout]         at 
org.apache.cassandra.net.AsyncChannelOutputPlus.close(AsyncChannelOutputPlus.java:248)
[junit-timeout]         at 
org.apache.cassandra.net.AsyncMessageOutputPlus.close(AsyncMessageOutputPlus.java:110)
[junit-timeout]         at 
org.apache.cassandra.net.OutboundConnection$LargeMessageDelivery.doRun(OutboundConnection.java:986)
[junit-timeout]         at 
org.apache.cassandra.net.OutboundConnection$Delivery.run(OutboundConnection.java:687)
[junit-timeout]         at 
org.apache.cassandra.net.OutboundConnection$LargeMessageDelivery.run(OutboundConnection.java:955)
[junit-timeout]         at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[junit-timeout]         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[junit-timeout]         at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[junit-timeout]         at java.lang.Thread.run(Thread.java:748)
[junit-timeout] Caused by: io.netty.channel.StacklessClosedChannelException: 
null
[junit-timeout]         at 
io.netty.channel.AbstractChannel$AbstractUnsafe.write(Object, 
ChannelPromise)(Unknown Source)
{noformat}

Seems like waiting on {{AsyncPromise.isDone()}} on {{outbound.close}} fixes it 
but idk why yet.

> Fix flaky ConnectionTest
> ------------------------
>
>                 Key: CASSANDRA-16677
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16677
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Internode
>            Reporter: Brandon Williams
>            Assignee: Berenguer Blasi
>            Priority: Normal
>             Fix For: 4.0, 4.0-rc
>
>
> https://ci-cassandra.apache.org/job/Cassandra-devbranch/785/testReport/junit/org.apache.cassandra.net/ConnectionTest/testMessageDeliveryOnReconnect_compression/
> https://app.circleci.com/pipelines/github/adelapena/cassandra/460/workflows/cf7dcec6-612c-45d1-8471-623bde481dca/jobs/4069
> https://app.circleci.com/pipelines/github/adelapena/cassandra/460/workflows/b750cd38-0263-4b5e-9bb8-a8be98214378/jobs/4065



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to