[ 
https://issues.apache.org/jira/browse/CASSANDRA-19696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kan Maung updated CASSANDRA-19696:
----------------------------------
    Component/s: Cluster/Gossip

> Observed large number of Inbound / Outbound connection disconnect / 
> reconnects in log
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19696
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19696
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Kan Maung
>            Priority: Normal
>
> We are seeing hundreds of InboundConnection established / closed messages on 
> several of our clusters running Apache Cassandra 4.0.10. According to 
> 'nodetool tpstats', gossip appears to be running close to its timeout value. 
> The problem affects both the LargeMessage and UrgentMessage connections.
> Gossiper uses MessagingService to send messages from the source to the 
> destination over an OutboundConnection. Handling depends on the message 
> type: LARGE_MESSAGES are enqueued in a separate thread pool, while 
> URGENT_MESSAGES are delivered with Verb.Priority.P0.
> In the example below, the disconnect happens just 20 seconds after the 
> connection was established. These two nodes are in the same datacenter, so 
> there should be no geographical latency between them. This cluster 
> (cluster_id=111) uses a very standard cassandra.yaml for our environment.
>  
> 127.10.20.88 cassandra.log:
> 2024-05-13 02:06:13,805 [INFO ] [Messaging-EventLoop-3-2] cluster_id=111 
> ip_address=127.10.20.88 InboundConnectionInitiator.java:529 - 
> /127.10.30.171:7000(/127.10.30.171:37404)->/127.10.20.88:7000-URGENT_MESSAGES-e039a471
>  messaging connection established, version = 12, framing = CRC, encryption = 
> encrypted(...)
> 2024-05-13 02:06:32,201 [INFO ] [Messaging-EventLoop-3-4] cluster_id=111 
> ip_address=127.10.20.88 OutboundConnection.java:1059 - 
> /127.10.20.88:7000->/169.73.115.189:7000-LARGE_MESSAGES-70634968 channel 
> closed by provider
>  
> 127.10.30.171 log:
> 2024-05-13 02:05:00,300 [INFO ] [Messaging-EventLoop-3-2] cluster_id=111 
> ip_address=127.10.30.171 OutboundConnection.java:1059 - 
> /127.10.30.171:7000->/169.102.147.87:7000-LARGE_MESSAGES-4b3ea69f channel 
> closed by provider
> io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: 
> Connection timed out
> 2024-05-13 02:05:46,892 [INFO ] [Messaging-EventLoop-3-4] cluster_id=111 
> ip_address=127.10.30.171 OutboundConnection.java:1059 - 
> /127.10.30.171:7000->/127.10.20.88:7000-URGENT_MESSAGES-8fd0dbf2 channel 
> closed by provider
> io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: 
> Connection timed out
> 2024-05-13 02:06:13,804 [INFO ] [Messaging-EventLoop-3-4] cluster_id=111 
> ip_address=127.10.30.171 OutboundConnection.java:1153 - 
> /127.10.30.171:7000(/127.10.30.171:37404)->/127.10.20.88:7000-URGENT_MESSAGES-155d9869
>  successfully connected, version = 12, framing = CRC, encryption = 
> encrypted(...)
> 2024-05-13 02:06:24,281 [INFO ] [Messaging-EventLoop-3-4] cluster_id=111 
> ip_address=127.10.30.171 OutboundConnection.java:1153 - 
> /127.10.30.171:7000(/127.10.30.171:50046)->/169.73.137.223:7000-LARGE_MESSAGES-04b51284
>  successfully connected, version = 12, framing = LZ4, encryption = 
> encrypted(...)
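
To put a number on the "hundreds" of connect / close messages described above, the events can be tallied per connection type. The following is only a rough sketch: the regular expression is inferred from the log excerpts in this report and assumes each event sits on a single line in the real log file (the excerpts above are wrapped by e-mail). The function name summarize_connection_events is made up for illustration and is not part of Cassandra.

```python
import re
from collections import Counter

# Pattern inferred from the excerpts in this report (an assumption, not an
# official log format): source file, connection id ending in
# -<CONNECTION_TYPE>-<hex id>, followed by the event text.
EVENT_RE = re.compile(
    r"(?:Inbound|Outbound)Connection\w*\.java:\d+ - "
    r"\S+-(?P<type>[A-Z_]+)-\w+\s+"
    r"(?P<event>messaging connection established"
    r"|successfully connected"
    r"|channel closed by provider)"
)


def summarize_connection_events(lines):
    """Count connect / close events per connection type."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # stack-trace lines and unrelated messages are skipped
        if match["event"].startswith("channel closed"):
            kind = "closed"
        else:
            kind = "connected"
        counts[(match["type"], kind)] += 1
    return counts
```

Feeding it the lines of a node's log file (for example via open(path), where path is wherever cassandra.log lives in a given environment) returns a Counter keyed by (connection type, "connected" or "closed"), which makes it easier to compare URGENT_MESSAGES and LARGE_MESSAGES churn across nodes.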



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

