[ https://issues.apache.org/jira/browse/CASSANDRA-18560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams reassigned CASSANDRA-18560:
--------------------------------------------

    Assignee: Brandon Williams

> Incorrect IP used for gossip across DCs with prefer_local=true
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-18560
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18560
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip
>            Reporter: Brad Vernon
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.x
>
>
> After installing a new node using 4.0.10, we experienced a situation where the
> new node attempted to connect to the private IP of a random number of nodes
> in remote DCs, which are only accessible via public IP for cross-DC
> communications.
> Only the new node's outbound connections were affected; inbound connections
> from pre-4.0.10 nodes were not. system.peers_v2 (below) showed preferred_ip
> and preferred_port as null for these peers; only peers in the 4.0.10 node's
> own DC had preferred_ip values, as expected.
> We believe the issue originated with
> https://issues.apache.org/jira/browse/CASSANDRA-16718
> Details on cluster:
> * All nodes have a public IP configured as well as a private IP
> * Listen/rpc addresses are configured for the private IP, broadcast is the public IP
> * prefer_local=true is enabled for all nodes
> The log that showed the connection failing:
> {code:java}
> INFO [Messaging-EventLoop-3-8] 2023-06-01 00:14:21,565 NoSpamLogger.java:92 - /99.81.<redacted>:7000->/44.208.<redacted>:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.ConnectTimeoutException: connection timed out: /10.26.5.11:7000
>     at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:576){code}
> The 99.81.x and 44.208.x instances can only reach each other using public IPs.
> gossipinfo output from 4.0.10 node:
> {code:java}
> /44.208.<redacted>
>   generation:1661113358
>   heartbeat:25267691
>   LOAD:25267683:1.7882044268E10
>   SCHEMA:24692061:e98b918d-499f-3ccc-8dbe-5af31f685bda
>   DC:13:us-east-1
>   RACK:15:1a
>   RELEASE_VERSION:6:4.0.5
>   NET_VERSION:2:12
>   HOST_ID:3:9a41e668-060d-4cfe-bb1e-013f5116422d
>   RPC_READY:1407:true
>   INTERNAL_ADDRESS_AND_PORT:9:10.26.5.11:7000
>   NATIVE_ADDRESS_AND_PORT:4:44.208.<redacted>:9042
>   STATUS_WITH_PORT:1393:NORMAL,-2262036356854762881
>   SSTABLE_VERSIONS:7:big-nb
>   TOKENS:1392:<hidden> {code}
> Peers output from 4.0.10 node:
> {code:java}
>  peer              | peer_port | data_center | host_id                              | native_address    | native_port | preferred_ip | preferred_port | rack | release_version | schema_version                       | tokens
> -------------------+-----------+-------------+--------------------------------------+-------------------+-------------+--------------+----------------+------+-----------------+--------------------------------------+--------
>  44.208.<redacted> | 7000      | us-east-1   | 9a41e668-060d-4cfe-bb1e-013f5116422d | 44.208.<redacted> | 9042        | null         | null           | 1a   | 4.0.5           | e98b918d-499f-3ccc-8dbe-5af31f685bda | {'-2262036356854762881', '-4197710115038136897', '-7072386316096662315', '2085255826742630980', '249732489387853170', '4976300208126705818', '7187184456885833289', '8777189009399731927'} {code}
> As a temporary workaround, we used iptables to redirect outbound traffic
> destined for the private IPs to the corresponding public IPs, after which
> outbound connections succeeded.
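
For reference, the address selection the reporter appears to expect from prefer_local=true can be sketched as follows. This is only an illustrative model, not Cassandra's code: the `Peer` class, the `expected_connect_address` function, and the DC name `local-dc` are hypothetical names introduced here, and the rule assumed is the documented intent of prefer_local (use the internal address only for peers in the same datacenter).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Peer:
    """Hypothetical view of a peer's gossip state (not a Cassandra type)."""
    broadcast_address: str           # public address advertised to other DCs
    internal_address: Optional[str]  # INTERNAL_ADDRESS_AND_PORT from gossip
    datacenter: str                  # DC from gossip


def expected_connect_address(local_dc: str, peer: Peer, prefer_local: bool) -> str:
    """Return the address a node would be expected to dial for this peer.

    Assumed rule: the internal (private) address is used only when
    prefer_local is on AND the peer is in the caller's own datacenter.
    """
    same_dc = peer.datacenter == local_dc
    if prefer_local and same_dc and peer.internal_address:
        return peer.internal_address
    return peer.broadcast_address


# Values taken from the gossipinfo output in the report; "local-dc" stands in
# for the new node's datacenter, which the report does not name.
remote_peer = Peer(
    broadcast_address="44.208.<redacted>:7000",
    internal_address="10.26.5.11:7000",
    datacenter="us-east-1",
)
print(expected_connect_address("local-dc", remote_peer, prefer_local=True))
```

Under this model the cross-DC peer resolves to its public broadcast address, whereas the log in the report shows the 4.0.10 node dialing 10.26.5.11:7000 instead.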