[ https://issues.apache.org/jira/browse/CASSANDRA-19598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842228#comment-17842228 ]
Bret McGuire commented on CASSANDRA-19598:
------------------------------------------

No worries [~shot_up], you're good... and thanks for bringing it to my attention!

> advanced.resolve-contact-points: unresolved hostname being clobbered during
> reconnection
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19598
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19598
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Client/java-driver
>            Reporter: Andrew Orlowski
>            Priority: Normal
>         Attachments: image-2024-04-29-20-13-56-161.png,
>                      image-2024-04-29-20-40-53-382.png
>
>
> Hello, this is a bug ticket for 4.18.0 of the Java driver.
>
> I am running in an environment where I have 3 Cassandra nodes. We have a use
> case to redeploy the cluster from the ground up at midnight every day. This
> means that all 3 nodes become unavailable for a short period of time and 3
> new nodes with 3 new ip addresses get spun up and placed behind the contact
> point hostname. If you set {{advanced.resolve-contact-points}} to FALSE, the
> java driver should re-resolve the hostname for every new connection to that
> node. This occurs prior to and for the first redeployment, but the unresolved
> hostname is clobbered during the reconnection process and replaced with a
> resolved IP address, making additional redeployments fruitless. We provide a
> singular hostname as a contact point.
>
> In our case, what is happening is that all 3 nodes become unavailable while
> our CICD process is destroying the existing cluster and replacing it with a
> new one. During the window of unavailability, the Java driver attempts to
> reconnect to each node, two of which internally (internal to the driver) have
> resolved IP addresses and one of which retains the unresolved hostname. Here
> is a screenshot that captures the internal state of the 3 nodes within
> `PoolManager` prior to the finished redeployment of the cluster. Note that
> there are 2 resolved IP addresses and 1 unresolved hostname.
>
> !image-2024-04-29-20-13-56-161.png|width=985,height=181!
>
> This ratio of resolved IP:unresolved hostname is the correct internal state
> for a 3 node cluster when `advanced.resolve-contact-points` is set to `FALSE`.
> Eventually, the hostname points to one of the 3 new valid nodes, and the java
> driver reconnects and discovers the new peers. However, as part of this
> reconnection process, the internal Node that held the unresolved hostname is
> now overwritten with a Node that has the resolved IP address:
>
> !image-2024-04-29-20-40-53-382.png|width=1080,height=102!
>
> Note that we no longer have 2 resolved IP addresses and 1 unresolved
> hostname; rather, we have 3 resolved IP addresses, which is an incorrect
> internal state when `advanced.resolve-contact-points` is set to `FALSE`. One
> of the nodes should have retained the unresolved hostname.
>
> At this stage, the Java driver no longer queries the hostname for new
> connections, and further redeployments of ours result in failure because the
> hostname is no longer amongst the list of nodes that are queried for
> reconnection. This causes us to need to restart the application.
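
For readers unfamiliar with the distinction the report relies on, here is a minimal sketch using only `java.net.InetSocketAddress` from the JDK. It is not the driver's code: the class `ContactPointSketch`, its method names, and the example addresses are hypothetical, chosen only to illustrate the difference between an address that is stored unresolved (and so can be re-resolved on every connection attempt) and one that has been pinned to an IP. The last helper counts how many stored endpoints still carry a hostname, which mirrors the "2 resolved, 1 unresolved" versus "3 resolved" states described in the ticket.

```java
import java.net.InetSocketAddress;
import java.util.List;

// Hypothetical illustration only; this is not the Java driver's implementation.
final class ContactPointSketch {

    // Store the contact point without resolving it, roughly what one would
    // expect when advanced.resolve-contact-points is false.
    static InetSocketAddress unresolvedContactPoint(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    // Resolve a fresh address for one connection attempt. The stored address
    // is immutable and stays unresolved, so the next attempt resolves again.
    static InetSocketAddress resolveForConnection(InetSocketAddress stored) {
        return stored.isUnresolved()
                ? new InetSocketAddress(stored.getHostString(), stored.getPort())
                : stored;
    }

    // Count endpoints that still hold a hostname rather than a fixed IP.
    static long countUnresolved(List<InetSocketAddress> endpoints) {
        return endpoints.stream().filter(InetSocketAddress::isUnresolved).count();
    }

    public static void main(String[] args) {
        // State described as correct in the report: one hostname, two peer IPs.
        // "cassandra.example.internal" is a made-up hostname.
        List<InetSocketAddress> expected = List.of(
                unresolvedContactPoint("cassandra.example.internal", 9042),
                new InetSocketAddress("10.0.0.2", 9042),
                new InetSocketAddress("10.0.0.3", 9042));

        // State described as incorrect: the hostname entry was replaced by an IP.
        List<InetSocketAddress> clobbered = List.of(
                new InetSocketAddress("10.0.0.1", 9042),
                new InetSocketAddress("10.0.0.2", 9042),
                new InetSocketAddress("10.0.0.3", 9042));

        System.out.println("unresolved entries (expected state): " + countUnresolved(expected));
        System.out.println("unresolved entries (clobbered state): " + countUnresolved(clobbered));
    }
}
```

Once no stored endpoint is unresolved, nothing in this model would ever consult DNS again, which matches the symptom reported after the second redeployment.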