[ https://issues.apache.org/jira/browse/CASSANDRA-19598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Orlowski updated CASSANDRA-19598:
----------------------------------------
    Description: 
Hello, this is a bug ticket for version 4.18.0 of the Java driver.

 

I am running in an environment with 3 Cassandra nodes. We have a use case that 
requires redeploying the cluster from the ground up at midnight every day. This 
means that all 3 nodes become unavailable for a short period of time, and 3 new 
nodes with 3 new IP addresses get spun up and placed behind the contact point 
hostname. If you set {{advanced.resolve-contact-points}} to {{false}}, the Java 
driver should re-resolve the hostname for every new connection to that node. 
This works prior to and during the first redeployment, but the unresolved 
hostname is then clobbered during the reconnection process and replaced with a 
resolved IP address.
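
For context, here is a minimal sketch of how we configure the driver, assuming programmatic configuration; the hostname, port, and datacenter name below are placeholders rather than our real values:

{code:java}
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;
import java.util.Collections;

public final class SessionFactory {

  public static CqlSession build() {
    // Roughly equivalent to the following application.conf settings:
    //   datastax-java-driver.basic.contact-points = [ "cassandra.internal.example:9042" ]
    //   datastax-java-driver.advanced.resolve-contact-points = false
    DriverConfigLoader loader =
        DriverConfigLoader.programmaticBuilder()
            // Single contact point: the hostname that fronts whichever 3 nodes are
            // currently deployed (hostname and port are placeholders).
            .withStringList(
                DefaultDriverOption.CONTACT_POINTS,
                Collections.singletonList("cassandra.internal.example:9042"))
            // Keep the contact point unresolved so the driver re-resolves the
            // hostname on every new connection attempt to that node.
            .withBoolean(DefaultDriverOption.RESOLVE_CONTACT_POINTS, false)
            .withString(DefaultDriverOption.LOAD_BALANCING_LOCAL_DATACENTER, "dc1") // placeholder
            .build();
    return CqlSession.builder().withConfigLoader(loader).build();
  }
}
{code}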

 

In our case, all 3 nodes become unavailable while our CI/CD process is 
destroying the existing cluster and replacing it with a new one. The Java 
driver attempts to reconnect to each node; internally (within the driver), two 
of them have resolved IP addresses and one retains the unresolved hostname. 
Here is a screenshot that captures the internal state of the 3 nodes within 
{{PoolManager}} prior to the redeployment of the cluster. Note that there are 
2 resolved IP addresses and 1 unresolved hostname.

!image-2024-04-29-20-13-56-161.png!

This ratio of resolved IP addresses to unresolved hostname is the correct 
internal state for a 3-node cluster when {{advanced.resolve-contact-points}} is 
set to {{false}}.
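
For anyone trying to reproduce this, the state above can also be observed from the application side with something like the sketch below. Our screenshots come from a debugger attached to {{PoolManager}}; this is just a rough equivalent using the public metadata API:

{code:java}
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.metadata.Node;

public final class NodeStateDumper {

  /** Prints every node the driver currently knows about, with its endpoint and state. */
  public static void dump(CqlSession session) {
    for (Node node : session.getMetadata().getNodes().values()) {
      // With advanced.resolve-contact-points = false we expect exactly one endpoint
      // to still show the contact-point hostname rather than a resolved IP address.
      System.out.printf(
          "hostId=%s endPoint=%s state=%s%n",
          node.getHostId(), node.getEndPoint(), node.getState());
    }
  }
}
{code}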

Eventually, the hostname points to one of the 3 new valid nodes, and the Java 
driver reconnects and resets the pool. However, as part of this reconnection 
process, the internal {{Node}} that held the unresolved hostname is overwritten 
with a {{Node}} that has a resolved IP address:
!image-2024-04-29-20-40-53-382.png!
Note that we no longer have 2 resolved IP addresses and 1 unresolved hostname; 
instead, we have 3 resolved IP addresses, which is an incorrect internal state 
when {{advanced.resolve-contact-points}} is set to {{false}}.
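
A rough way to express the invariant we expect to survive the pool reset (the hostname argument is a placeholder, and matching on the endpoint's string form is an assumption about how an unresolved endpoint renders):

{code:java}
import com.datastax.oss.driver.api.core.CqlSession;

public final class ContactPointCheck {

  /**
   * Heuristic check: after any reconnection, at least one driver Node should still be
   * keyed by the unresolved contact-point hostname. In our case this starts returning
   * false after the first redeployment, because the hostname has been replaced by a
   * resolved IP address.
   */
  public static boolean hostnameStillPresent(CqlSession session, String contactPointHost) {
    return session.getMetadata().getNodes().values().stream()
        .anyMatch(node -> node.getEndPoint().toString().contains(contactPointHost));
  }
}
{code}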

At this stage, the Java driver no longer queries the hostname for new 
connections, and our subsequent redeployments fail because the hostname is no 
longer among the nodes queried for reconnection. This forces us to restart the 
application (sketched below) to recover.
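
Today we restart the whole application; in principle the same effect could presumably be achieved by recycling just the session, roughly like this (the factory method is the hypothetical one from the configuration sketch above):

{code:java}
import com.datastax.oss.driver.api.core.CqlSession;

public final class SessionRecovery {

  /**
   * Workaround: once the unresolved hostname has been clobbered, the only way we have
   * found to make the driver query the hostname again is to close the session and build
   * a new one, which re-reads the contact points from configuration.
   */
  public static CqlSession recycle(CqlSession brokenSession) {
    brokenSession.close();
    return SessionFactory.build(); // hypothetical factory from the earlier sketch
  }
}
{code}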



> advanced.resolve-contact-points: unresolved hostname being clobbered during 
> reconnection
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19598
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19598
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Client/java-driver
>            Reporter: Andrew Orlowski
>            Priority: Normal
>         Attachments: image-2024-04-29-20-13-56-161.png, 
> image-2024-04-29-20-40-53-382.png
>
>


