[ 
https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644301#comment-17644301
 ] 

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

I ran one more test on my end. This time I used a 3-node cluster (3.11.4) 
in a single DC, running Java 8.

{noformat}
dbaasstg-ca-c3ssl-dc-690541-v001-8ouck:/usr/lib/cassandra/logs# nodetool status
Datacenter: c3ssl_dev_tap_ttc
=============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.109.6.109   169.61 KiB  16      100.0%            5a00a928-7933-430f-898e-c3d1fca6e026  rack1
UN  10.109.30.40   232.51 KiB  16      100.0%            ac0c6d05-1a83-4f37-ac95-6337bcd7e32c  rack1
UN  10.109.28.213  131.59 KiB  16      100.0%            48aa4f06-2d91-4733-858e-d935429176ea  rack1

Cluster Information:
        Name: c3ssl_dev
        Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
        DynamicEndPointSnitch: enabled
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                60bb2474-bc8e-302e-b230-4e1f5e2453c3: [10.109.6.109, 10.109.30.40, 10.109.28.213]
{noformat}

This time I upgraded node 10.109.28.213 without changing its IP.
I followed these steps:
(1) Stop the Cassandra service on 10.109.28.213.
(2) Download the C* 4 binary and replace the C* 3 binary with it.
(3) Download Java 11 and make it the default.
(4) Modify cassandra.yaml and cassandra-env.sh for C* 4.
(5) Start the Cassandra service.

All steps are exactly the same as before. The only difference is that the IP does not change.
Everything worked well, with no errors communicating with the C* 3 nodes.


{noformat}
Datacenter: c3ssl_dev_tap_ttc
=============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.109.6.109   205.66 KiB  16      100.0%            5a00a928-7933-430f-898e-c3d1fca6e026  rack1
UN  10.109.28.213  242.06 KiB  16      100.0%            48aa4f06-2d91-4733-858e-d935429176ea  rack1
UN  10.109.30.40   169.49 KiB  16      100.0%            ac0c6d05-1a83-4f37-ac95-6337bcd7e32c  rack1

dbaasstg-ca-c3ssl-dc-861196-v001-r5mpg:/usr/lib/cassandra/logs# nodetool describecluster
Cluster Information:
        Name: c3ssl_dev
        Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
        DynamicEndPointSnitch: disabled
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
                db257004-5e1d-3ead-9f65-e5cd5e07a419: [10.109.28.213]

                60bb2474-bc8e-302e-b230-4e1f5e2453c3: [10.109.30.40, 10.109.6.109]

Stats for all nodes:
        Live: 3
        Joining: 0
        Moving: 0
        Leaving: 0
        Unreachable: 0

Data Centers:
        c3ssl_dev_tap_ttc #Nodes: 3 #Down: 0

Database versions:
        3.11.4: [10.109.30.40:7000, 10.109.6.109:7000]

        4.0.4: [10.109.28.213:7000]

Keyspaces:
        system_schema -> Replication class: LocalStrategy {}
        system -> Replication class: LocalStrategy {}
        system_auth -> Replication class: NetworkTopologyStrategy {c3ssl_dev_tap_ttc=3}
        system_distributed -> Replication class: NetworkTopologyStrategy {c3ssl_dev_tap_ttc=3}
        system_traces -> Replication class: NetworkTopologyStrategy {c3ssl_dev_tap_ttc=3}

{noformat}
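
The mixed-version state in the describecluster output above can also be checked programmatically. Below is a minimal Python sketch, not part of Cassandra or nodetool. The function name summarize_describecluster and the regular expressions are assumptions for illustration, based only on the output format shown in this comment. It extracts the schema versions, the database versions, and any UNREACHABLE hosts from the text.

```python
import re

# Matches lines such as "<uuid>: [host, host]" or "UNREACHABLE: [host, ...]".
SCHEMA_RE = re.compile(
    r"^\s*([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}|UNREACHABLE):\s*\[([^\]]*)\]",
    re.MULTILINE,
)
# Matches lines such as "4.0.4: [host:port, ...]".
VERSION_RE = re.compile(r"^\s*(\d+(?:\.\d+)+):\s*\[([^\]]*)\]", re.MULTILINE)


def _hosts(raw):
    """Split a comma-separated host list, tolerating wrapped lines."""
    return [h.strip() for h in raw.split(",") if h.strip()]


def summarize_describecluster(text):
    """Summarize schema and database versions from describecluster text."""
    schema = {key: _hosts(val) for key, val in SCHEMA_RE.findall(text)}
    versions = {key: _hosts(val) for key, val in VERSION_RE.findall(text)}
    unreachable = schema.pop("UNREACHABLE", [])
    return {
        "schema_versions": schema,
        "database_versions": versions,
        "unreachable": unreachable,
    }


# Trimmed copy of the output captured after the in-place upgrade.
SAMPLE = """\
Cluster Information:
        Name: c3ssl_dev
        Schema versions:
                db257004-5e1d-3ead-9f65-e5cd5e07a419: [10.109.28.213]

                60bb2474-bc8e-302e-b230-4e1f5e2453c3: [10.109.30.40, 10.109.6.109]

Database versions:
        3.11.4: [10.109.30.40:7000, 10.109.6.109:7000]

        4.0.4: [10.109.28.213:7000]
"""

if __name__ == "__main__":
    print(summarize_describecluster(SAMPLE))
```

For the output above, this reports two schema versions, two database versions (3.11.4 and 4.0.4), and no unreachable hosts.
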
I am uploading cassandra.yaml, cassandra-env.sh and system.log from node 
10.109.28.213, before and after the upgrade.
File:  [^In-place-upgrade.zip] 

This indicates that the problem occurs when the node's IP changes during the 
upgrade process.

Could you try to reproduce this on your end with an upgrade in which the IP 
changes?
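
When reproducing, one quick check from the upgraded node is whether each peer accepts TCP connections on the storage port, since the log in the description shows "Connection refused". Below is a minimal Python sketch under stated assumptions: the helper names port_accepts_connections and probe_peers are hypothetical, and port 7000 is taken from the outputs in this ticket. It makes a plain TCP connection only and does not attempt a TLS handshake, so it shows whether something is listening, not whether SSL negotiation would succeed.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused, unreachable hosts, timeouts
        return False


def probe_peers(peers, port=7000, timeout=3.0):
    """Map each peer address to whether its port accepted a connection."""
    return {peer: port_accepts_connections(peer, port, timeout) for peer in peers}
```

For example, probe_peers(["10.109.30.40", "10.109.6.109"]) run on the upgraded node returns a dict mapping each address to True or False.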

Thanks,
Alaykumar Barochia

> Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) 
> nodes during upgrade
> ---------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18075
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18075
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Encryption
>            Reporter: Alaykumar Barochia
>            Priority: Normal
>         Attachments: In-place-upgrade.zip, cassandra-env.sh_3114, 
> cassandra-env.sh_404, cassandra.yaml_10.110.44.207_explicitely_set_port, 
> cassandra.yaml_10.110.49.242_explicitely_set_port, cassandra.yaml_3114, 
> cassandra.yaml_404, system.log_10.110.44.207, 
> system.log_10.110.44.207_after_explicitely_set_port, 
> system.log_10.110.49.242_after_explicitely_set_port
>
>
> We are testing an upgrade from Cassandra 3.11.4 to 4.0.4 on our SSL-enabled 
> test cluster and are facing an issue.
> Our cluster size is 3x3. 
> {noformat}
> Datacenter: abssl_dev_tap_ttc
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.109.6.153   94.27 KiB   16      100.0%            130e59d2-2a9a-4039-a42f-deb20afcf288  rack1
> UN  10.109.45.8    104.43 KiB  16      100.0%            35274a2c-f915-4308-9981-d207a4e2108f  rack1
> UN  10.109.66.149  104.23 KiB  16      100.0%            ea0151bc-fb6c-425d-af42-75c10e52f941  rack1
> Datacenter: abssl_dev_tap_tte
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.110.4.110   104.44 KiB  16      100.0%            fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1
> UN  10.110.44.220  99.33 KiB   16      100.0%            f1dc35c0-a1c2-45fe-9f65-b1cc3d7f6947  rack1
> UN  10.110.49.242  65.57 KiB   16      100.0%            72bc4ae5-876d-4d0a-91ac-6cf8b531b4dd  rack1
> dbaasprod-ca-abssl-de-393671-v001-yqlvf:~# nodetool describecluster
> Cluster Information:
>       Name: abssl_dev
>       Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
>       DynamicEndPointSnitch: enabled
>       Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>       Schema versions:
>               f68fbc0c-c9d6-3709-8075-c5a0d74192f2: [10.110.4.110, 10.110.44.220, 10.109.6.153, 10.109.45.8, 10.109.66.149, 10.110.49.242]
> {noformat}
> During the upgrade, we re-run the pipeline, which provisions a new server 
> (with a different IP) that has the Cassandra 4.0.4 binary. 
> The '/data' disk (containing data files, commitlogs, etc.) is detached from 
> the old server and attached to the new server.
> This process works fine on a non-SSL cluster, but when we perform it on an SSL 
> cluster, the new node stops communicating with the rest of the nodes.
> In this example, after the upgrade, node 10.110.4.110 was replaced by a new 
> server with the new IP 10.110.44.207.
> *Output from 3.11.4 node:*
> {noformat}
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# hostname -i
> 10.109.6.153
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# java -version
> openjdk version "1.8.0_322"
> OpenJDK Runtime Environment (Temurin)(build 1.8.0_322-b06)
> OpenJDK 64-Bit Server VM (Temurin)(build 25.322-b06, mixed mode)
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# nodetool status
> Datacenter: abssl_dev_tap_ttc
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.109.6.153   135.24 KiB  16      100.0%            130e59d2-2a9a-4039-a42f-deb20afcf288  rack1
> UN  10.109.45.8    135.35 KiB  16      100.0%            35274a2c-f915-4308-9981-d207a4e2108f  rack1
> UN  10.109.66.149  135.25 KiB  16      100.0%            ea0151bc-fb6c-425d-af42-75c10e52f941  rack1
> Datacenter: abssl_dev_tap_tte
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> DN  10.110.4.110   104.44 KiB  16      100.0%            fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1
> UN  10.110.44.220  104.44 KiB  16      100.0%            f1dc35c0-a1c2-45fe-9f65-b1cc3d7f6947  rack1
> UN  10.110.49.242  65.57 KiB   16      100.0%            72bc4ae5-876d-4d0a-91ac-6cf8b531b4dd  rack1
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# nodetool describecluster
> Cluster Information:
>       Name: abssl_dev
>       Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
>       DynamicEndPointSnitch: enabled
>       Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>       Schema versions:
>               f68fbc0c-c9d6-3709-8075-c5a0d74192f2: [10.110.44.220, 10.109.6.153, 10.109.45.8, 10.109.66.149, 10.110.49.242]
>               UNREACHABLE: [10.110.4.110]
> {noformat}
> *Output from 4.0.4 node:*
> {noformat}
> dbaasprod-ca-abssl-de-393671-v003-dxpyv:~# hostname -i
> 10.110.44.207
> dbaasprod-ca-abssl-de-393671-v003-dxpyv:~# java -version
> openjdk version "11.0.15" 2022-04-19
> OpenJDK Runtime Environment Temurin-11.0.15+10 (build 11.0.15+10)
> OpenJDK 64-Bit Server VM Temurin-11.0.15+10 (build 11.0.15+10, mixed mode)
> dbaasprod-ca-abssl-de-393671-v003-dxpyv:~# nodetool status
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> DN  10.109.6.153   ?           16      0.0%              130e59d2-2a9a-4039-a42f-deb20afcf288  r1
> DN  10.109.45.8    ?           16      0.0%              35274a2c-f915-4308-9981-d207a4e2108f  r1
> DN  10.109.66.149  ?           16      0.0%              ea0151bc-fb6c-425d-af42-75c10e52f941  r1
> DN  10.110.44.220  ?           16      0.0%              f1dc35c0-a1c2-45fe-9f65-b1cc3d7f6947  r1
> DN  10.110.49.242  ?           16      0.0%              72bc4ae5-876d-4d0a-91ac-6cf8b531b4dd  r1
> Datacenter: abssl_dev_tap_tte
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.110.44.207  146.27 KiB  16      100.0%            fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1
> dbaasprod-ca-abssl-de-393671-v003-dxpyv:~# nodetool describecluster
> Cluster Information:
>       Name: abssl_dev
>       Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
>       DynamicEndPointSnitch: disabled
>       Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>       Schema versions:
>               1ccaeb62-5816-3599-897f-de59fd56eef2: [10.110.44.207]
>               UNREACHABLE: [10.109.45.8, 10.109.66.149, 10.110.44.220, 10.109.6.153, 10.110.49.242]
> Stats for all nodes:
>       Live: 1
>       Joining: 0
>       Moving: 0
>       Leaving: 0
>       Unreachable: 5
> Data Centers:
>       DC1 #Nodes: 5 #Down: 0
>       abssl_dev_tap_tte #Nodes: 1 #Down: 0
> Database versions:
>       : [10.109.45.8:7000, 10.109.66.149:7000, 10.110.44.220:7000, 10.109.6.153:7000, 10.110.49.242:7000]
>       4.0.4: [10.110.44.207:7000]
> Keyspaces:
>       system_schema -> Replication class: LocalStrategy {}
>       system -> Replication class: LocalStrategy {}
>       system_auth -> Replication class: NetworkTopologyStrategy {abssl_dev_tap_tte=3, abssl_dev_tap_ttc=3}
>       system_distributed -> Replication class: NetworkTopologyStrategy {abssl_dev_tap_tte=3, abssl_dev_tap_ttc=3}
>       system_traces -> Replication class: NetworkTopologyStrategy {abssl_dev_tap_tte=3, abssl_dev_tap_ttc=3}
> {noformat}
> We see the following error in the system.log of the new node 10.110.44.207, 
> which runs Cassandra version 4.0.4.
> {noformat}
> WARN  [Messaging-EventLoop-3-6] 2022-11-28 06:20:49,577 NoSpamLogger.java:95 - /10.110.44.207:7000->/10.109.45.8:7000-URGENT_MESSAGES-[no-channel] dropping message of type GOSSIP_DIGEST_SYN whose timeout expired before reaching the network
> INFO  [Messaging-EventLoop-3-6] 2022-11-28 06:21:17,921 NoSpamLogger.java:92 - /10.110.44.207:7000->/10.110.49.242:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: /10.110.49.242:7000
> Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
>       at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
>       at io.netty.channel.unix.Socket.finishConnect(Socket.java:251)
>       at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:673)
>       at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:650)
>       at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:530)
>       at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:470)
>       at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
>       at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>       at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>       at java.base/java.lang.Thread.run(Thread.java:829)
> {noformat}
> I am attaching the cassandra.yaml and cassandra-env.sh files from both 
> versions (3.11.4 and 4.0.4).
> I am also attaching the system.log file from the upgraded node 10.110.44.207.
> This looks like a bug, so I am raising this Jira. Can you please have a 
> look?
> Let me know if you need any more details.
> Thanks,
> Alaykumar Barochia


