[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17722053#comment-17722053 ]

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

I already tried option 1 by setting ssl_storage_port to 7001 on the 4.0 node. It didn't help.
Also, we already have the firewall open for both ports 7000 and 7001 in TAP, so option 2 is also ruled out.

Today I tried the opposite: I made the 3.11 cluster use 7000 for SSL/TLS and then tried the upgrade to 4.0. Still the same issue.

*3.11.4 cluster (ssl_storage_port set to 7000):*
{noformat}
Datacenter: c3ssl_dev_tap_ttc
=============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.109.44.76   128.32 KiB  16      63.1%             325e24b3-81b9-4d19-abaf-1bb61f662be5  rack1
UN  10.109.30.228  153.03 KiB  16      71.1%             4d1ff6ec-d781-474d-9862-4b31f1f583fe  rack1
UN  10.109.44.177  152.91 KiB  16      65.8%             bb4000ce-8f87-4c8a-aefe-0bc26143c2d3  rack1
{noformat}

Upgraded node {{10.109.30.228}} first. New IP: {{10.109.220.200}}.

*The new node stopped communicating with the other nodes.*

*From node 10.109.44.76:*
{noformat}
Datacenter: c3ssl_dev_tap_ttc
=============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.109.44.76   128.32 KiB  16      63.1%             325e24b3-81b9-4d19-abaf-1bb61f662be5  rack1
DN  10.109.30.228  128.32 KiB  16      71.1%             4d1ff6ec-d781-474d-9862-4b31f1f583fe  rack1
UN  10.109.44.177  128.22 KiB  16      65.8%             bb4000ce-8f87-4c8a-aefe-0bc26143c2d3  rack1
{noformat}

*From node 10.109.220.200:*
{noformat}
dbaasstg-ca-c3ssl-dc-834204-v002-1s7rs:/usr/lib/cassandra/logs# nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load        Tokens  Owns (effective)  Host ID                               Rack
DN  10.109.44.177   ?           16      65.8%             bb4000ce-8f87-4c8a-aefe-0bc26143c2d3  r1
DN  10.109.44.76    ?           16      63.1%             325e24b3-81b9-4d19-abaf-1bb61f662be5  r1

Datacenter: c3ssl_dev_tap_ttc
=============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.109.220.200  212.45 KiB  16      71.1%             4d1ff6ec-d781-474d-9862-4b31f1f583fe  rack1
{noformat}

> Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
> ---------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18075
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18075
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Encryption
>            Reporter: Alaykumar Barochia
>            Priority: Normal
>         Attachments: In-place-upgrade.zip, cassandra-env.sh_3114,
>                      cassandra-env.sh_404, cassandra.yaml_10.110.44.207_explicitely_set_port,
>                      cassandra.yaml_10.110.49.242_explicitely_set_port, cassandra.yaml_3114,
>                      cassandra.yaml_404, system.log_10.110.44.207,
>                      system.log_10.110.44.207_after_explicitely_set_port,
>                      system.log_10.110.49.242_after_explicitely_set_port
>
>
> We are testing upgrade from Cassandra 3.11.4 to 4.0.4 on our test cluster
> which is SSL enabled and facing an issue.
> Our cluster size is 3x3.
> {noformat}
> Datacenter: abssl_dev_tap_ttc
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.109.6.153   94.27 KiB   16      100.0%            130e59d2-2a9a-4039-a42f-deb20afcf288  rack1
> UN  10.109.45.8    104.43 KiB  16      100.0%            35274a2c-f915-4308-9981-d207a4e2108f  rack1
> UN  10.109.66.149  104.23 KiB  16      100.0%            ea0151bc-fb6c-425d-af42-75c10e52f941  rack1
> Datacenter: abssl_dev_tap_tte
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.110.4.110   104.44 KiB  16      100.0%            fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1
> UN  10.110.44.220  99.33 KiB   16      100.0%            f1dc35c0-a1c2-45fe-9f65-b1cc3d7f6947  rack1
> UN  10.110.49.242  65.57 KiB   16      100.0%            72bc4ae5-876d-4d0a-91ac-6cf8b531b4dd  rack1
> dbaasprod-ca-abssl-de-393671-v001-yqlvf:~# nodetool describeclust
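The two `nodetool status` views in the comment above disagree about which nodes are up. A minimal Python sketch for pulling the down nodes out of such output follows; it assumes the usual two-letter state column followed by an IPv4 address, and the names `parse_status` and `down_nodes` are hypothetical helpers, not part of Cassandra or nodetool. The sample rows are transcribed from the view reported by 10.109.220.200.

```python
import re

# One node row of `nodetool status`: two-letter state (Up/Down + Normal/Leaving/
# Joining/Moving), then an IPv4 address. Header and separator lines do not match.
ROW = re.compile(r"^([UD][NLJM])\s+(\d{1,3}(?:\.\d{1,3}){3})\s")


def parse_status(text):
    """Return {address: state} for every node row found in the output."""
    nodes = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            nodes[match.group(2)] = match.group(1)
    return nodes


def down_nodes(text):
    """Return the addresses whose status letter is D (down), sorted as strings."""
    return sorted(addr for addr, state in parse_status(text).items()
                  if state.startswith("D"))


# Rows transcribed from the comment above (view from the upgraded node).
VIEW_FROM_NEW_NODE = """\
DN  10.109.44.177   ?           16  65.8%  bb4000ce-8f87-4c8a-aefe-0bc26143c2d3  r1
DN  10.109.44.76    ?           16  63.1%  325e24b3-81b9-4d19-abaf-1bb61f662be5  r1
UN  10.109.220.200  212.45 KiB  16  71.1%  4d1ff6ec-d781-474d-9862-4b31f1f583fe  rack1
"""

if __name__ == "__main__":
    print(down_nodes(VIEW_FROM_NEW_NODE))
```

Running the same helper on the output captured from each node makes the asymmetry explicit: the 3.11.4 node lists 10.109.30.228 as down, while the upgraded node lists both 3.11.4 peers as down.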
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721543#comment-17721543 ]

Ryan Koski commented on CASSANDRA-18075:
----------------------------------------

Aaron, these nodes should not have any firewalls between them. The only thing that has any knowledge of ports is the LB that sits in front of the cluster.

> Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
> ---------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18075
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18075
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Encryption
>            Reporter: Alaykumar Barochia
>            Priority: Normal
>         Attachments: In-place-upgrade.zip, cassandra-env.sh_3114,
>                      cassandra-env.sh_404, cassandra.yaml_10.110.44.207_explicitely_set_port,
>                      cassandra.yaml_10.110.49.242_explicitely_set_port, cassandra.yaml_3114,
>                      cassandra.yaml_404, system.log_10.110.44.207,
>                      system.log_10.110.44.207_after_explicitely_set_port,
>                      system.log_10.110.49.242_after_explicitely_set_port
>
>
> We are testing upgrade from Cassandra 3.11.4 to 4.0.4 on our test cluster
> which is SSL enabled and facing an issue.
> Our cluster size is 3x3.
> {noformat}
> Datacenter: abssl_dev_tap_ttc
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.109.6.153   94.27 KiB   16      100.0%            130e59d2-2a9a-4039-a42f-deb20afcf288  rack1
> UN  10.109.45.8    104.43 KiB  16      100.0%            35274a2c-f915-4308-9981-d207a4e2108f  rack1
> UN  10.109.66.149  104.23 KiB  16      100.0%            ea0151bc-fb6c-425d-af42-75c10e52f941  rack1
> Datacenter: abssl_dev_tap_tte
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.110.4.110   104.44 KiB  16      100.0%            fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1
> UN  10.110.44.220  99.33 KiB   16      100.0%            f1dc35c0-a1c2-45fe-9f65-b1cc3d7f6947  rack1
> UN  10.110.49.242  65.57 KiB   16      100.0%            72bc4ae5-876d-4d0a-91ac-6cf8b531b4dd  rack1
> dbaasprod-ca-abssl-de-393671-v001-yqlvf:~# nodetool describecluster
> Cluster Information:
>     Name: abssl_dev
>     Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
>     DynamicEndPointSnitch: enabled
>     Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>     Schema versions:
>         f68fbc0c-c9d6-3709-8075-c5a0d74192f2: [10.110.4.110,
>         10.110.44.220, 10.109.6.153, 10.109.45.8, 10.109.66.149, 10.110.49.242]
> {noformat}
> During the upgrade, we re-run the pipeline in which we get new server (with
> different IP) that will have Cassandra 4.0.4 binary.
> Disk '/data' (contains data files, commitlogs etc.) will get detached from
> the old server and get attached to the new server.
> This process works fine on non-SSL cluster but when we perform this on SSL
> cluster, new node stops communicating with the rest of the nodes.
> In this example, after upgrade, node 10.110.4.110 got replaced with new
> server with new IP 10.110.44.207.
> *Output from 3.11.4 node:*
> {noformat}
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# hostname -i
> 10.109.6.153
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# java -version
> openjdk version "1.8.0_322"
> OpenJDK Runtime Environment (Temurin)(build 1.8.0_322-b06)
> OpenJDK 64-Bit Server VM (Temurin)(build 25.322-b06, mixed mode)
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# nodetool status
> Datacenter: abssl_dev_tap_ttc
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.109.6.153   135.24 KiB  16      100.0%            130e59d2-2a9a-4039-a42f-deb20afcf288  rack1
> UN  10.109.45.8    135.35 KiB  16      100.0%            35274a2c-f915-4308-9981-d207a4e2108f  rack1
> UN  10.109.66.149  135.25 KiB  16      100.0%            ea0151bc-fb6c-425d-af42-75c10e52f941  rack1
> Datacenter: abssl_dev_tap_tte
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> DN  10.110.4.110   104.44 KiB  16      100.0%            fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1
> UN  10.110.44.220  104.44 KiB  16      100.0%            f1dc35c0-a1c2-45fe-9f65-b1cc3d7f6947  rack1
> UN  10.110.4
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721466#comment-17721466 ]

Aaron Ploetz commented on CASSANDRA-18075:
------------------------------------------

Brandon does have a good point though, about the 3.11 nodes rejecting connections on 7000. I'd suggest trying one of the following solutions:

# Configure the 4.0 node to use TLS/SSL over 7001 (like 3.11 does). I can't tell for sure if this has been tried or not.
# Create a firewall rule in TAP (Target Application Platform) that allows a port translation from 7000:7001 (4.0 node:7000 -> 3.11 node:7001).

I have a feeling that the firewall is what's getting in the way here. In fact, a good test might be to try this in a dev environment without any firewall rules in place. If it works (I suspect that it will), then the firewall/port rules are the problem here.
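Before changing any port or firewall configuration, it can help to record what a plain TCP connect to each storage port actually returns. The sketch below is a minimal Python probe under stated assumptions: `probe` and `probe_storage_ports` are hypothetical helper names, ports 7000 and 7001 are the ones discussed in this thread, and the check only covers TCP reachability. It says nothing about whether the TLS handshake or Cassandra's own messaging handshake would succeed.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered with a reset: nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all, which is typical of a packet being silently dropped.
        return "timeout"
    except OSError:
        # Anything else (no route, name resolution failure, and so on).
        return "error"


def probe_storage_ports(host, ports=(7000, 7001)):
    """Probe each candidate storage port on one host and return {port: result}."""
    return {port: probe(host, port) for port in ports}
```

Calling `probe_storage_ports("10.110.49.242")` from the upgraded node would show whether 7000 is refused while 7001 is open on that 3.11 node. A "refused" result points at the listener on the target, whereas a "timeout" is more typical of a filtering device in the path.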
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646157#comment-17646157 ]

Brandon Williams commented on CASSANDRA-18075:
----------------------------------------------

CASSANDRA-17744 had different symptoms, and they went away on their own. I don't see any dropped READ_REQ messages here, as in the title of the stackexchange post. I don't believe either of them had a refused connection on the storage port.

bq. Then I upgraded the same cluster from 3.11.14 to 4.0.7 (latest version). Again the same issue.

By same issue what exactly does that mean? It may be worth examining those logs as well.

I think any viable theory needs to be framed with the evidence that is present. The connection being refused to the storage port is critical for communication and not a red herring that can be ignored.
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17645989#comment-17645989 ]

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

[~brandon.williams] - I still feel the way Cassandra 4 handles gossip is different from Cassandra 3.

I ran one more test where I upgraded an SSL cluster from 3.11.4 to 3.11.14 (IP changes during the upgrade) and all went fine without any issue.
Then I upgraded the same cluster from 3.11.14 to 4.0.7 (latest version). Again the same issue.

I found that others have also reported similar issues during the upgrade:
https://issues.apache.org/jira/browse/CASSANDRA-17744
https://dba.stackexchange.com/questions/319259/cassandra-4-0-dropping-message-of-type-read-req (IP changes during upgrade)

If my infrastructure had an issue (firewall, port), I would have hit the same issue when I upgraded from 3.11.4 to 3.11.14.
It would be worth trying to reproduce this at your end once.
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644779#comment-17644779 ]

Brandon Williams commented on CASSANDRA-18075:
----------------------------------------------

bq. Did you check the latest logs that I uploaded?

I understood that upgrade to be successful, so there is no reason to check them.

bq. Will you be able to try at your end once where IP is changing during the upgrade process?

I'm not interested in exploring a high effort path that ignores the crucial bit of evidence we have, which is the connection refused to the storage port on the 3.11 side. That is the smoking gun, that is what must be explained to go any further here. And we should remember that being on the 3.11 side means that there's nothing we can change on the 4.0 side to fix it.
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644605#comment-17644605 ]

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

[~brandon.williams] - Did you check the latest logs that I uploaded? There is no node with IP 10.110.49.242 in my latest test.

This happens only with the SSL cluster. The non-SSL cluster upgrades (getting a new IP during the upgrade) without any issue.

Will you be able to try at your end once where IP is changing during the upgrade process?
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644304#comment-17644304 ]

Brandon Williams commented on CASSANDRA-18075:
----------------------------------------------

bq. This proves that something is going wrong when IP is changing during the upgrade process.

I disagree, this only proves that the upgrade process works in this instance. There is no smoking gun, just the same old smoke. Nothing here changes the crux of the problem, which is still the connection being refused to 10.110.49.242:7000, which source IP address will have no effect on.
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644301#comment-17644301 ]

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

I have run one more test at my end.
This time I have a 3-node cluster (3.11.4) in a single DC. Java 8 is being used here.
{noformat}
dbaasstg-ca-c3ssl-dc-690541-v001-8ouck:/usr/lib/cassandra/logs# nodetool status
Datacenter: c3ssl_dev_tap_ttc
=============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.109.6.109   169.61 KiB  16      100.0%            5a00a928-7933-430f-898e-c3d1fca6e026  rack1
UN  10.109.30.40   232.51 KiB  16      100.0%            ac0c6d05-1a83-4f37-ac95-6337bcd7e32c  rack1
UN  10.109.28.213  131.59 KiB  16      100.0%            48aa4f06-2d91-4733-858e-d935429176ea  rack1

Cluster Information:
    Name: c3ssl_dev
    Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        60bb2474-bc8e-302e-b230-4e1f5e2453c3: [10.109.6.109, 10.109.30.40, 10.109.28.213]
{noformat}
This time I am doing an upgrade on node 10.109.28.213 without changing the IP.
I followed the steps below:
(1) Stop the Cassandra service on 10.109.28.213.
(2) Download the C* 4 binary and replace the C* 3 binary with it.
(3) Download Java 11 and make it the default.
(4) Modify cassandra.yaml and cassandra-env.sh according to C* 4.
(5) Start the Cassandra service.

Here, all steps are exactly the same as before. The only difference is that the IP is not changing.
And I can see that everything worked well. No errors in communicating with the C* 3 nodes.
{noformat}
Datacenter: c3ssl_dev_tap_ttc
=============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.109.6.109   205.66 KiB  16      100.0%            5a00a928-7933-430f-898e-c3d1fca6e026  rack1
UN  10.109.28.213  242.06 KiB  16      100.0%            48aa4f06-2d91-4733-858e-d935429176ea  rack1
UN  10.109.30.40   169.49 KiB  16      100.0%            ac0c6d05-1a83-4f37-ac95-6337bcd7e32c  rack1

dbaasstg-ca-c3ssl-dc-861196-v001-r5mpg:/usr/lib/cassandra/logs# nodetool describecluster
Cluster Information:
    Name: c3ssl_dev
    Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
    DynamicEndPointSnitch: disabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        db257004-5e1d-3ead-9f65-e5cd5e07a419: [10.109.28.213]
        60bb2474-bc8e-302e-b230-4e1f5e2453c3: [10.109.30.40, 10.109.6.109]

Stats for all nodes:
    Live: 3
    Joining: 0
    Moving: 0
    Leaving: 0
    Unreachable: 0

Data Centers:
    c3ssl_dev_tap_ttc #Nodes: 3 #Down: 0

Database versions:
    3.11.4: [10.109.30.40:7000, 10.109.6.109:7000]
    4.0.4: [10.109.28.213:7000]

Keyspaces:
    system_schema -> Replication class: LocalStrategy {}
    system -> Replication class: LocalStrategy {}
    system_auth -> Replication class: NetworkTopologyStrategy {c3ssl_dev_tap_ttc=3}
    system_distributed -> Replication class: NetworkTopologyStrategy {c3ssl_dev_tap_ttc=3}
    system_traces -> Replication class: NetworkTopologyStrategy {c3ssl_dev_tap_ttc=3}
{noformat}
I am uploading cassandra.yaml, cassandra-env.sh and system.log from node 10.109.28.213, before and after the upgrade.
File: [^In-place-upgrade.zip]

This proves that something is going wrong when IP is changing during the upgrade process.
Can you try to replicate this upgrade at your end, with the IP changing during the upgrade process?

Thanks,
Alaykumar Barochia
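To confirm from a script that the cluster is in a mixed-version state after an in-place upgrade like the one above, the "Database versions" section of the 4.0 node's `nodetool describecluster` output can be parsed. This is a minimal sketch, assuming the multi-line layout shown in the comment above; the function name `database_versions` and the sample constant are illustrative and not part of any Cassandra tooling.

```python
import re

# Matches lines such as "3.11.4: [10.109.30.40:7000, 10.109.6.109:7000]".
# Schema-version lines start with a UUID, so they do not match this pattern.
VERSION_LINE = re.compile(r"^\s*(\d+\.\d+(?:\.\d+)?):\s*\[([^\]]*)\]")


def database_versions(describecluster_output: str) -> dict:
    """Map each release version to the list of endpoints reporting it."""
    versions = {}
    for line in describecluster_output.splitlines():
        match = VERSION_LINE.match(line)
        if match:
            endpoints = [e.strip() for e in match.group(2).split(",") if e.strip()]
            versions[match.group(1)] = endpoints
    return versions


# Excerpt copied from the describecluster output in the comment above.
SAMPLE = """\
Schema versions:
    db257004-5e1d-3ead-9f65-e5cd5e07a419: [10.109.28.213]
    60bb2474-bc8e-302e-b230-4e1f5e2453c3: [10.109.30.40, 10.109.6.109]
Database versions:
    3.11.4: [10.109.30.40:7000, 10.109.6.109:7000]
    4.0.4: [10.109.28.213:7000]
"""

versions = database_versions(SAMPLE)
is_mixed = len(versions) > 1  # more than one release present in the ring
```

With the sample above, `versions` has one entry per release and `is_mixed` is true, which matches the state described in the comment: one 4.0.4 node and two 3.11.4 nodes, all live.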
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644059#comment-17644059 ]

Brandon Williams commented on CASSANDRA-18075:
----------------------------------------------

If 10.110.49.242 is 3.11, and

bq. Connection refused: /10.110.49.242:7000

the connection is being refused to it, there is either a problem on the 3.11 side listening or something with the network is making it appear that way.

bq. In your test, do you get new IP during the upgrade process?

No, but C* doesn't care about IPs, only host IDs and tokens.
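The point about host IDs can be checked against `nodetool status` output by matching rows on the Host ID column instead of the address. The following is a rough sketch under stated assumptions: the `BEFORE` row is taken from the issue description, while the `AFTER` row is hypothetical (it assumes the replacement server at 10.110.44.207 keeps the old host ID because the same `/data` disk is attached). The helper names are illustrative only.

```python
import re

# One status row: two-letter state, IPv4 address, then (somewhere later) a host ID UUID.
ROW = re.compile(
    r"^(?P<state>[UD][NLJM])\s+(?P<address>\d+\.\d+\.\d+\.\d+)\s+.*?"
    r"(?P<host_id>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def hosts_by_id(status_output: str) -> dict:
    """Return {host_id: (address, state)} for every node row in the output."""
    hosts = {}
    for line in status_output.splitlines():
        match = ROW.match(line.strip())
        if match:
            hosts[match["host_id"]] = (match["address"], match["state"])
    return hosts


def moved_hosts(before: str, after: str) -> dict:
    """Return {host_id: (old_address, new_address)} for nodes whose IP changed."""
    old, new = hosts_by_id(before), hosts_by_id(after)
    return {
        host_id: (old[host_id][0], new[host_id][0])
        for host_id in sorted(old.keys() & new.keys())
        if old[host_id][0] != new[host_id][0]
    }


# Row from the issue description (before the upgrade).
BEFORE = "UN  10.110.4.110   104.44 KiB  16  100.0%  fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1"
# Hypothetical row after the upgrade: same host ID, new address (assumption).
AFTER = "UN  10.110.44.207  104.44 KiB  16  100.0%  fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1"

moved = moved_hosts(BEFORE, AFTER)
```

If the host ID were different after the move, `moved` would be empty and the new server would appear as an unrelated node, which would be a separate thing to investigate.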
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643125#comment-17643125 ]

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

[~brandon.williams] - I didn't miss the {{enable_legacy_ssl_storage_port}} parameter. It is set on the C* 4 node 10.110.44.207. Check the attached file {{cassandra.yaml_10.110.44.207_explicitely_set_port}}.
This parameter is not applicable to C* 3.11.4 and hence it is not set on 10.110.49.242.

In your test, do you get new IP during the upgrade process?
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641463#comment-17641463 ]

Brandon Williams commented on CASSANDRA-18075:
----------------------------------------------

I don't think we would expect anything to change by explicitly setting those two properties, since they match the defaults. My greater point was that you should use a minimally modified yaml so that it's directly comparable to the stock yaml. That said, you are missing enable_legacy_ssl_storage_port this time.

bq. So, something is going wrong (probably a bug) with SSL cluster communication during the upgrade process

Here is a passing test that covers this scenario: https://ci-cassandra.apache.org/job/Cassandra-3.0/313/testReport/dtest-upgrade.upgrade_tests.upgrade_through_versions_test/TestUpgrade_indev_3_0_x_To_indev_4_0_x/test_parallel_upgrade_with_internode_ssl/
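A quick way to see which of the settings discussed here are explicitly present in a given cassandra.yaml is to scan it for the keys by name. The sketch below uses only the standard library and ignores YAML nesting, so it does not depend on where in the file a key sits. The function name and both sample fragments are hypothetical (the values are not taken from the attached files).

```python
import re

KEYS = {"storage_port", "ssl_storage_port", "enable_legacy_ssl_storage_port"}

# "key: value" with a non-empty value; section headers like "foo:" do not match.
SETTING = re.compile(r"^([A-Za-z_]+):\s*(\S.*)$")


def explicit_settings(yaml_text: str, keys=KEYS) -> dict:
    """Return {key: value} for each wanted key that is set and not commented out."""
    found = {}
    for line in yaml_text.splitlines():
        active = line.split("#", 1)[0].strip()  # drop comments and indentation
        match = SETTING.match(active)
        if match and match.group(1) in keys:
            found[match.group(1)] = match.group(2).strip()
    return found


# Hypothetical 4.0-style fragment (values are assumptions for illustration).
YAML_WITH_LEGACY_PORT = """\
storage_port: 7000
ssl_storage_port: 7001
server_encryption_options:
    internode_encryption: all
    enable_legacy_ssl_storage_port: true
"""

# Hypothetical fragment where the legacy option is commented out.
YAML_WITHOUT_LEGACY_PORT = """\
storage_port: 7000
ssl_storage_port: 7001
server_encryption_options:
    internode_encryption: all
    # enable_legacy_ssl_storage_port: true
"""

missing_first = KEYS - explicit_settings(YAML_WITH_LEGACY_PORT).keys()
missing_second = KEYS - explicit_settings(YAML_WITHOUT_LEGACY_PORT).keys()
```

Running the same check against each attached yaml would make it easy to compare them with the stock file, as suggested above. A line-based scan is only a rough check; a real YAML parser would be more robust.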
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17641075#comment-17641075 ]

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

Performed the steps below:

(1) Explicitly set 'storage_port' and 'ssl_storage_port' in the cassandra.yaml file on all C* 3.11.4 nodes (10.110.44.220, 10.110.49.242, 10.109.66.149, 10.109.45.8, 10.109.6.153) and restarted the Cassandra service.
(2) Explicitly set 'storage_port' and 'ssl_storage_port' in the cassandra.yaml file on 10.110.44.207 (C* 4.0.4) and restarted the Cassandra service.

Still getting the same error on the C* 4 node 10.110.44.207.
Uploaded cassandra.yaml and system.log files for your review.
[^cassandra.yaml_10.110.44.207_explicitely_set_port]
[^cassandra.yaml_10.110.49.242_explicitely_set_port]
[^system.log_10.110.44.207_after_explicitely_set_port]
[^system.log_10.110.49.242_after_explicitely_set_port]

To me, it looks like getting a new IP during the upgrade is what causes this issue.
But as I said earlier, we do not see this issue on a non-SSL cluster, only on an SSL cluster.
So, something is going wrong (probably a bug) with SSL cluster communication during the upgrade process whenever we get a new IP.
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640808#comment-17640808 ]

Brandon Williams commented on CASSANDRA-18075:
----------------------------------------------

bq. Whenever we do not define storage_port and ssl_storage_port explicitly, it takes the default value

I'm aware, but explicit is better than implicit, and if this is indeed a bug, removing noise from the setup process is beneficial.

bq. Connection refused: /10.110.49.242:7000

I would focus on this node's config since clearly it is not binding 7000.
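Whether a peer accepts TCP connections on a given port can be checked independently of Cassandra. The sketch below is a hypothetical helper (the name `probe` and the result strings are illustrative) that separates a refused connection, which usually means nothing is listening on that address and port or the connection is actively rejected, from a timeout, which more often points to packets being dropped on the way.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a plain TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or something in between) answered with a reset.
        return "refused"
    except socket.timeout:
        # No answer within the timeout.
        return "timeout"
    except OSError as exc:
        # Unreachable network, name resolution failure, and so on.
        return f"error: {exc}"
```

For the case in this ticket, one would call `probe("10.110.49.242", 7000)` and `probe("10.110.49.242", 7001)` from the upgraded node. The probe only tests TCP reachability; it does not perform a TLS handshake, so an "open" result says nothing about whether the listener speaks SSL.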
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640802#comment-17640802 ]

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

[~brandon.williams] - Whenever we do not define {{storage_port}} and {{ssl_storage_port}} explicitly, it takes the default value, which is {{7000}} and {{7001}} respectively.
In my setup too, both the 3.11.4 and the 4.0.4 nodes use exactly these values.

Below are snippets from the system.log file of both versions, which clearly show that ports {{7000}} and {{7001}} are being used for {{storage_port}} and {{ssl_storage_port}} respectively.

*4.0.4:*
{noformat}
47; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=0.0.0.0; rpc_interface=null; rpc_interface_prefer_ipv6=false; rpc_keepalive=true; saved_caches_directory=/data/saved_caches; seed_provider=org.apache.cassandra.locator.SimpleSeedProvider{seeds=10.109.45.8,10.109.6.153,10.110.44.207,10.110.44.220}; server_encryption_options=; slow_query_log_timeout_in_ms=500; snapshot_before_compaction=false; snapshot_links_per_second=0; snapshot_on_duplicate_row_detection=false; snapshot_on_repaired_data_mismatch=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; storage_port=7000; stream_entire_sstables=true; stream_throughput_outbound_megabits_per_sec=200; streaming_connections_per_host=1; streaming_keep_alive_period_in_secs=300; table_count_warn_threshold=150; tombstone_failure_threshold=10; tombstone_warn_threshold=1000; tracetype_query_ttl=86400; tracetype_repair_ttl=604800; transparent_data_encryption_options=org.apache.cassandra.conf
{noformat}
*3.11.4:*
{noformat}
es=null; rpc_send_buff_size_in_bytes=null; rpc_server_type=sync; saved_caches_directory=/data/saved_caches; seed_provider=org.apache.cassandra.locator.SimpleSeedProvider{seeds=10.109.45.8,10.109.6.153,10.110.4.110,10.110.44.220}; server_encryption_options=; slow_query_log_timeout_in_ms=500; snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=false; storage_port=7000; stream_throughput_outbound_megabits_per_sec=200; streaming_keep_alive_period_in_secs=300; streaming_socket_timeout_in_ms=8640; thrift_framed_transport_size_in_mb=15
{noformat}
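The effective port settings can be pulled out of such a configuration dump mechanically instead of by eye. This is a small sketch, assuming the dump is the semicolon-separated `key=value` list shown in the snippets above; `parse_config_dump` is an illustrative name, and the sample is a shortened excerpt of the 3.11.4 snippet.

```python
def parse_config_dump(dump: str) -> dict:
    """Split a 'key=value; key=value; ...' dump into a dictionary of strings."""
    settings = {}
    for item in dump.split(";"):
        # Split on the first '=' only, so values such as
        # "SimpleSeedProvider{seeds=...}" stay intact.
        key, sep, value = item.strip().partition("=")
        if sep and key:
            settings[key] = value
    return settings


# Shortened excerpt from the 3.11.4 system.log snippet above.
DUMP_3114 = (
    "seed_provider=org.apache.cassandra.locator.SimpleSeedProvider"
    "{seeds=10.109.45.8,10.109.6.153,10.110.4.110,10.110.44.220}; "
    "server_encryption_options=; slow_query_log_timeout_in_ms=500; "
    "ssl_storage_port=7001; start_native_transport=true; start_rpc=false; "
    "storage_port=7000; stream_throughput_outbound_megabits_per_sec=200"
)

settings = parse_config_dump(DUMP_3114)
ports = (settings["storage_port"], settings["ssl_storage_port"])
```

Note that this only shows which port numbers are configured. It does not show which of them the node actually binds, which also depends on the encryption settings that appear blank in the dump.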
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
    [ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640664#comment-17640664 ]

Brandon Williams commented on CASSANDRA-18075:
----------------------------------------------

Your yaml is so stripped down that it is missing basic definitions like storage_port and ssl_storage_port, so it's not clear which ports are being used. I would suggest reproducing the problem with the minimal amount of changes needed to do so, so that everything is still explicitly defined.

> Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
> ---------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18075
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18075
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/Encryption
>            Reporter: Alaykumar Barochia
>            Priority: Normal
>         Attachments: cassandra-env.sh_3114, cassandra-env.sh_404, cassandra.yaml_3114, cassandra.yaml_404, system.log_10.110.44.207
>
>
> We are testing upgrade from Cassandra 3.11.4 to 4.0.4 on our test cluster which is SSL enabled and facing an issue.
> Our cluster size is 3x3.
> {noformat}
> Datacenter: abssl_dev_tap_ttc
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.109.6.153   94.27 KiB   16      100.0%            130e59d2-2a9a-4039-a42f-deb20afcf288  rack1
> UN  10.109.45.8    104.43 KiB  16      100.0%            35274a2c-f915-4308-9981-d207a4e2108f  rack1
> UN  10.109.66.149  104.23 KiB  16      100.0%            ea0151bc-fb6c-425d-af42-75c10e52f941  rack1
> Datacenter: abssl_dev_tap_tte
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.110.4.110   104.44 KiB  16      100.0%            fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1
> UN  10.110.44.220  99.33 KiB   16      100.0%            f1dc35c0-a1c2-45fe-9f65-b1cc3d7f6947  rack1
> UN  10.110.49.242  65.57 KiB   16      100.0%            72bc4ae5-876d-4d0a-91ac-6cf8b531b4dd  rack1
> dbaasprod-ca-abssl-de-393671-v001-yqlvf:~# nodetool describecluster
> Cluster Information:
>     Name: abssl_dev
>     Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
>     DynamicEndPointSnitch: enabled
>     Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>     Schema versions:
>         f68fbc0c-c9d6-3709-8075-c5a0d74192f2: [10.110.4.110, 10.110.44.220, 10.109.6.153, 10.109.45.8, 10.109.66.149, 10.110.49.242]
> {noformat}
> During the upgrade, we re-run the pipeline in which we get new server (with different IP) that will have Cassandra 4.0.4 binary.
> Disk '/data' (contains data files, commitlogs etc.) will get detached from the old server and get attached to the new server.
> This process works fine on non-SSL cluster but when we perform this on SSL cluster, new node stops communicating with the rest of the nodes.
> In this example, after upgrade, node 10.110.4.110 got replaced with new server with new IP 10.110.44.207.
> *Output from 3.11.4 node:*
> {noformat}
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# hostname -i
> 10.109.6.153
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# java -version
> openjdk version "1.8.0_322"
> OpenJDK Runtime Environment (Temurin)(build 1.8.0_322-b06)
> OpenJDK 64-Bit Server VM (Temurin)(build 25.322-b06, mixed mode)
> dbaasprod-ca-abssl-dc-437097-v001-7mump:~# nodetool status
> Datacenter: abssl_dev_tap_ttc
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.109.6.153   135.24 KiB  16      100.0%            130e59d2-2a9a-4039-a42f-deb20afcf288  rack1
> UN  10.109.45.8    135.35 KiB  16      100.0%            35274a2c-f915-4308-9981-d207a4e2108f  rack1
> UN  10.109.66.149  135.25 KiB  16      100.0%            ea0151bc-fb6c-425d-af42-75c10e52f941  rack1
> Datacenter: abssl_dev_tap_tte
> =============================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load        Tokens  Owns (effective)  Host ID                               Rack
> DN  10.110.4.110   104.44 KiB  16      100.0%            fd4a9fa8-f2a9-494c-afb8-7cb8a08c7554  rack1
> UN  10.110.44.220  104.44 KiB  16      100.0%            f1dc35c0-a1c2-45fe-9f65-b1cc3d7f6947  rack1
> UN  10.110.49.242  65.57 KiB   16      100.0%            72bc4ae5-876d-4d0a-91ac-6cf8b531b4dd  rack1
>
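
The request above, that every port-related setting be explicitly defined, can be checked mechanically before a reproduction attempt. The following is only a minimal sketch: it assumes the cassandra.yaml has already been parsed into a Python dict by some YAML loader, and both the function name `missing_explicit_settings` and the list of keys it checks are hypothetical choices drawn from the settings named in this thread, not an official Cassandra checklist.

```python
from typing import Any, Dict, List

# Settings this thread treats as relevant to internode ports. The selection is
# an assumption made for illustration, not an official Cassandra checklist.
REQUIRED_TOP_LEVEL = ("storage_port", "ssl_storage_port")
REQUIRED_ENCRYPTION = ("internode_encryption", "enable_legacy_ssl_storage_port")


def missing_explicit_settings(config: Dict[str, Any]) -> List[str]:
    """Return dotted names of the settings that are not explicitly defined."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in config]
    # A missing or empty server_encryption_options block counts as "nothing defined".
    encryption = config.get("server_encryption_options") or {}
    for key in REQUIRED_ENCRYPTION:
        if key not in encryption:
            missing.append(f"server_encryption_options.{key}")
    return missing


if __name__ == "__main__":
    # Hypothetical stripped-down config: encryption options only, no ports.
    stripped_down = {
        "server_encryption_options": {
            "internode_encryption": "all",
            "enable_legacy_ssl_storage_port": True,
        }
    }
    print(missing_explicit_settings(stripped_down))
```

An empty result only means the keys are present; it says nothing about whether their values are compatible between the 3.11.4 and 4.0.4 nodes.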
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
    [ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640381#comment-17640381 ]

Alaykumar Barochia commented on CASSANDRA-18075:
------------------------------------------------

[~brandon.williams] - I know about this parameter and have set it to {{true}} in my C*4 cassandra.yaml file.
{noformat}
server_encryption_options:
    internode_encryption: all
    enable_legacy_ssl_storage_port: true
    keystore: /data/conf/keystore
    keystore_password:
    truststore: /data/conf/truststore
    truststore_password:
{noformat}
You can verify the same in the attached file {{cassandra.yaml_404}}.
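
Since the thread turns on which internode ports are actually in use, one quick sanity check is to probe the upgraded node from a 3.11.4 node. The sketch below is an assumption-laden illustration, not part of the original report: `is_port_open` and `probe_ports` are hypothetical helper names, and a successful TCP connect only shows that something is listening. It does not prove that a TLS handshake or Cassandra messaging would succeed.

```python
import socket
from typing import Dict, Iterable


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts alike.
        return False


def probe_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> Dict[int, bool]:
    """Probe each port in turn and map it to whether the connect succeeded."""
    return {port: is_port_open(host, port, timeout) for port in ports}
```

For the cluster described in this issue, a call such as `probe_ports("10.110.44.207", (7000, 7001))` run on one of the 3.11.4 nodes would indicate whether the 4.0.4 node accepts connections on the `storage_port` and `ssl_storage_port` values shown in the configuration dump. A closed 7001 despite `enable_legacy_ssl_storage_port: true` would point at the node's configuration rather than at the network.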
[jira] [Commented] (CASSANDRA-18075) Upgraded (C* 4.0.4) node stops communicating with older version (3.11.4) nodes during upgrade
    [ https://issues.apache.org/jira/browse/CASSANDRA-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640110#comment-17640110 ]

Brandon Williams commented on CASSANDRA-18075:
----------------------------------------------

I think you may have overlooked [this|https://github.com/apache/cassandra/blob/cassandra-4.0/NEWS.txt#L302] part of NEWS.txt and need to set [this|https://github.com/apache/cassandra/blob/cassandra-4.0/conf/cassandra.yaml#L1127] option.