[
https://issues.apache.org/jira/browse/CASSANDRA-14848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689456#comment-16689456
]
Tommy Stendahl commented on CASSANDRA-14848:
--------------------------------------------
I have created a patch that allows 4.0 nodes to connect to all 3.x nodes; it's
available here:
[cassandra-14848|https://github.com/tommystendahl/cassandra/tree/cassandra-14848].
Unfortunately I got another exception in the log of the old nodes:
{noformat}
2018-11-16T13:48:15.165+0100 [MessagingService-Incoming-/10.216.193.242] ERROR
o.a.c.service.CassandraDaemon$2:223 uncaughtException Exception in thread
Thread[MessagingService-Incoming-/10.216.193.242,5,main]
java.lang.RuntimeException: Unknown column additional_write_policy during
deserialization
at org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:433)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.db.SerializationHeader$Serializer.deserializeForMessaging(SerializationHeader.java:440)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.deserializeHeader(UnfilteredRowIteratorSerializer.java:190)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:686)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:674)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:337)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:346)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.service.MigrationManager$MigrationsSerializer.deserialize(MigrationManager.java:641)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.service.MigrationManager$MigrationsSerializer.deserialize(MigrationManager.java:624)
~[apache-cassandra-3.0.17.jar:3.0.17]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:201)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178)
~[apache-cassandra-3.0.17.jar:3.0.17]
at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92)
~[apache-cassandra-3.0.17.jar:3.0.17]{noformat}
It appears once or twice about one minute after the old node has detected the
new node as being UP:
{noformat}
2018-11-16T13:47:15.148+0100 [GossipStage:1] INFO
org.apache.cassandra.gms.Gossiper:1040 handleMajorStateChange Node
/10.216.193.242 has restarted, now UP
2018-11-16T13:47:15.149+0100 [GossipStage:1] INFO
o.a.cassandra.service.StorageService:2024 handleStateNormal Node
/10.216.193.242 state jump to NORMAL
2018-11-16T13:48:15.165+0100 [MessagingService-Incoming-/10.216.193.242] ERROR
o.a.c.service.CassandraDaemon$2:223 uncaughtException Exception in thread
Thread[MessagingService-Incoming-/10.216.193.242,5,main]
java.lang.RuntimeException: Unknown column additional_write_policy during
deserialization{noformat}
So far I have not found this to cause any problems besides printing an
unexpected exception in the log. Also, I'm not sure whether we should consider
this a new issue or whether my patch is wrong (or missing something).
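
The stack trace passes through MigrationManager$MigrationsSerializer and ends in Columns$Serializer.deserialize, which suggests the 3.0.17 node received a schema mutation naming a column it does not know. Below is a minimal Java sketch of that failure mode. It assumes the receiver resolves every incoming column name against its local schema and rejects anything missing. ColumnLookupSketch, resolve, and the example column sets are hypothetical illustrations, not Cassandra's actual code.

```java
import java.util.List;
import java.util.Set;

// Simplified model only; NOT the real org.apache.cassandra.db.Columns.Serializer.
class ColumnLookupSketch {

    // Resolve each column name sent by a peer against the locally known columns.
    // An unknown name aborts deserialization of the whole message.
    static List<String> resolve(Set<String> knownColumns, List<String> received) {
        for (String name : received) {
            if (!knownColumns.contains(name)) {
                throw new RuntimeException(
                        "Unknown column " + name + " during deserialization");
            }
        }
        return received;
    }

    public static void main(String[] args) {
        // Hypothetical column set for the old node; only additional_write_policy
        // is taken from the log above.
        Set<String> oldNodeColumns = Set.of("comment", "gc_grace_seconds");
        try {
            resolve(oldNodeColumns, List.of("comment", "additional_write_policy"));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model the old node would drop the incoming message but keep running, which would be consistent with seeing only a logged exception.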
> When upgrading 3.11.3->4.0 using SSL 4.0 nodes does not connect to old non
> seed nodes
> -------------------------------------------------------------------------------------
>
> Key: CASSANDRA-14848
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14848
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Reporter: Tommy Stendahl
> Priority: Major
> Labels: security
>
> When upgrading from 3.11.3 to 4.0 with server encryption enabled, the new 4.0
> node only connects to the 3.11.3 seed node; no connections are established
> to non-seed nodes on the old version.
> I have four nodes: *.242 is upgraded to 4.0, *.243 and *.244 are 3.11.3
> non-seed nodes, and *.246 is the 3.11.3 seed. After starting the 4.0 node I
> get this nodetool status on the different nodes:
> {noformat}
> *.242
> --  Address         Load         Tokens  Owns (effective)  Host ID                               Rack
> UN  10.216.193.242  1017.77 KiB  256     75,1%             7d278e14-d549-42f3-840d-77cfd852fbf4  RAC1
> DN  10.216.193.243  743.32 KiB   256     74,8%             5586243a-ca74-4125-8e7e-09e82e23c4e5  RAC1
> DN  10.216.193.244  711.54 KiB   256     75,2%             c155e262-b898-4e86-9e1d-d4d0f97e88f6  RAC1
> UN  10.216.193.246  659.81 KiB   256     74,9%             502dd00f-fc02-4024-b65f-b98ba3808291  RAC1
> *.243 and *.244
> --  Address         Load         Tokens  Owns (effective)  Host ID                               Rack
> DN  10.216.193.242  657.4 KiB    256     75,1%             7d278e14-d549-42f3-840d-77cfd852fbf4  RAC1
> UN  10.216.193.243  471 KiB      256     74,8%             5586243a-ca74-4125-8e7e-09e82e23c4e5  RAC1
> UN  10.216.193.244  471.71 KiB   256     75,2%             c155e262-b898-4e86-9e1d-d4d0f97e88f6  RAC1
> UN  10.216.193.246  388.54 KiB   256     74,9%             502dd00f-fc02-4024-b65f-b98ba3808291  RAC1
> *.246
> --  Address         Load         Tokens  Owns (effective)  Host ID                               Rack
> UN  10.216.193.242  657.4 KiB    256     75,1%             7d278e14-d549-42f3-840d-77cfd852fbf4  RAC1
> UN  10.216.193.243  471 KiB      256     74,8%             5586243a-ca74-4125-8e7e-09e82e23c4e5  RAC1
> UN  10.216.193.244  471.71 KiB   256     75,2%             c155e262-b898-4e86-9e1d-d4d0f97e88f6  RAC1
> UN  10.216.193.246  388.54 KiB   256     74,9%             502dd00f-fc02-4024-b65f-b98ba3808291  RAC1
> {noformat}
>
> I have built 4.0 with wire tracing activated, and in my config
> storage_port=12700 and ssl_storage_port=12701. In the log I can see that the
> 4.0 node starts to connect to the 3.11.3 seed node on the storage_port but
> quickly switches to the ssl_storage_port. When connecting to the non-seed
> nodes, however, it never switches to the ssl_storage_port.
> {noformat}
> >grep 193.246 system.log | grep Outbound
> 2018-10-25T10:57:36.799+0200 [MessagingService-NettyOutbound-Thread-4-1] INFO
> i.n.u.internal.logging.Slf4JLogger:101 info [id: 0x2f0e5e55] CONNECT:
> /10.216.193.246:12700
> 2018-10-25T10:57:36.902+0200 [MessagingService-NettyOutbound-Thread-4-2] INFO
> i.n.u.internal.logging.Slf4JLogger:101 info [id: 0x9e81f62c] CONNECT:
> /10.216.193.246:12701
> 2018-10-25T10:57:36.905+0200 [MessagingService-NettyOutbound-Thread-4-2] INFO
> i.n.u.internal.logging.Slf4JLogger:101 info [id: 0x9e81f62c,
> L:/10.216.193.242:37252 - R:10.216.193.246/10.216.193.246:12701] ACTIVE
> 2018-10-25T10:57:36.906+0200 [MessagingService-NettyOutbound-Thread-4-2] INFO
> i.n.u.internal.logging.Slf4JLogger:101 info [id: 0x9e81f62c,
> L:/10.216.193.242:37252 - R:10.216.193.246/10.216.193.246:12701] WRITE: 8B
> >grep 193.243 system.log | grep Outbound
> 2018-10-25T10:57:38.438+0200 [MessagingService-NettyOutbound-Thread-4-3] INFO
> i.n.u.internal.logging.Slf4JLogger:101 info [id: 0xd8f1d6c4] CONNECT:
> /10.216.193.243:12700
> 2018-10-25T10:57:38.540+0200 [MessagingService-NettyOutbound-Thread-4-4] INFO
> i.n.u.internal.logging.Slf4JLogger:101 info [id: 0xfde6cc9f] CONNECT:
> /10.216.193.243:12700
> 2018-10-25T10:57:38.694+0200 [MessagingService-NettyOutbound-Thread-4-5] INFO
> i.n.u.internal.logging.Slf4JLogger:101 info [id: 0x7e87fc4e] CONNECT:
> /10.216.193.243:12700
> 2018-10-25T10:57:38.741+0200 [MessagingService-NettyOutbound-Thread-4-7] INFO
> i.n.u.internal.logging.Slf4JLogger:101 info [id: 0x39395296] CONNECT:
> /10.216.193.243:12700{noformat}
>
> With debug logging activated when starting the 4.0 node, I can see that it
> switches port for *.246 but not for *.243 and *.244.
> {noformat}
> >grep DEBUG system.log | grep OutboundMessagingConnection | grep maybeUpdateConnectionId
> 2018-10-25T13:12:56.095+0200 [ScheduledFastTasks:1] DEBUG
> o.a.c.n.a.OutboundMessagingConnection:314 maybeUpdateConnectionId changing
> connectionId to 10.216.193.246:12701 (GOSSIP), with a different port for
> secure communication, because peer version is 11
> 2018-10-25T13:12:58.100+0200 [ReadStage-1] DEBUG
> o.a.c.n.a.OutboundMessagingConnection:314 maybeUpdateConnectionId changing
> connectionId to 10.216.193.246:12701 (SMALL_MESSAGE), with a different port
> for secure communication, because peer version is 11
> 2018-10-25T13:13:05.764+0200 [main] DEBUG
> o.a.c.n.a.OutboundMessagingConnection:314 maybeUpdateConnectionId changing
> connectionId to 10.216.193.246:12701 (LARGE_MESSAGE), with a different port
> for secure communication, because peer version is 11
> {noformat}
>
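
The debug lines above say the connection is moved to port 12701 "because peer version is 11". Here is a minimal Java sketch of a port choice driven by the peer's messaging version. PortSelectionSketch, outboundPort, and the constant VERSION_40 = 12 are assumptions made for illustration and are not taken from OutboundMessagingConnection. The only facts used from the logs are the two port numbers and the peer version 11.

```java
// Hypothetical sketch; NOT the actual OutboundMessagingConnection logic.
class PortSelectionSketch {

    // Assumption: messaging version of a 4.0 node. The log only shows that the
    // 3.11.3 peer reports version 11.
    static final int VERSION_40 = 12;

    // Pick the port for an outbound connection. Peers older than 4.0 are reached
    // on the dedicated SSL port when encryption is enabled; otherwise the regular
    // storage port is used.
    static int outboundPort(int peerVersion, boolean encryptionEnabled,
                            int storagePort, int sslStoragePort) {
        if (encryptionEnabled && peerVersion < VERSION_40) {
            return sslStoragePort;
        }
        return storagePort;
    }

    public static void main(String[] args) {
        // Ports taken from the configuration quoted in the issue description.
        System.out.println(outboundPort(11, true, 12700, 12701));
        System.out.println(outboundPort(VERSION_40, true, 12700, 12701));
    }
}
```

In a model like this the switch only happens once the peer's version is known to be older than 4.0. The sketch does not claim to identify why the switch is missing for *.243 and *.244.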