[ https://issues.apache.org/jira/browse/CASSANDRA-12236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384645#comment-15384645 ]
Joshua McKenzie commented on CASSANDRA-12236:
---------------------------------------------

Two ways to approach this have come up in offline discussions.

The first and less invasive method would be to suppress sending schema information about the cdc param status on versions >= 3.8 unless cdc_enabled: true is set in cassandra.yaml. Specifically, this means removing the addition of the cdc param [here|https://github.com/apache/cassandra/blob/cassandra-3.8/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L514] and instead adding it conditionally [here|https://github.com/apache/cassandra/blob/cassandra-3.8/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L476]. The upgrade path for users who want to enable CDC would be: don't enable CDC until your entire cluster is upgraded to a version >= 3.8, then enable it and bounce your cluster. Since schema information is hard-coded in SchemaKeyspace.java and we don't actually care about the value of that param if CDC is not enabled on the cluster, this seems a reasonable workaround until we get to versioned sub-systems in Cassandra.

The second and more invasive method (at least in terms of the number of versions it touches and its potential side-effects) would be to allow null columns during deserialization in [Columns.java|https://github.com/apache/cassandra/blob/cassandra-3.8/src/java/org/apache/cassandra/db/Columns.java#L433] if the mutation is for a schema table. This would apply to 3.0.x and 3.8+. It would get us back to functionality somewhat similar to pre-CASSANDRA-8099, in that mutations for schema tables on different versions would no longer interrupt inter-node communication via an RTE during an upgrade.

I'm by no means an expert on schema dissemination - [~iamaleksey] / [~slebresne]: do either of you have any feedback on the above two approaches, or other, better ideas on this front?

> RTE from new CDC column breaks in flight queries.
> -------------------------------------------------
>
>                 Key: CASSANDRA-12236
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12236
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jeremiah Jordan
>            Priority: Blocker
>             Fix For: 3.8
>
>
> This RTE is not harmless. It will cause the internode connection to break
> which will cause all in flight requests between these nodes to die/timeout.
> {noformat}
>    - Due to changes in schema migration handling and the storage format after 3.0, you will
>      see error messages such as:
>         "java.lang.RuntimeException: Unknown column cdc during deserialization"
>      in your system logs on a mixed-version cluster during upgrades. This error message
>      is harmless and due to the 3.8 nodes having cdc added to their schema tables while
>      the <3.8 nodes do not. This message should cease once all nodes are upgraded to 3.8.
>      As always, refrain from schema changes during cluster upgrades.
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
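
For illustration only, the two approaches proposed in the comment above could be sketched roughly as follows. This is a minimal, self-contained sketch and not the actual SchemaKeyspace.java or Columns.java code: the class SchemaParamSketch, its methods, and the map-based representation of table params and columns are all hypothetical, and "allow null columns" is interpreted here as skipping columns the receiving node does not know about.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and data structures are assumptions, not Cassandra APIs.
final class SchemaParamSketch {

    // Approach 1: only include the "cdc" param in outgoing schema information
    // when CDC is enabled in the local configuration (cdc_enabled: true).
    static Map<String, Object> tableParams(boolean cdcValue, boolean cdcEnabledInYaml) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("comment", "");
        params.put("gc_grace_seconds", 864000);
        if (cdcEnabledInYaml) {
            // Nodes older than 3.8 never see this column unless the operator opted in.
            params.put("cdc", cdcValue);
        }
        return params;
    }

    // Approach 2: when deserializing, tolerate columns the local node does not
    // know about if the mutation targets a schema table; otherwise fail as before.
    static Map<String, Object> deserialize(Map<String, Object> incoming,
                                           Set<String> knownColumns,
                                           boolean isSchemaTable) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : incoming.entrySet()) {
            if (knownColumns.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            } else if (!isSchemaTable) {
                throw new RuntimeException(
                    "Unknown column " + entry.getKey() + " during deserialization");
            }
            // Unknown column on a schema table: skipped rather than raising an RTE.
        }
        return result;
    }
}
```

With the first approach, tableParams(true, false) produces no cdc entry at all, so an older node never receives the unknown column. With the second, an older node that does receive cdc in a schema mutation drops it instead of throwing and breaking the internode connection.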