[ 
https://issues.apache.org/jira/browse/CASSANDRA-19948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17884907#comment-17884907
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19948 at 9/26/24 9:29 AM:
------------------------------------------------------------------------

I agree with Bowen here: regardless of how cdc is set up on a node, it should 
be transparent CQL-wise. If we do not want to run cdc on a particular node, 
then the config in yaml would just prevent that from happening, even if cdc in 
the DDL is true.

We already use this pattern. For example, for autosnapshots we have this in 
ColumnFamilyStore:

{code}
    public boolean isAutoSnapshotEnabled()
    {
        return metadata().params.allowAutoSnapshot && DatabaseDescriptor.isAutoSnapshot();
    }
{code}

So both the value in the descriptor and the value in TableParams (the value 
from CQL) need to be set to true. If somebody sets it to true in CQL while 
some node has auto_snapshot set to false in its cassandra.yaml, then no 
autosnapshots are taken on that node.

I do not see why we cannot do the same for cdc.
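
To make the suggestion concrete, a minimal sketch of the same gate applied to 
cdc could look like the following. This is an illustration only: 
CdcGateSketch, NodeConfig, TableParams and isCdcEnabled are hypothetical 
stand-ins, not existing Cassandra classes or methods, and the real check would 
read the values from wherever the yaml setting and the table params live.

```java
// Hypothetical sketch: none of these names are Cassandra's actual API.
final class CdcGateSketch
{
    /** Stand-in for the per-node cassandra.yaml setting (cdc_enabled). */
    static final class NodeConfig
    {
        final boolean cdcEnabled;
        NodeConfig(boolean cdcEnabled) { this.cdcEnabled = cdcEnabled; }
    }

    /** Stand-in for the cluster-wide table property set via CQL (cdc=...). */
    static final class TableParams
    {
        final boolean cdc;
        TableParams(boolean cdc) { this.cdc = cdc; }
    }

    /** CDC takes effect on a node only if both the schema and that node's yaml allow it. */
    static boolean isCdcEnabled(TableParams params, NodeConfig node)
    {
        return params.cdc && node.cdcEnabled;
    }

    public static void main(String[] args)
    {
        TableParams altered = new TableParams(true);   // ALTER TABLE ... WITH cdc=true
        NodeConfig cdcNode = new NodeConfig(true);     // cdc_enabled: true
        NodeConfig plainNode = new NodeConfig(false);  // cdc_enabled: false

        // Both nodes hold the same schema value; only the effective behaviour differs.
        System.out.println("cdc node:   " + isCdcEnabled(altered, cdcNode));
        System.out.println("plain node: " + isCdcEnabled(altered, plainNode));
    }
}
```

The point of the sketch is that the table property is stored identically on 
every node, so the schema stays in agreement, while each node decides locally 
whether cdc actually runs.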

[~bernardo.botella] I do not think that is what we want. Setting it to any 
value in CQL should be fine, but we also need to check the descriptor to see 
whether cdc is actually enabled on the node. That way the schema does not 
diverge, and we handle it the same way we already do elsewhere.

Bowen's point about enabling CDC on one node and not on another seems to be a 
valid use case and I do not think that we want to fail in case it is not 
enabled on some node.


> Changing cdc table property can cause schema disagreement
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-19948
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19948
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Schema
>            Reporter: Bowen Song
>            Priority: Normal
>             Fix For: 4.1.x, 5.0.x, 5.x
>
>         Attachments: 4.1.1.txt, 4.1.6.txt, 5.0.0-corrected.txt, 
> cdc_schema_disagreement.sh
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the cassandra.yaml file, there is a parameter named "cdc_enabled" which 
> allows CDC to be enabled or disabled on each individual node.
> It has been found that it can cause schema disagreement or discrepancy when 
> an "ALTER TABLE ... WITH cdc=..." statement is run against a node which has 
> "cdc_enabled" set to "false" in a cluster in which nodes have mixed 
> "true"/"false" values for the "cdc_enabled" setting.
> The exact behaviour of the above is version-dependent.
> On Cassandra 4.1.1, the cluster will end up in the schema disagreement state. 
> A rolling restart will bring the schema back in sync, but the changes made to 
> the `cdc` table property will be lost. 
> On Cassandra 4.1.6, the cluster will not have visible schema disagreement in 
> the "nodetool describecluster" command's output, but the "ALTER TABLE" 
> statement only has a cosmetic effect on the node it is run on. The node with 
> "cdc_enabled" set to "false" will show the "cdc" table property has changed, 
> but this does not affect its behaviour in any way. At the same time, other 
> nodes do not see that table property change at all. This is perhaps even 
> worse than on 4.1.1, because the alter table statement is silently failing. 
> On Cassandra 5.0.0, the behaviour is the same as on 4.1.6.
> A shell script for reproducing the behaviours described above in Docker, and 
> its outputs on 4.1.1, 4.1.6 and 5.0.0, are attached.
>  
> Edit on 25 Sep: added test result on 5.0.0


