[ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209591#comment-17209591
 ] 

Sylvain Lebresne commented on CASSANDRA-16063:
----------------------------------------------

bq. Before I start any implementation, I decided first to update the ticket and 
confirm the approach.

CASSANDRA-15897 is open to address _exactly_ that problem. I suggest we 
commit the fix on this ticket as is and leave the issue of cluster-wide 
detection to CASSANDRA-15897. We did discuss options there some time ago, and 
kind of settled on a Gossip-based approach at the time, so 
[~brandon.williams] is not going to be happy. I do have an almost-ready branch 
for that Gossip approach, btw (which, I won't deny, is a bit involved), and 
while I don't have time to get it to the finish line right now, I can share 
my branch (tomorrow most likely) and you can decide whether to use it or not.

bq. how do we handle nodes that are down?

Fwiw, my existing branch for CASSANDRA-15897 makes nodes share the sstable 
versions they have in use. If a node is down, other nodes simply rely on the 
last information they got from that node, which should work pretty well in 
practice.
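
To make the "last information" idea concrete, here is a minimal sketch. It is 
not taken from the CASSANDRA-15897 branch: the class and method names are 
hypothetical, nodes are identified by plain strings, and it assumes sstable 
versions are short strings that sort lexicographically (e.g. {{md}} before 
{{na}}).

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class LastKnownSSTableVersions
{
    // Last set of sstable versions each node advertised, keyed by node id.
    private final Map<String, Set<String>> lastKnown = new ConcurrentHashMap<>();

    // Called whenever a node advertises the sstable versions it has in use.
    void onVersionsReceived(String node, Set<String> versionsInUse)
    {
        lastKnown.put(node, Collections.unmodifiableSet(new HashSet<>(versionsInUse)));
    }

    // Node liveness is deliberately ignored: for a node that is down, the
    // last set it advertised is still what gets consulted.
    // Returns true when no versions are known at all (vacuous truth).
    boolean allNodesAtLeast(String minimumVersion)
    {
        return lastKnown.values().stream()
                        .flatMap(Set::stream)
                        .allMatch(v -> v.compareTo(minimumVersion) >= 0);
    }
}
```

In this sketch, a node that last advertised {{md}} keeps 
{{allNodesAtLeast("na")}} false until it comes back and advertises only 
{{na}}, whether or not it is currently reachable.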

> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-16063
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/CQL
>            Reporter: Sylvain Lebresne
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>             Fix For: 4.0-beta
>
>         Attachments: Compact_storage_upgrade_tests.txt
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade the 
> node(s) back to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).
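
The ordering requested in the description can be sketched as follows. This is 
an illustration under stated assumptions, not Cassandra's startup code: the 
class is hypothetical, the list of compact tables is passed in rather than 
discovered, and the three steps named in the description are modelled as 
recorded strings so that the ordering is observable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Cassandra's startup code.
final class StartupOrderSketch
{
    // Records the state-changing steps that ran, in order.
    final List<String> sideEffects = new ArrayList<>();

    void start(List<String> compactTables)
    {
        // 1. Read-only check first: fail before anything is written.
        if (!compactTables.isEmpty())
            throw new IllegalStateException(
                "Compact tables found: " + compactTables
                + "; run ALTER ... DROP COMPACT STORAGE on 3.x before upgrading");

        // 2. Only then run the steps that write to the commit log or disk.
        sideEffects.add("persistLocalMetadata");
        sideEffects.add("migrate");
        sideEffects.add("loadFromDisk");
    }
}
```

With the check placed first, a failed start leaves {{sideEffects}} empty, 
which is the property that lets the node be restarted on 3.x with no 
intervention.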


