[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201519#comment-17201519 ]
Ekaterina Dimitrova edited comment on CASSANDRA-16063 at 9/24/20, 1:28 PM:
---------------------------------------------------------------------------

A detailed code review showed that the error I was getting is unrelated. Empty segments should be skipped on startup. I recreated my test environment and managed to fix my issues with running the upgrade tests locally. No flag is needed.

This is how the solution works now. These are the four branches I worked on for this patch:
[C* 3.0|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.0] | [C* 3.11|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.11] | [trunk|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063] | [DTests|https://github.com/ekaterinadimitrova2/cassandra-dtest/tree/CASSANDRA-16063]

1) Check SSTables for the latest version before dropping compact storage; commits: [3.0|https://github.com/ekaterinadimitrova2/cassandra/commit/9ff9130808c751c9253bdecaa27c453bb5e7a71c] and [3.11|https://github.com/ekaterinadimitrova2/cassandra/commit/c0c43e90644b28b9b363fa7aba55adbf95dd5bd7]
2) Move compact storage [validation|https://github.com/ekaterinadimitrova2/cassandra/commit/1a8b3ea2823d8424e2018c686fb2d6e5d67270f7#diff-a5df240149285ae528cdd3c41aa59360R104] earlier in the startup process.
3) Two new upgrade tests were created and an old one was fixed [here|https://github.com/ekaterinadimitrova2/cassandra-dtest/commits/CASSANDRA-16063]

Trunk CI run: [Java 8|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/349/workflows/04bccc52-4e3e-41e2-9c04-93501ea4ce77] and [Java 11|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/349/workflows/04bccc52-4e3e-41e2-9c04-93501ea4ce77]
The CI runs did not show any newly introduced issues.
The two failing tests already have corresponding open tickets:
_test_tracing_does_not_interfere_with_digest_calculation - cql_tracing_test.TestCqlTracing - CASSANDRA-14157_
_testMessagePurging - org.apache.cassandra.net.ConnectionTest - CASSANDRA-15958_

Attached is the log of the upgrade tests passing successfully.

[~slebresne] do you have time to review it again? Or maybe [~adelapena] can help here?

> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-16063
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/CQL
>            Reporter: Sylvain Lebresne
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>             Fix For: 4.0-beta
>
>         Attachments: Compact_storage_upgrade_tests.txt
>
>
> The code to handle compact tables has been removed from 4.0, and the intended
> upgrade path to 4.0 for users having compact tables on 3.x is that they must
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or will miss a table)
> and may try upgrading despite still having compact tables. If they do so, the
> intent is that the node will _not_ start, with a message clearly indicating
> the pre-upgrade step the user has missed. The user will then downgrade the
> node(s) back to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and
> then upgrade again.
> But while 4.0 does currently fail startup with a decent message when it finds
> any compact tables, I believe the check is done too late during startup.
> Namely, that check is done as we read the table schemas, so within
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called
> {{SystemKeyspace.persistLocalMetadata()}} and
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log,
> and even possibly flush new {{na}} format sstables. As a result, a user
> might not be able to seamlessly restart the node on 3.x (to drop compact
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0
> with compact storage, you can downgrade back with no intervention whatsoever).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
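
The ordering requirement in the description (refuse to start before anything is written) can be illustrated with a small, self-contained Java sketch. It is not the code from the patch: {{CompactTableStartupCheckSketch}}, {{TableInfo}}, {{StartupException}} and the step names returned by {{start()}} are hypothetical stand-ins, and the rule used to recognise a compact table (flags not exactly "compound", with no "dense" or "super") is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only: stand-in types, not Cassandra's real classes.
final class CompactTableStartupCheckSketch {

    /** Minimal stand-in for a table definition read from the schema tables. */
    static final class TableInfo {
        final String keyspace;
        final String name;
        final Set<String> flags;

        TableInfo(String keyspace, String name, String... flags) {
            this.keyspace = keyspace;
            this.name = name;
            this.flags = new HashSet<>(Arrays.asList(flags));
        }

        // Assumption: a regular CQL table is "compound" and neither "dense" nor "super".
        boolean usesCompactStorage() {
            return !flags.contains("compound") || flags.contains("dense") || flags.contains("super");
        }
    }

    /** Hypothetical exception used to abort startup. */
    static final class StartupException extends Exception {
        StartupException(String message) {
            super(message);
        }
    }

    /** Returns the qualified names of tables that still use compact storage. */
    static List<String> findCompactTables(List<TableInfo> tables) {
        List<String> compact = new ArrayList<>();
        for (TableInfo table : tables) {
            if (table.usesCompactStorage()) {
                compact.add(table.keyspace + "." + table.name);
            }
        }
        return compact;
    }

    /** The check itself: read-only, throws if any compact table is found. */
    static void checkNoCompactTables(List<TableInfo> tables) throws StartupException {
        List<String> compact = findCompactTables(tables);
        if (!compact.isEmpty()) {
            throw new StartupException("Compact tables found: " + compact
                + ". Downgrade to 3.x and run ALTER ... DROP COMPACT STORAGE on them before upgrading.");
        }
    }

    /**
     * Startup order: the check runs first, so nothing that writes to the
     * commit log or flushes sstables is reached when the check fails.
     * The step names are placeholders for the write-producing phases.
     */
    static List<String> start(List<TableInfo> tables) throws StartupException {
        checkNoCompactTables(tables);
        List<String> steps = new ArrayList<>();
        steps.add("persistLocalMetadata");
        steps.add("migrateSystemKeyspace");
        steps.add("loadSchema");
        return steps;
    }
}
```

The point of the sketch is only the ordering: because {{checkNoCompactTables}} is read-only and runs before every write-producing step, a failed upgrade attempt leaves nothing behind that would stop the node from restarting on 3.x.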