[ https://issues.apache.org/jira/browse/CASSANDRA-15298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921661#comment-16921661 ]
Charlemange Lasse commented on CASSANDRA-15298:
-----------------------------------------------

How should sstablescrub know about deleted columns? It doesn't have the schema (and no information about what is dropped and what is not), right? And doing a "nodetool scrub" before taking the snapshot doesn't remove the columns, which means that the restore will also not work with that snapshot on the other node. And I am not the only person running into this problem, so I would say that the documentation is misleading. [~cassio rossi], [~jjordan], ...

> Cassandra node cannot be restored using documented backup method
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-15298
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15298
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Charlemange Lasse
>            Priority: Normal
>
> I have a single Cassandra 3.11.4 node. It contains various tables and UDFs.
> The [documentation describes a method to back up this node|https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/operations/opsBackupTakesSnapshot.html]:
> * use "DESCRIBE SCHEMA" in cqlsh to get the schema
> * create a snapshot using nodetool
> * copy the snapshot + schema to a new (completely disconnected) node
> * load the schema into the new node
> * load the sstables again using nodetool
>
> But this is a completely bogus method. It will result in errors like:
> {noformat}
> java.lang.RuntimeException: Unknown column deleted_column during deserialization
> {noformat}
> And all data in this column is now lost.
> The problem is that the "DESCRIBE SCHEMA" CQL output doesn't correctly add already dropped (but still existing) columns to the schema.
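> For reference, the documented backup/restore procedure boils down to roughly the following commands (a sketch only; the keyspace, table, snapshot tag, and paths are illustrative, and the snapshot directory layout can differ between Cassandra versions):
> {noformat}
> # On the source node: dump the schema and take a snapshot
> cqlsh -e "DESCRIBE SCHEMA" > schema.cql
> nodetool snapshot -t mybackup mykeyspace
>
> # Copy schema.cql and the snapshot directory
> # (e.g. data/mykeyspace/testcf-<id>/snapshots/mybackup/) to the new node.
>
> # On the new node: recreate the schema, place the sstables into the
> # table's data directory, then pick them up
> cqlsh -f schema.cql
> nodetool refresh mykeyspace testcf
> {noformat}
> It is the last step that fails with "Unknown column ... during deserialization", because the schema recreated from DESCRIBE SCHEMA knows nothing about the dropped column that the sstables still contain.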
> The DESCRIBE SCHEMA output looks, for example, like:
> {noformat}
> CREATE TABLE mykeyspace.testcf (
>     primary_uuid uuid,
>     secondary_uuid uuid,
>     name text,
>     PRIMARY KEY (primary_uuid, secondary_uuid)
> ) WITH CLUSTERING ORDER BY (secondary_uuid ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
> {noformat}
> But it must actually look like:
> {noformat}
> CREATE TABLE IF NOT EXISTS mykeyspace.testcf (
>     primary_uuid uuid,
>     secondary_uuid uuid,
>     name text,
>     deleted_column boolean,
>     PRIMARY KEY (primary_uuid, secondary_uuid))
>     WITH ID = a1afdd4d-b61e-4f2a-b806-57c296be3948
>     AND CLUSTERING ORDER BY (secondary_uuid ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND dclocal_read_repair_chance = 0.1
>     AND crc_check_chance = 1.0
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND min_index_interval = 128
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE'
>     AND comment = ''
>     AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
>     AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
>     AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }
>     AND cdc = false
>     AND extensions = { };
>
> ALTER TABLE mykeyspace.testcf DROP deleted_column
> USING TIMESTAMP 1563978151561000;
> {noformat}
> This was taken from the snapshot's (column-family-specific) schema.cql, which of course is not compatible with the main schema: it only creates the tables when they don't exist (which they do, because the main "DESCRIBE SCHEMA" file already creates them), and it is missing all the other kinds of objects, like UDFs.
>
> It is currently not possible (using the built-in mechanisms of Cassandra 3.11.4) to migrate a keyspace from one separated server to another separated server.
>
> This behavior also breaks various backup systems which try to store Cassandra cluster information to offline storage.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org