[ 
https://issues.apache.org/jira/browse/CASSANDRA-12857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15621666#comment-15621666
 ] 

Sylvain Lebresne commented on CASSANDRA-12857:
----------------------------------------------

Something went wrong when upgrading the schema, but it will be hard to track it 
down unless we have your schema (or at least some version of your schema that 
reproduces the issue).
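
For reference, a plain schema dump would be enough; something along these 
lines (the node address and output file name are only placeholders):

{code:bash}
# dump the full CQL schema from any node into a file we can use to reproduce
cqlsh <node-address> -e "DESCRIBE SCHEMA" > schema.cql
{code}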

> Upgrade procedure between 2.1.x and 3.0.x is broken
> ---------------------------------------------------
>
>                 Key: CASSANDRA-12857
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12857
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alexander Yasnogor
>            Priority: Critical
>
> It is not possible to safely do an in-place Cassandra upgrade from 2.1.14 to 
> 3.0.9.
> Distribution: deb packages from the DataStax community repo.
> The upgrade was performed according to the procedure in this document: 
> https://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgrdCassandraDetails.html
> Potential reason: the upgrade procedure creates a corrupted system_schema 
> keyspace, which then gets propagated across the cluster and kills it.
> We started with one datacenter containing 19 nodes divided into two racks.
> The first rack was successfully upgraded, and nodetool describecluster 
> reported two schema versions: one for the upgraded nodes, another for the 
> non-upgraded nodes.
> On starting the new version on the first node from the second rack:
> {code:java}
> INFO  [main] 2016-10-25 13:06:12,103 LegacySchemaMigrator.java:87 - Moving 11 
> keyspaces from legacy schema tables to the new schema keyspace (system_schema)
> INFO  [main] 2016-10-25 13:06:12,104 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@7505e6ac
> INFO  [main] 2016-10-25 13:06:12,200 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@64414574
> INFO  [main] 2016-10-25 13:06:12,204 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@3f2c5f45
> INFO  [main] 2016-10-25 13:06:12,207 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@2bc2d64d
> INFO  [main] 2016-10-25 13:06:12,301 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@77343846
> INFO  [main] 2016-10-25 13:06:12,305 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@19b0b931
> INFO  [main] 2016-10-25 13:06:12,308 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@44bb0b35
> INFO  [main] 2016-10-25 13:06:12,311 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@79f6cd51
> INFO  [main] 2016-10-25 13:06:12,319 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@2fcd363b
> INFO  [main] 2016-10-25 13:06:12,356 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@609eead6
> INFO  [main] 2016-10-25 13:06:12,358 LegacySchemaMigrator.java:148 - 
> Migrating keyspace 
> org.apache.cassandra.schema.LegacySchemaMigrator$Keyspace@7eb7f5d0
> INFO  [main] 2016-10-25 13:06:13,958 LegacySchemaMigrator.java:97 - 
> Truncating legacy schema tables
> INFO  [main] 2016-10-25 13:06:26,474 LegacySchemaMigrator.java:103 - 
> Completed migration of legacy schema tables
> INFO  [main] 2016-10-25 13:06:26,474 StorageService.java:521 - Populating 
> token metadata from system tables
> INFO  [main] 2016-10-25 13:06:26,796 StorageService.java:528 - Token 
> metadata: Normal Tokens: [HUGE LIST of tokens]
> INFO  [main] 2016-10-25 13:06:29,066 ColumnFamilyStore.java:389 - 
> Initializing ...
> INFO  [main] 2016-10-25 13:06:29,066 ColumnFamilyStore.java:389 - 
> Initializing ...
> INFO  [main] 2016-10-25 13:06:45,894 AutoSavingCache.java:165 - Completed 
> loading (2 ms; 460 keys) KeyCache cache
> INFO  [main] 2016-10-25 13:06:46,982 StorageService.java:521 - Populating 
> token metadata from system tables
> INFO  [main] 2016-10-25 13:06:47,394 StorageService.java:528 - Token 
> metadata: Normal Tokens:[HUGE LIST of tokens]
> INFO  [main] 2016-10-25 13:06:47,420 LegacyHintsMigrator.java:88 - Migrating 
> legacy hints to new storage
> INFO  [main] 2016-10-25 13:06:47,420 LegacyHintsMigrator.java:91 - Forcing a 
> major compaction of system.hints table
> INFO  [main] 2016-10-25 13:06:50,587 LegacyHintsMigrator.java:95 - Writing 
> legacy hints to the new storage
> INFO  [main] 2016-10-25 13:06:53,927 LegacyHintsMigrator.java:99 - Truncating 
> system.hints table
> ....
> INFO  [main] 2016-10-25 13:06:56,572 MigrationManager.java:342 - Create new 
> table: 
> org.apache.cassandra.config.CFMetaData@242e5306[cfId=c5e99f16-8677-3914-b17e-960613512345,ksName=system_traces,cfName=sessions,flags=[COMPOUND],params=TableParams{comment=tracing
>  sessions, read_repair_chance=0.0, dclocal_read_repair_chance=0.0, 
> bloom_filter_fp_chance=0.01, crc_check_chance=1.0, gc_grace_seconds=0, 
> default_time_to_live=0, memtable_flush_period_in_ms=3600000, 
> min_index_interval=128, max_index_interval=2048, 
> speculative_retry=99PERCENTILE, caching={'keys' : 'ALL', 'rows_per_partition' 
> : 'NONE'}, 
> compaction=CompactionParams{class=org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,
>  options={min_threshold=4, max_threshold=32}}, 
> compression=org.apache.cassandra.schema.CompressionParams@3fa913a4, 
> extensions={}},comparator=comparator(),partitionColumns=[[] | [client command 
> coordinator duration request started_at 
> parameters]],partitionKeyColumns=[ColumnDefinition{name=session_id, 
> type=org.apache.cassandra.db.marshal.UUIDType, kind=PARTITION_KEY, 
> position=0}],clusteringColumns=[],keyValidator=org.apache.cassandra.db.marshal.UUIDType,columnMetadata=[ColumnDefinition{name=client,
>  type=org.apache.cassandra.db.marshal.InetAddressType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=command, 
> type=org.apache.cassandra.db.marshal.UTF8Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=session_id, 
> type=org.apache.cassandra.db.marshal.UUIDType, kind=PARTITION_KEY, 
> position=0}, ColumnDefinition{name=coordinator, 
> type=org.apache.cassandra.db.marshal.InetAddressType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=request, 
> type=org.apache.cassandra.db.marshal.UTF8Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=started_at, 
> type=org.apache.cassandra.db.marshal.TimestampType, kind=REGULAR, 
> position=-1}, ColumnDefinition{name=duration, 
> type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, position=-1}, 
> ColumnDefinition{name=parameters, 
> type=org.apache.cassandra.db.marshal.MapType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type),
>  kind=REGULAR, position=-1}],droppedColumns={},triggers=[],indexes=[]]
> INFO  [GossipStage:1] 2016-10-25 13:06:57,121 StorageService.java:1969 - Node 
> /10.41.100.31 state jump to NORMAL
> INFO  [GossipStage:1] 2016-10-25 13:06:57,127 TokenMetadata.java:479 - 
> Updating topology for /10.41.100.31
> INFO  [GossipStage:1] 2016-10-25 13:06:57,127 TokenMetadata.java:479 - 
> Updating topology for /10.41.100.31
> INFO  [HANDSHAKE-/10.11.100.19] 2016-10-25 13:06:57,128 
> OutboundTcpConnection.java:515 - Handshaking version with /10.11.100.19
> .....
> INFO  [main] 2016-10-25 13:07:02,773 MigrationManager.java:342 - Create new 
> table: ……………
> INFO  [main] 2016-10-25 13:07:04,136 MigrationManager.java:302 - Create new 
> Keyspace: KeyspaceMetadata
> {code}
> But then all upgraded nodes repeatedly reported the same error:
> {code:java}
> ERROR [InternalResponseStage:12] 2016-10-25 13:07:26,891 
> MigrationTask.java:96 - Configuration exception merging remote schema 
> org.apache.cassandra.exceptions.ConfigurationException: Column family 
> comparators do not match or are not compatible (found 
> comparator(org.apache.cassandra.db.marshal.UTF8Type, org.apac......
>         at 
> org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:787)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:740) 
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at org.apache.cassandra.config.Schema.updateTable(Schema.java:661) 
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1346)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1302)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1252)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:92) 
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53)
>  [apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> [apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_101]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_101]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_101]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_101]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
> {code}
> nodetool describecluster reported 4 different schema versions (the check is 
> sketched after this list):
>       1. All nodes on old version
>       2. 7 migrated nodes from the first rack
>       3. 2 migrated nodes from the first rack
>       4. 1 node from the second rack
>       
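> Schema agreement was checked with plain nodetool, roughly as below (the 
> output shape in the comments is abbreviated and illustrative):
> {code:bash}
> nodetool describecluster
> # The "Schema versions" section lists each schema version UUID together with
> # the nodes currently reporting it, e.g.:
> #   Schema versions:
> #     <uuid-1>: [10.41.x.x, ...]
> #     <uuid-2>: [10.11.x.x, ...]
> {code}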
> Meanwhile the cluster remained fully responsive for reads and writes.
> The migration was stopped at this point, and further investigation showed 
> corrupted records in system_schema.tables; system_schema.columns contained 
> duplicated, broken records with \x00 instead of letters.
> {code:java}
> dc1_tenant_ssd |                         \x00\x00\x00\x00\x00\x00 |           
>        0.001 | {'keys': 'ALL', 'rows_per_partition': 'ALL'} |                
> UP (on SSD) | {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} | 
> {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'} |                1 |        
>                   0 |                    0 |           {} | {'compound'} |    
>        864000 | 0ae08450-80b9-11e6-8bf1-0df6cc57511a |               2048 |   
>                         0 |                128 |                  0 |      
> 99PERCENTILE
> dc1_tenant_ssd | \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 |           
>        0.001 | {'keys': 'ALL', 'rows_per_partition': 'ALL'} | UP (old CF 
> format, on SSD) | {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} | 
> {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'} |                1 |        
>                   0 |                    0 |           {} |    {'dense'} |    
>        864000 | 16c420b0-78fd-11e6-ae98-ff8f609f3a2d |               2048 |   
>                         0 |                128 |                  0 |      
> 99PERCENTILE
> dc1_tenant_ssd |                         \x00\x00\x00\x00\x00\x00 |           
>        0.001 | {'keys': 'ALL', 'rows_per_partition': 'ALL'} |                
> UT (on SSD) | {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} | 
> {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'} |                1 |        
>                   0 |                    0 |           {} |    {'dense'} |    
>        864000 | c38bce70-78fc-11e6-ae98-ff8f609f3a2d |               2048 |   
>                         0 |                128 |                  0 |      
> 99PERCENTILE
> dc1_tenant_ssd |                                           user_p |           
>        0.001 | {'keys': 'ALL', 'rows_per_partition': 'ALL'} |                
> UP (on SSD) | {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} | 
> {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'} |                1 |        
>                   0 |                    0 |           {} | {'compound'} |    
>        864000 | 0ae08450-80b9-11e6-8bf1-0df6cc57511a |               2048 |   
>                         0 |                128 |                  0 |      
> 99PERCENTILE
> dc1_tenant_ssd |                                     user_p_oldcf |           
>        0.001 | {'keys': 'ALL', 'rows_per_partition': 'ALL'} | UP (old CF 
> format, on SSD) | {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} | 
> {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'} |                1 |        
>                   0 |                    0 |           {} |    {'dense'} |    
>        864000 | 16c420b0-78fd-11e6-ae98-ff8f609f3a2d |               2048 |   
>                         0 |                128 |                  0 |      
> 99PERCENTILE
> dc1_tenant_ssd |                                           user_t |           
>        0.001 | {'keys': 'ALL', 'rows_per_partition': 'ALL'} |                
> UT (on SSD) | {'class': 
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} | 
> {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'} |                1 |        
>                   0 |                    0 |           {} |    {'dense'} |    
>        864000 | c38bce70-78fc-11e6-ae98-ff8f609f3a2d |               2048 |   
>                         0 |                128 |                  0 |      
> 99PERCENTILE
> {code}
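> For reference, rows like the ones above can be pulled straight from the 
> schema tables with queries along these lines (the column selection is 
> illustrative, not the exact one used):
> {code:bash}
> # inspect table and column definitions for the affected keyspace
> cqlsh -e "SELECT keyspace_name, table_name, id, flags FROM system_schema.tables WHERE keyspace_name = 'dc1_tenant_ssd';"
> cqlsh -e "SELECT keyspace_name, table_name, column_name, type FROM system_schema.columns WHERE keyspace_name = 'dc1_tenant_ssd';"
> {code}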
> Based on sstabledump output, it is clear that system_schema was corrupted on 
> every node.
> The strange thing is that a single node had been upgraded one day before the 
> rest of the cluster, and system_schema was explicitly checked and found to be 
> OK before rolling out the upgrade to the other nodes.
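> The sstabledump check mentioned above was roughly the following (the paths 
> assume the default data directory layout; adjust for your installation):
> {code:bash}
> # dump every sstable of system_schema.tables on the affected node
> for f in /var/lib/cassandra/data/system_schema/tables-*/m*-big-Data.db; do
>     sstabledump "$f"
> done
> {code}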
> Later, the upgraded nodes refused to restart due to the duplicates in 
> system_schema.tables, failing with this exception:
> {code:java}
> java.lang.IllegalStateException: One row required, 2 found
>         at 
> org.apache.cassandra.cql3.UntypedResultSet$FromResultSet.one(UntypedResultSet.java:84)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:938)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:928)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:891)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:868)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:856)
>  ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:136) 
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:126) 
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:239) 
> [apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:568)
>  [apache-cassandra-3.0.9.jar:3.0.9]
>         at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:696) 
> [apache-cassandra-3.0.9.jar:3.0.9]
> {code}
> I am quite confident that this is not a hardware problem; we have tried to 
> perform the upgrade twice so far, with the same results.
> This very unfortunate migration scenario did not affect just one node: it 
> brought the cluster to an unusable state with no way back. Decommission did 
> not work between the different versions, and scrub removed all data from the 
> system_schema tables.
> We ended up exporting the data and removing the upgraded nodes, together 
> with their data, from the cluster.
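> (For reference, one way to do such a per-table export is a plain CQL COPY; 
> the table name below is only an example:)
> {code:bash}
> # export one table to CSV; repeat per table that needs to be preserved
> cqlsh -e "COPY dc1_tenant_ssd.user_p TO 'user_p.csv' WITH HEADER = true;"
> {code}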



