[ 
https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739427#comment-16739427
 ] 

Avraham Kalvo edited comment on CASSANDRA-14957 at 1/10/19 2:09 PM:
--------------------------------------------------------------------

To be clear, here's the timeline of the incident:

```
12:05:02 first node's state jumps to shutdown for restart
12:06:37 INFO  Initializing tasks_scheduler_external.tasks (first node)
12:06:39 WARN UnknownColumnFamilyException reading from socket; closing (first node)
...
12:09:15 only trace of the service migration running; it issued the following:
`CREATE KEYSPACE IF NOT EXISTS tasks_scheduler_external WITH replication =
{'class': 'SimpleStrategy', 'replication_factor': '3'};`
...
12:09:31 last node started after restart
```
Notice that there was *no attempt to create any table* throughout the restart, 
and the keyspace wasn't recreated either, since it already existed.

Hence, the new version of the table, as visible in the file system, *has 
nothing to do* with any explicit DDL run before, during, or after the 
rolling restart.
The schema (DDL) hasn't changed; the data was simply split into a new version 
on the filesystem, which eventually became the version the cluster agreed on 
once the rolling restart completed.
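
A split like this can be checked for on disk. The following is a minimal sketch, assuming the usual Cassandra 3.0 layout in which each table lives in a directory named `<table>-<table ID without dashes>` under `<data root>/<keyspace>/`. The function name `find_split_tables` is hypothetical, and the data root has to be supplied by the caller. It only reports tables that have more than one directory; it does not say which one the cluster currently uses.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed layout: <data_root>/<keyspace>/<table>-<32 hex chars>,
# where the hex suffix is the table ID (cfId) with the dashes removed.
TABLE_DIR = re.compile(r"^(?P<table>.+)-(?P<cfid>[0-9a-f]{32})$")


def find_split_tables(data_root):
    """Return {(keyspace, table): [cfid, ...]} for tables with more than one directory."""
    seen = defaultdict(list)
    for ks_dir in sorted(Path(data_root).iterdir()):
        if not ks_dir.is_dir():
            continue
        for tbl_dir in sorted(ks_dir.iterdir()):
            match = TABLE_DIR.match(tbl_dir.name)
            if tbl_dir.is_dir() and match:
                seen[(ks_dir.name, match.group("table"))].append(match.group("cfid"))
    # Keep only tables whose data is spread over several ID directories.
    return {key: ids for key, ids in seen.items() if len(ids) > 1}
```

Running `find_split_tables` against a node's data directory would list `("tasks_scheduler_external", "tasks")` if two directories exist for that table, one of them ending in `bd7200a0156711e88974855d74ee356f` (the cfId from the warning in the issue description).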

Thank you.
Avi.


> Rolling Restart Of Nodes Causes Dataloss Due To Schema Collision
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-14957
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14957
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Schema
>            Reporter: Avraham Kalvo
>            Priority: Major
>
> We were performing a rolling restart of a mission-critical five-node C* 
> cluster.
> The first node to be restarted logged the following messages in its 
> system.log:
> ```
> January 2nd 2019, 12:06:37.310 - INFO 12:06:35 Initializing tasks_scheduler_external.tasks
> ```
> ```
> WARN 12:06:39 UnknownColumnFamilyException reading from socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for cfId bd7200a0-1567-11e8-8974-855d74ee356f. If a table was just created, this is likely due to the schema not being fully propagated. Please wait for schema agreement on table creation.
> at org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1336) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:660) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:349) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:286) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:201) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92) ~[apache-cassandra-3.0.10.jar:3.0.10]
> ```
> The latter was then repeated several times across the cluster.
> It was then found that the table in question, 
> `tasks_scheduler_external.tasks`, had been created with a new schema version 
> at some point during the rolling restart of the cluster. Once schema 
> agreement settled, the new version became available and started taking 
> requests, leaving the previous version of the schema unavailable for any 
> request and thus causing data loss for our online system.
> The data was recovered by manually copying SSTables from the directory of 
> the previous schema version to the new one, followed by `nodetool refresh` 
> on the relevant table.
> The same thing happened for several tables across various keyspaces.
> One other thing to mention is that a repair was in progress on the first 
> node to be restarted; it was obviously stopped when the daemon was shut 
> down, but at first glance this doesn't seem to be related to the above.
> Seems somewhat related to:
> https://issues.apache.org/jira/browse/CASSANDRA-13559
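
The recovery described in the issue (copying SSTables from the old table directory into the new one, then running `nodetool refresh`) might be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names `copy_sstables` and `refresh` are hypothetical, both directory paths must be supplied by the operator, and the script refuses to copy anything if a file with the same name already exists in the target, since overwriting a live SSTable component would be worse than stopping.

```python
import shutil
import subprocess
from pathlib import Path


def copy_sstables(old_dir, new_dir):
    """Copy the files in old_dir into new_dir; abort before copying on any name clash."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    # Only plain files are considered; subdirectories such as snapshots are skipped.
    files = sorted(p for p in old_dir.iterdir() if p.is_file())
    clashes = [p.name for p in files if (new_dir / p.name).exists()]
    if clashes:
        raise FileExistsError(f"would overwrite: {clashes}")
    for src in files:
        shutil.copy2(src, new_dir / src.name)
    return [p.name for p in files]


def refresh(keyspace, table):
    """Ask the local node to load newly placed SSTables (runs `nodetool refresh`)."""
    subprocess.run(["nodetool", "refresh", keyspace, table], check=True)
```

After a successful `copy_sstables(old_dir, new_dir)`, calling `refresh("tasks_scheduler_external", "tasks")` loads the copied files on that node. `nodetool refresh` acts on the local node only, so the copy and refresh would need to be repeated on each affected node.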


