[ https://issues.apache.org/jira/browse/CASSANDRA-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joshua McKenzie updated CASSANDRA-13125:
----------------------------------------
    Assignee: Yasuharu Goto

> Duplicate rows after upgrading from 2.1.16 to 3.0.10/3.9
> --------------------------------------------------------
>
>                 Key: CASSANDRA-13125
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13125
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Zhongxiang Zheng
>            Assignee: Yasuharu Goto
>         Attachments: diff-a.patch, diff-b.patch
>
>
> I found that rows are split and duplicated after upgrading the cluster
> from 2.1.x to 3.0.x.
> The problem can be reproduced as follows.
> {code}
> $ ccm create test -v 2.1.16 -n 3 -s
>
> Current cluster is now: test
> $ ccm node1 cqlsh -e "CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor':3}"
> $ ccm node1 cqlsh -e "CREATE TABLE test.test (id text PRIMARY KEY, value1 set<text>, value2 set<text>);"
> # Upgrade node1
> $ for i in 1; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start; ccm node${i} nodetool upgradesstables; done
> # Insert a row through node1 (3.0.10)
> $ ccm node1 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('aaa', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> # Insert a row through node2 (2.1.16)
> $ ccm node2 cqlsh -e "INSERT INTO test.test (id, value1, value2) values ('bbb', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> # The row inserted from node1 is split
> $ ccm node1 cqlsh -e "SELECT * FROM test.test ;"
>  id  | value1         | value2
> -----+----------------+----------------
>  aaa |           null |           null
>  aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
>  bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> $ for i in 1 2; do ccm node${i} nodetool flush; done
> # Results of sstable2json on node2. The row inserted from node1 (3.0.10) is
> # different from the row inserted from node2 (2.1.16).
> $ ccm node2 json -k test -c test
> running
> ['/home/zzheng/.ccm/test/node2/data0/test/test-5406ee80dbdb11e6a175f57c4c7c85f3/test-test-ka-1-Data.db']
> -- test-test-ka-1-Data.db -----
> [
> {"key": "aaa",
>  "cells": [["","",1484564624769577],
>            ["value1","value2:!",1484564624769576,"t",1484564624],
>            ["value1:616161","",1484564624769577],
>            ["value1:626262","",1484564624769577],
>            ["value2:636363","",1484564624769577],
>            ["value2:646464","",1484564624769577]]},
> {"key": "bbb",
>  "cells": [["","",1484564634508029],
>            ["value1:_","value1:!",1484564634508028,"t",1484564634],
>            ["value1:616161","",1484564634508029],
>            ["value1:626262","",1484564634508029],
>            ["value2:_","value2:!",1484564634508028,"t",1484564634],
>            ["value2:636363","",1484564634508029],
>            ["value2:646464","",1484564634508029]]}
> ]
> # Upgrade node2 and node3
> $ for i in `seq 2 3`; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start; ccm node${i} nodetool upgradesstables; done
> # After upgrading node2 and node3, the row inserted from node1 is split on node2 and node3
> $ ccm node2 cqlsh -e "SELECT * FROM test.test ;"
>
>  id  | value1         | value2
> -----+----------------+----------------
>  aaa |           null |           null
>  aaa | {'aaa', 'bbb'} | {'ccc', 'ddd'}
>  bbb | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> (3 rows)
> # Results of sstabledump
> # node1
> [
>   {
>     "partition" : {
>       "key" : [ "aaa" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 17,
>         "liveness_info" : { "tstamp" : "2017-01-16T11:03:44.769577Z" },
>         "cells" : [
>           { "name" : "value1", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" } },
>           { "name" : "value1", "path" : [ "aaa" ], "value" : "" },
>           { "name" : "value1", "path" : [ "bbb" ], "value" : "" },
>           { "name" : "value2", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" } },
>           { "name" : "value2", "path" : [ "ccc" ], "value" : "" },
>           { "name" : "value2", "path" : [ "ddd" ], "value" : "" }
>         ]
>       }
>     ]
>   },
>   {
>     "partition" : {
>       "key" : [ "bbb" ],
>       "position" : 48
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 65,
>         "liveness_info" : { "tstamp" : "2017-01-16T11:03:54.508029Z" },
>         "cells" : [
>           { "name" : "value1", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:54.508028Z", "local_delete_time" : "2017-01-16T11:03:54Z" } },
>           { "name" : "value1", "path" : [ "aaa" ], "value" : "" },
>           { "name" : "value1", "path" : [ "bbb" ], "value" : "" },
>           { "name" : "value2", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:54.508028Z", "local_delete_time" : "2017-01-16T11:03:54Z" } },
>           { "name" : "value2", "path" : [ "ccc" ], "value" : "" },
>           { "name" : "value2", "path" : [ "ddd" ], "value" : "" }
>         ]
>       }
>     ]
>   }
> ]
>
> # node2
> [
>   {
>     "partition" : {
>       "key" : [ "aaa" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 17,
>         "liveness_info" : { "tstamp" : "2017-01-16T11:03:44.769577Z" },
>         "cells" : [ ]
>       },
>       {
>         "type" : "row",
>         "position" : 22,
>         "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:44.769576Z", "local_delete_time" : "2017-01-16T11:03:44Z" },
>         "cells" : [
>           { "name" : "value1", "path" : [ "aaa" ], "value" : "", "tstamp" : "2017-01-16T11:03:44.769577Z" },
>           { "name" : "value1", "path" : [ "bbb" ], "value" : "", "tstamp" : "2017-01-16T11:03:44.769577Z" },
>           { "name" : "value2", "path" : [ "ccc" ], "value" : "", "tstamp" : "2017-01-16T11:03:44.769577Z" },
>           { "name" : "value2", "path" : [ "ddd" ], "value" : "", "tstamp" : "2017-01-16T11:03:44.769577Z" }
>         ]
>       }
>     ]
>   },
>   {
>     "partition" : {
>       "key" : [ "bbb" ],
>       "position" : 57
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 74,
>         "liveness_info" : { "tstamp" : "2017-01-16T11:03:54.508029Z" },
>         "cells" : [
>           { "name" : "value1", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:54.508028Z", "local_delete_time" : "2017-01-16T11:03:54Z" } },
>           { "name" : "value1", "path" : [ "aaa" ], "value" : "" },
>           { "name" : "value1", "path" : [ "bbb" ], "value" : "" },
>           { "name" : "value2", "deletion_info" : { "marked_deleted" : "2017-01-16T11:03:54.508028Z", "local_delete_time" : "2017-01-16T11:03:54Z" } },
>           { "name" : "value2", "path" : [ "ccc" ], "value" : "" },
>           { "name" : "value2", "path" : [ "ddd" ], "value" : "" }
>         ]
>       }
>     ]
>   }
> ]
> {code}
> Another example of row splitting is as follows.
> {code}
> $ ccm create test2 -v 2.1.16 -n 3 -s
>
> Current cluster is now: test2
> $ ccm node1 cqlsh -e "CREATE KEYSPACE test WITH replication = {'class':'SimpleStrategy', 'replication_factor':3}"
> $ ccm node1 cqlsh -e "CREATE TABLE test.text_set_set (id text PRIMARY KEY, value1 text, value2 set<text>, value3 set<text>);"
> $ for i in `seq 1`; do ccm node${i} stop; ccm node${i} setdir -v3.0.10; ccm node${i} start; ccm node${i} nodetool upgradesstables; done
> $ ccm node1 cqlsh -e "INSERT INTO test.text_set_set (id, value1, value2, value3) values ('aaa', 'aaa', {'aaa', 'bbb'}, {'ccc', 'ddd'});"
> $ ccm node1 cqlsh -e "SELECT * FROM test.text_set_set;"
>
>  id  | value1 | value2         | value3
> -----+--------+----------------+----------------
>  aaa |    aaa |           null |           null
>  aaa |   null | {'aaa', 'bbb'} | {'ccc', 'ddd'}
> (2 rows)
> {code}
> As far as I have investigated, the problem occurs under the following
> conditions.
> * The table schema contains multiple collection columns.
> * A row whose collection columns are not null is inserted through a 3.x
> node while both 2.1 and 3.x nodes exist in the cluster.
> * The row is then split in the sstables of any node that was still running
> 2.1 when the row was inserted, once that node is upgraded to 3.x.
> Thanks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
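
The symptom reported above is that a single primary key comes back as more than one row from a SELECT. As a rough illustration only, the following Python sketch flags such keys in a result set that has already been fetched. The helper name find_split_rows and the list-of-dicts row representation are assumptions made for this example; they are not part of Cassandra, cqlsh, or any driver API, and the sketch only detects the symptom without repairing anything.

```python
from collections import defaultdict


def find_split_rows(rows, key="id"):
    """Group rows by primary key and keep only keys that occur more than once.

    `rows` is assumed to be an iterable of dicts, one per result row
    (a hypothetical representation chosen for this sketch).
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return {k: v for k, v in grouped.items() if len(v) > 1}


# Rows as shown in the SELECT output on test.test in the report
# (None stands in for cqlsh's "null").
rows = [
    {"id": "aaa", "value1": None, "value2": None},
    {"id": "aaa", "value1": {"aaa", "bbb"}, "value2": {"ccc", "ddd"}},
    {"id": "bbb", "value1": {"aaa", "bbb"}, "value2": {"ccc", "ddd"}},
]

split = find_split_rows(rows)
for pk, duplicates in sorted(split.items()):
    print(f"primary key {pk!r} returned {len(duplicates)} rows")
```

Because id is the whole primary key in both example tables, any key returned here indicates a split row; for a table with clustering columns the grouping key would need to include them.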