[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13428449#comment-13428449 ] Hudson commented on CASSANDRA-4436: --- Integrated in Cassandra #1861 (See [https://builds.apache.org/job/Cassandra/1861/]) Fix ScrubTest after file format change in CASSANDRA-4436 (Revision a075385d05c3e1d26475f448363958bad4645f17) Result = ABORTED yukim : Files : * test/data/corrupt-sstables/Keyspace1-Standard3-ia-1-Statistics.db Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Assignee: Sylvain Lebresne Fix For: 1.1.3 Attachments: 4436-1.0-2.txt, 4436-1.0-2.txt, 4436-1.0.txt, 4436-1.1-2.txt, 4436-1.1-2.txt, 4436-1.1.txt, increments.cql.gz Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423110#comment-13423110 ] Jonathan Ellis commented on CASSANDRA-4436: --- +1 Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Assignee: Sylvain Lebresne Fix For: 1.1.3 Attachments: 4436-1.0-2.txt, 4436-1.0-2.txt, 4436-1.0.txt, 4436-1.1-2.txt, 4436-1.1-2.txt, 4436-1.1.txt, increments.cql.gz Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13422532#comment-13422532 ] Jonathan Ellis commented on CASSANDRA-4436: --- bq. But we won't have the same ancestor multiple times I don't think that's true. Suppose for instance we have leveled compaction with A and B in L0. They are larger than 5MB so we split the result into X, Y, and Z. Next we flush C to L0. It overlaps with Y and Z, so we're compacting C, Y, and Z. Now we have Y and Z both with A and B as ancestors. (Switching from LCS back to STCS is another way you could get duplicate ancestors.) Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Assignee: Sylvain Lebresne Fix For: 1.1.3 Attachments: 4436-1.0-2.txt, 4436-1.0.txt, 4436-1.1-2.txt, 4436-1.1.txt, increments.cql.gz Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13420930#comment-13420930 ] Jonathan Ellis commented on CASSANDRA-4436: --- Looks like skipCompacted in Directories.SSTableLister can be removed (since we scrubDataDirectories on startup and no new compacted components will be created). Using a List means we can add an ancestor multiple times. Suggest using a Set instead. Nits: - would prefer Ancestor to LiveAncestor, since we only check liveness at creation time, so Live is misleading when iterating over them later. - the deleting code feels more at home in CFS constructor than addInitialSSTables. - tracker parameter is unused now in SSTR.open Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Assignee: Sylvain Lebresne Fix For: 1.1.3 Attachments: 4436-1.0.txt, 4436-1.1.txt, increments.cql.gz Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419031#comment-13419031 ] Peter Velas commented on CASSANDRA-4436: Thanks for your interest and time to fix it. We currently move to 1.1.2 version to avoid some random aws failure and patiently waiting for 1.1.3 release. Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Assignee: Sylvain Lebresne Fix For: 1.1.3 Attachments: 4436-1.0.txt, 4436-1.1.txt, increments.cql.gz Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13416080#comment-13416080 ] Sylvain Lebresne commented on CASSANDRA-4436: - The only difference I could see with the test I ran previously was the use of compression. So while I strongly doubt compression can have anything to do with that in any way, I rerun the test against 1.0 a bunch of time but I was still not able to reproduce any error. Since you seem to be able to reproduce easily, would you mind sharing the scripts you use to reproduce? I.e. mainly the code you use for insertion, preferably in plain thrift or CQL2 as this would eliminate the possibility of a client library bug. Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13416298#comment-13416298 ] Peter Velas commented on CASSANDRA-4436: You are right its not affected by compression. I was just curious if its problem with our python code using pycassa ... So I created increments.cql containing 100k lines with 1000 increments for each of 100 key values. {code} cassandra-cli -h $HOSTNAME -p 9160 -f increments.cql -B /dev/null {code} after 3 rolling restarts each value was correct with value 3000 after 4 rolling restart values are incorrect see bellow {code} col15479 col10 5507 col100 5531 col11 5480 col12 5501 col13 5499 col14 5516 {code} Its 2 node cluster with replication=2. {code} [root@cass-bug1 ~]# /opt/apache-cassandra-1.0.10/bin/cassandra-cli -h $HOSTNAME -p 9160 -f increments.cql -B /dev/null [root@cass-bug1 ~]# /opt/apache-cassandra-1.0.10/bin/nodetool -h $HOSTNAME drain [root@cass-bug2 ~]# /opt/apache-cassandra-1.0.10/bin/nodetool -h $HOSTNAME ring Address DC RackStatus State LoadOwns Token 85070591730234615865843651857942052864 10.20.30.160datacenter1 rack1 Down Normal 97.67 KB50.00% 0 10.20.30.161datacenter1 rack1 Up Normal 113.45 KB 50.00% 85070591730234615865843651857942052864 [root@cass-bug1 ~]# killall java [root@cass-bug1 ~]# /opt/apache-cassandra-1.0.10/bin/cassandra [root@cass-bug2 ~]# /opt/apache-cassandra-1.0.10/bin/nodetool -h $HOSTNAME drain [root@cass-bug1 ~]# /opt/apache-cassandra-1.0.10/bin/nodetool -h $HOSTNAME ring Address DC RackStatus State LoadOwns Token 85070591730234615865843651857942052864 10.20.30.160datacenter1 rack1 Up Normal 97.67 KB50.00% 0 10.20.30.161datacenter1 rack1 Down Normal 86.13 KB50.00% 85070591730234615865843651857942052864 [root@cass-bug2 ~]# killall java [root@cass-bug2 ~]# /opt/apache-cassandra-1.0.10/bin/cassandra {code} Here is dump of keyspace and CF {code} create keyspace inc_test with placement_strategy = 'SimpleStrategy' and strategy_options = {replication_factor : 2} and durable_writes = true; use inc_test; create column family cf1_increment with column_type = 'Standard' and comparator = 'BytesType' and default_validation_class = 'CounterColumnType' and key_validation_class = 'BytesType' and rows_cached = 0.0 and row_cache_save_period = 0 and row_cache_keys_to_save = 2147483647 and keys_cached = 20.0 and key_cache_save_period = 14400 and read_repair_chance = 1.0 and gc_grace = 864000 and min_compaction_threshold = 4 and max_compaction_threshold = 32 and replicate_on_write = true and row_cache_provider = 'SerializingCacheProvider' and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'; {code} Hope that helps you reproduce .. Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Attachments: increments.cql.gz Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414315#comment-13414315 ] Peter Velas commented on CASSANDRA-4436: create keyspace test_old with placement_strategy = 'SimpleStrategy' and strategy_options = {replication_factor : 2} and durable_writes = true; use test_old; create column family cf1_increment with column_type = 'Standard' and comparator = 'BytesType' and default_validation_class = 'CounterColumnType' and key_validation_class = 'BytesType' and read_repair_chance = 1.0 and dclocal_read_repair_chance = 0.0 and gc_grace = 864000 and min_compaction_threshold = 4 and max_compaction_threshold = 32 and replicate_on_write = true and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' and caching = 'KEYS_ONLY' and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'}; In version 1.0.10 am always able to reproduce with this steps.. but its not reproducible in 1.1.2 . When I stop writing and shutdown node with nodetool drain there are some small commitlog files, but I don't bother to delete them just restart cassandra process. Maybe this is case ? Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart
[ https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413881#comment-13413881 ] Sylvain Lebresne commented on CASSANDRA-4436: - Can you reproduce every time with those steps? I tried reproducing with those exact steps (as far as I can tell) a few times on both 1.0 and 1.1 (the counter code didn't change much between 1.0 and 1.1) and wasn't able to reproduce. Counters in columns don't preserve correct values after cluster restart --- Key: CASSANDRA-4436 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.10 Reporter: Peter Velas Similar to #3821. but affecting normal columns. Set up a 2-node cluster with rf=2. 1. Create a counter column family and increment a 100 keys in loop 5000 times. 2. Then make a rolling restart to cluster. 3. Again increment another 5000 times. 4. Make a rolling restart to cluster. 5. Again increment another 5000 times. 6. Make a rolling restart to cluster. After step 6 we were able to reproduce bug with bad counter values. Expected values were 15 000. Values returned from cluster are higher then 15000 + some random number. Rolling restarts are done with nodetool drain. Always waiting until second node discover its down then kill java process. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira