[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428449#comment-13428449
 ] 

Hudson commented on CASSANDRA-4436:
---

Integrated in Cassandra #1861 (See 
[https://builds.apache.org/job/Cassandra/1861/])
Fix ScrubTest after file format change in CASSANDRA-4436 (Revision 
a075385d05c3e1d26475f448363958bad4645f17)

 Result = ABORTED
yukim : 
Files : 
* test/data/corrupt-sstables/Keyspace1-Standard3-ia-1-Statistics.db


 Counters in columns don't preserve correct values after cluster restart
 ---

 Key: CASSANDRA-4436
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4436
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.10
Reporter: Peter Velas
Assignee: Sylvain Lebresne
 Fix For: 1.1.3

 Attachments: 4436-1.0-2.txt, 4436-1.0-2.txt, 4436-1.0.txt, 
 4436-1.1-2.txt, 4436-1.1-2.txt, 4436-1.1.txt, increments.cql.gz


 Similar to #3821, but affecting normal columns.
 Set up a 2-node cluster with rf=2.
 1. Create a counter column family and increment 100 keys in a loop, 5000
 times each.
 2. Do a rolling restart of the cluster.
 3. Increment another 5000 times.
 4. Do a rolling restart of the cluster.
 5. Increment another 5000 times.
 6. Do a rolling restart of the cluster.
 After step 6 we were able to reproduce the bug with bad counter values.
 The expected value was 15000. The values returned from the cluster are higher:
 15000 plus some random number.
 Rolling restarts are done with nodetool drain: we always wait until the second
 node discovers that the first is down, then kill the java process.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423110#comment-13423110
 ] 

Jonathan Ellis commented on CASSANDRA-4436:
---

+1





[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422532#comment-13422532
 ] 

Jonathan Ellis commented on CASSANDRA-4436:
---

bq. But we won't have the same ancestor multiple times

I don't think that's true.  Suppose, for instance, that we have leveled
compaction with A and B in L0.  They are larger than 5MB, so we split the
compaction result into X, Y, and Z.  Next we flush C to L0.  It overlaps with
Y and Z, so we compact C, Y, and Z together.  Both Y and Z have A and B as
ancestors, so A and B each show up twice.

(Switching from LCS back to STCS is another way you could get duplicate 
ancestors.)
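
The scenario above can be sketched in a few lines. This is a toy model, not Cassandra's code: the `SSTable` class, the `compact` helper, the output name `W`, and the rule that an output inherits its inputs' names plus their ancestors are all assumptions made for illustration. It shows A and B turning up twice when ancestors are accumulated in a list, and once when they are collected into a set.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Toy stand-in for an sstable: a name plus the names it was built from."""
    name: str
    ancestors: list = field(default_factory=list)


def compact(inputs, output_names):
    """Merge `inputs` into new tables, accumulating ancestors in a list.

    Assumption: each output records every input's name and that input's
    own ancestors, in input order, with no de-duplication.
    """
    inherited = []
    for table in inputs:
        inherited.append(table.name)
        inherited.extend(table.ancestors)
    return [SSTable(name, list(inherited)) for name in output_names]


# A and B sit in L0; their compaction result is split into X, Y and Z.
a, b = SSTable("A"), SSTable("B")
x, y, z = compact([a, b], ["X", "Y", "Z"])

# C is flushed to L0 and overlaps Y and Z, so those three are compacted.
c = SSTable("C")
(merged,) = compact([c, y, z], ["W"])

as_list = merged.ancestors
as_set = set(merged.ancestors)
print(as_list)         # A and B each appear twice
print(sorted(as_set))  # each ancestor appears once
```

Switching the accumulator to a set removes the duplicates without changing which ancestors are recorded.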





[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420930#comment-13420930
 ] 

Jonathan Ellis commented on CASSANDRA-4436:
---

Looks like skipCompacted in Directories.SSTableLister can be removed (since we 
scrubDataDirectories on startup and no new compacted components will be 
created).

Using a List means we can add an ancestor multiple times.  Suggest using a Set 
instead.

Nits:
- would prefer Ancestor to LiveAncestor, since we only check liveness at 
creation time, so Live is misleading when iterating over them later.
- the deleting code feels more at home in CFS constructor than 
addInitialSSTables.
- tracker parameter is unused now in SSTR.open
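
To make the role of the recorded ancestors concrete, here is a toy model of a startup cleanup pass. It is a sketch under stated assumptions, not the patch: the function name `drop_compacted_leftovers`, the dict layout, and the rule "discard any sstable that another sstable on disk lists as an ancestor" are one reading of the discussion, and the integer payloads merely stand in for counter contributions.

```python
def drop_compacted_leftovers(on_disk):
    """Return the names of sstables to load, skipping compacted leftovers.

    `on_disk` maps an sstable name to the set of ancestor names recorded
    in it. Assumption: an sstable named as an ancestor of another sstable
    still on disk is a leftover whose data lives on in its descendant.
    """
    all_ancestors = set()
    for ancestors in on_disk.values():
        all_ancestors |= ancestors
    return {name for name in on_disk if name not in all_ancestors}


# A and B were compacted into X, but the node stopped before A and B
# were deleted, so all three are found at startup.
on_disk = {"A": set(), "B": set(), "X": {"A", "B"}}
contribution = {"A": 3000, "B": 2000, "X": 5000}  # hypothetical values

naive_total = sum(contribution[name] for name in on_disk)
live = drop_compacted_leftovers(on_disk)
clean_total = sum(contribution[name] for name in live)

print(naive_total)  # counts A and B twice: once directly, once inside X
print(clean_total)  # counts only X
```

Because the ancestors are gathered into a set here, an ancestor recorded more than once makes no difference to the result.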





[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-20 Thread Peter Velas (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419031#comment-13419031
 ] 

Peter Velas commented on CASSANDRA-4436:


Thanks for your interest and for taking the time to fix it. We are currently
moving to version 1.1.2 to avoid some random AWS failures, and are patiently
waiting for the 1.1.3 release.





[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-17 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416080#comment-13416080
 ] 

Sylvain Lebresne commented on CASSANDRA-4436:
-

The only difference I could see from the test I ran previously was the use of
compression. So, while I strongly doubt compression has anything to do with
this, I reran the test against 1.0 a number of times, but I was still not able
to reproduce any error.

Since you seem to be able to reproduce it easily, would you mind sharing the
scripts you use? I.e. mainly the code you use for insertion, preferably in
plain thrift or CQL2, as this would eliminate the possibility of a client
library bug.





[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-17 Thread Peter Velas (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416298#comment-13416298
 ] 

Peter Velas commented on CASSANDRA-4436:


You are right, it's not affected by compression.
I was curious whether the problem was in our Python code using pycassa,
so I created increments.cql containing 100k lines: 1000 increments for each
of 100 key values.
{code}
cassandra-cli -h $HOSTNAME -p 9160 -f increments.cql -B > /dev/null
{code}

After 3 rolling restarts each value was correct (3000).
After 4 rolling restarts the values are incorrect, see below:

{code}
col1    5479
col10   5507
col100  5531
col11   5480
col12   5501
col13   5499
col14   5516
{code}
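
A quick way to quantify the drift in a listing like the one above is to compare each value with the expected total. The sketch below assumes the script was run once before each restart, so that four runs of 1000 increments should give 4000 per column; that expected figure and the helper name `overshoot` are assumptions, not something the report states.

```python
def overshoot(listing, expected):
    """Return {column: observed - expected} for whitespace-separated rows."""
    result = {}
    for line in listing.strip().splitlines():
        column, value = line.split()
        result[column] = int(value) - expected
    return result


# Values copied from the listing above.
LISTING = """
col1    5479
col10   5507
col100  5531
col11   5480
col12   5501
col13   5499
col14   5516
"""

# Assumption: 4 runs x 1000 increments per column.
EXPECTED = 4 * 1000

excess = overshoot(LISTING, EXPECTED)
for column, extra in excess.items():
    print(f"{column}: +{extra}")
```

Under that assumption every column is too high, and by a different amount, which matches the "expected value plus some random number" symptom in the original report.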

It's a 2-node cluster with replication factor 2.




{code}
[root@cass-bug1 ~]# /opt/apache-cassandra-1.0.10/bin/cassandra-cli -h $HOSTNAME -p 9160 -f increments.cql -B > /dev/null
[root@cass-bug1 ~]# /opt/apache-cassandra-1.0.10/bin/nodetool -h $HOSTNAME drain

[root@cass-bug2 ~]# /opt/apache-cassandra-1.0.10/bin/nodetool -h $HOSTNAME ring
Address         DC          Rack   Status State   Load        Owns    Token
                                                                      85070591730234615865843651857942052864
10.20.30.160    datacenter1 rack1  Down   Normal  97.67 KB    50.00%  0
10.20.30.161    datacenter1 rack1  Up     Normal  113.45 KB   50.00%  85070591730234615865843651857942052864

[root@cass-bug1 ~]# killall java
[root@cass-bug1 ~]# /opt/apache-cassandra-1.0.10/bin/cassandra

[root@cass-bug2 ~]# /opt/apache-cassandra-1.0.10/bin/nodetool -h $HOSTNAME drain

[root@cass-bug1 ~]# /opt/apache-cassandra-1.0.10/bin/nodetool -h $HOSTNAME ring
Address         DC          Rack   Status State   Load        Owns    Token
                                                                      85070591730234615865843651857942052864
10.20.30.160    datacenter1 rack1  Up     Normal  97.67 KB    50.00%  0
10.20.30.161    datacenter1 rack1  Down   Normal  86.13 KB    50.00%  85070591730234615865843651857942052864

[root@cass-bug2 ~]# killall java
[root@cass-bug2 ~]# /opt/apache-cassandra-1.0.10/bin/cassandra


{code}
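
The transcript above follows the same four steps for each node. A minimal sketch of that sequence as data follows; it only builds the command strings and runs nothing. The function name `rolling_restart_plan` and the (host, command) tuple layout are hypothetical, and the host name is passed to `-h` explicitly where the transcript uses `$HOSTNAME`. The commands and the install path are the ones shown in the transcript.

```python
CASSANDRA_HOME = "/opt/apache-cassandra-1.0.10"  # path from the transcript


def rolling_restart_plan(hosts, home=CASSANDRA_HOME):
    """Return (host, command) steps that restart each node in turn.

    For every node: drain it, check the ring from a peer so the node can
    be seen as Down, kill the Java process, then start Cassandra again.
    """
    steps = []
    for index, host in enumerate(hosts):
        # With two or more hosts this picks a different node to check from.
        peer = hosts[(index + 1) % len(hosts)]
        steps.append((host, f"{home}/bin/nodetool -h {host} drain"))
        steps.append((peer, f"{home}/bin/nodetool -h {peer} ring"))
        steps.append((host, "killall java"))
        steps.append((host, f"{home}/bin/cassandra"))
    return steps


plan = rolling_restart_plan(["cass-bug1", "cass-bug2"])
for host, command in plan:
    print(f"[{host}] {command}")
```

Keeping the plan as plain data makes it easy to check that no node is killed before a peer has been asked for the ring state.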



Here is a dump of the keyspace and CF:


{code}
create keyspace inc_test
  with placement_strategy = 'SimpleStrategy'
  and strategy_options = {replication_factor : 2}
  and durable_writes = true;

use inc_test;

create column family cf1_increment
  with column_type = 'Standard'
  and comparator = 'BytesType'
  and default_validation_class = 'CounterColumnType'
  and key_validation_class = 'BytesType'
  and rows_cached = 0.0
  and row_cache_save_period = 0
  and row_cache_keys_to_save = 2147483647
  and keys_cached = 20.0
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'SerializingCacheProvider'
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy';
{code}


Hope that helps you reproduce it.





[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-14 Thread Peter Velas (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414315#comment-13414315
 ] 

Peter Velas commented on CASSANDRA-4436:




create keyspace test_old
  with placement_strategy = 'SimpleStrategy'
  and strategy_options = {replication_factor : 2}
  and durable_writes = true;

use test_old;

create column family cf1_increment
  with column_type = 'Standard'
  and comparator = 'BytesType'
  and default_validation_class = 'CounterColumnType'
  and key_validation_class = 'BytesType'
  and read_repair_chance = 1.0
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};



In version 1.0.10 I am always able to reproduce it with these steps, but it's
not reproducible in 1.1.2.

When I stop writing and shut down a node with nodetool drain, some small
commitlog files remain. I don't bother to delete them; I just restart the
cassandra process. Maybe this is the cause?





[jira] [Commented] (CASSANDRA-4436) Counters in columns don't preserve correct values after cluster restart

2012-07-13 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413881#comment-13413881
 ] 

Sylvain Lebresne commented on CASSANDRA-4436:
-

Can you reproduce every time with those steps? I tried reproducing with those 
exact steps (as far as I can tell) a few times on both 1.0 and 1.1 (the counter 
code didn't change much between 1.0 and 1.1) and wasn't able to reproduce.
