[ 
https://issues.apache.org/jira/browse/CASSANDRA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202167#comment-13202167
 ] 

Sylvain Lebresne commented on CASSANDRA-3821:
---------------------------------------------

Damn you super columns, there is nothing super about you!

Yuki is right that super columns are a problem for AtomicSortedColumns. Columns 
are immutable, so for them ASC's repeated reconcile attempts are not a problem, 
since each attempt is done on a clone of the underlying map. But SuperColumns 
are not immutable. So even though ASC's (potentially) multiple attempts are 
made on a cloned map, the super column structures themselves are not cloned, 
and so all these attempts modify the original SC. This also means that SCs are 
not isolated (I did know about that, but somehow forgot about it and certainly 
didn't realize the consequence for counters).
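
To illustrate the mechanism described above, here is a minimal sketch. It is 
not Cassandra's actual AtomicSortedColumns code: the class, method and field 
names are all hypothetical, and a failed compare-and-set is forced 
deterministically instead of coming from a real concurrent writer. It only 
shows why a retry is harmless when the map values are immutable and why it 
applies the delta twice when the values are shared mutable objects.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

public class SharedMutableCasSketch {
    /** Hypothetical stand-in for a mutable super column holding a counter total. */
    static final class MutableCell {
        long total;
    }

    /** Immutable values: each attempt only changes its own private clone of the map. */
    static long addImmutable(AtomicReference<TreeMap<String, Long>> ref,
                             String name, long delta, boolean forceOneRetry) {
        boolean interfere = forceOneRetry;
        while (true) {
            TreeMap<String, Long> current = ref.get();
            TreeMap<String, Long> modified = new TreeMap<>(current); // shallow clone
            modified.merge(name, delta, Long::sum);                  // new Long, clone only
            if (interfere) {
                // Simulate another writer publishing a new map first, so the CAS below fails once.
                ref.set(new TreeMap<>(current));
                interfere = false;
            }
            if (ref.compareAndSet(current, modified)) {
                return modified.get(name);
            }
        }
    }

    /** Mutable values: the clone shares the cells, so every attempt mutates the original. */
    static long addMutable(AtomicReference<TreeMap<String, MutableCell>> ref,
                           String name, long delta, boolean forceOneRetry) {
        boolean interfere = forceOneRetry;
        while (true) {
            TreeMap<String, MutableCell> current = ref.get();
            TreeMap<String, MutableCell> modified = new TreeMap<>(current); // cells NOT cloned
            modified.get(name).total += delta;                              // visible through every map
            if (interfere) {
                ref.set(new TreeMap<>(current)); // forced CAS failure, as above
                interfere = false;
            }
            if (ref.compareAndSet(current, modified)) {
                return modified.get(name).total;
            }
        }
    }

    /** Adds 1 to an immutable value starting at 0, with one forced retry. */
    static long immutableResult() {
        TreeMap<String, Long> map = new TreeMap<>();
        map.put("c", 0L);
        return addImmutable(new AtomicReference<>(map), "c", 1L, true);
    }

    /** Adds 1 to a mutable cell starting at 0, with one forced retry. */
    static long mutableResult() {
        TreeMap<String, MutableCell> map = new TreeMap<>();
        map.put("c", new MutableCell());
        return addMutable(new AtomicReference<>(map), "c", 1L, true);
    }

    public static void main(String[] args) {
        System.out.println("immutable value after +1 with one retry: " + immutableResult());
        System.out.println("mutable value after +1 with one retry:   " + mutableResult());
    }
}
```

With one forced retry, the immutable variant ends at 1 while the mutable 
variant ends at 2, because the failed attempt already modified the shared 
cell. That matches the symptom of counters coming back higher than expected.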

Anyway, I don't see an easy one-line fix, so I would suggest instead adding 
back the ThreadSafeSortedColumns backing map and using it for super column 
families, since super column families are not really isolated anyway. Then I 
guess I'll make CASSANDRA-3237 a higher priority on my todo list.

bq. This problem doesn't happen on C* 1.0.7, unless you don't sleep between 
doing the increments and killing the cluster. Then it sometimes happens to a 
lesser degree.

The explanation above doesn't account for that. It would be worth investigating 
separately (maybe in a separate ticket).
                
> Counters in super columns don't preserve correct values after cluster restart
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3821
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3821
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>         Environment: ubuntu, 'trunk' branch, used ccm to create a 3 node 
> cluster with rf=3. A dtest was created to demonstrate.
>            Reporter: Tyler Patterson
>
> Set up a 3-node cluster with rf=3. Create a counter super column family and 
> increment a bunch of subcolumns 100 times each, with cl=QUORUM. Then wait a 
> few seconds, restart the cluster, and read the values back. They almost all 
> come back different (and higher) than they are supposed to be.
> Here are some extra things I've noticed:
>  - Reading back the values before the restart always produces correct results.
>  - Doing a nodetool flush before killing the cluster greatly improves the 
> results, though sometimes a value will still be incorrect. You might have to 
> run the test several times to see an incorrect value after a flush.
>  - This problem doesn't happen on C* 1.0.7, unless you don't sleep between 
> doing the increments and killing the cluster. Then it sometimes happens to a 
> lesser degree.
> The dtest that demonstrates this issue is called "super_counter_test.py". Run 
> it like this: nosetests --nocapture super_counter_test.py  You'll need ccm 
> from g...@github.com:tpatterson/ccm.git.


        
