[ https://issues.apache.org/jira/browse/CASSANDRA-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Schuller updated CASSANDRA-3070:
--------------------------------------

    Comment: was deleted

(was: This may be relevant, quoting myself from IRC:

{code}
21:20:01 < scode> pcmanus: Hey, are you there?
21:20:21 < scode> pcmanus: I am investigating something which might be https://issues.apache.org/jira/browse/CASSANDRA-3070
21:20:37 < scode> pcmanus: And I could use the help of someone with his brain all over counters, and Stu isn't here atm. :)
21:21:16 < scode> pcmanus: https://gist.github.com/8202cb46c8bd00c8391b
21:21:37 < scode> pcmanus: I am investigating why with CL.ALL and CL.QUORUM, I get seemingly random/varying results when I read a counter.
21:21:53 < scode> pcmanus: I have the offending sstables on a three-node test setup and am inserting debug printouts in the code to trace the reconciliation.
21:21:57 < scode> pcmanus: The gist above shows what's happening.
21:22:11 < scode> pcmanus: The latter is the wrong one, and the former is the correct one.
21:22:28 < scode> pcmanus: The interesting bit is that I see shards with the same node_id *AND* clock, but *DIFFERENT* counts.
21:22:53 < scode> pcmanus: My understanding of counters is that there should never (globally across an entire cluster in all sstables) exist two shards for the same node_id+clock but with different counts.
21:22:57 < scode> pcmanus: Is my understanding correct there?
21:25:10 < scode> pcmanus: There is one node out of the three that has the "offending" shard (with a count of 2 instead of 1). Like with 3070, we observed this after having expanded a cluster (though I'm not sure how that would cause it, and we don't know if there existed a problem before the expansion).
{code}
)

> counter repair
> --------------
>
>                 Key: CASSANDRA-3070
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3070
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.4
>            Reporter: ivan
>            Assignee: Sylvain Lebresne
>         Attachments: counter_local_quroum_maybeschedulerepairs.txt, counter_local_quroum_maybeschedulerepairs_2.txt, counter_local_quroum_maybeschedulerepairs_3.txt
>
>
> Hi!
> We have some counters out of sync, but repair doesn't sync the values.
> We tried nodetool repair.
> We use LOCAL_QUORUM for reads. A repair row mutation is sent to the other nodes
> while reading a bad row, but the counters weren't repaired by the mutation.
> The output of two nodes was uploaded. (Some new debug messages were added.)
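
The invariant raised in the IRC excerpt (no two shards anywhere in the cluster should share a node_id and clock yet carry different counts) can be checked mechanically once the shards have been dumped from each replica. The following is only a rough sketch under stated assumptions: the Shard tuple, the find_conflicting_shards helper, and the sample values are all hypothetical, and none of this is Cassandra's own reconciliation code or its on-disk counter format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Shard(NamedTuple):
    """Hypothetical, simplified view of one counter shard."""
    node_id: str
    clock: int
    count: int


def find_conflicting_shards(
    shards: Iterable[Shard],
) -> Dict[Tuple[str, int], List[int]]:
    """Return every (node_id, clock) pair that was seen with more than one count."""
    counts_by_key = defaultdict(set)
    for shard in shards:
        counts_by_key[(shard.node_id, shard.clock)].add(shard.count)
    # Keep only the keys that violate the invariant.
    return {
        key: sorted(counts)
        for key, counts in counts_by_key.items()
        if len(counts) > 1
    }


# Made-up shards as they might be collected from three replicas:
# two replicas agree on count 1, one replica holds count 2.
sample = [
    Shard("node-a", 5, 1),
    Shard("node-a", 5, 1),
    Shard("node-a", 5, 2),
    Shard("node-b", 3, 7),
]
conflicts = find_conflicting_shards(sample)
```

With the made-up sample, conflicts holds a single entry for ("node-a", 5) listing counts 1 and 2, which mirrors the "count of 2 instead of 1" situation described in the log. An empty result would mean the dumped shards are at least consistent with the invariant.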