[ https://issues.apache.org/jira/browse/CASSANDRA-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sylvain Lebresne updated CASSANDRA-3178:
----------------------------------------

    Attachment: 0002-Simplify-improve-shard-merging-code-v2.patch
                0001-Move-shard-merging-completely-to-compaction-v2.patch

Attaching v2, rebased, which removes the use of flush_after_mins.

> Counter shard merging is not thread safe
> ----------------------------------------
>
>                 Key: CASSANDRA-3178
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3178
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.5
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>              Labels: counters
>             Fix For: 0.8.6
>
>         Attachments: 0001-Move-shard-merging-completely-to-compaction-v2.patch,
>                      0001-Move-shard-merging-completely-to-compaction.patch,
>                      0002-Simplify-improve-shard-merging-code-v2.patch,
>                      0002-Simplify-improve-shard-merging-code.patch
>
>
> The first part of the counter shard merging process is done during counter
> replication. It was put there because it requires that all replicas be made
> aware of the merging (we could rely only on nodetool repair for that, but
> that seems much too fragile; it's better as just a safety net). However, this
> part isn't thread safe, as multiple threads can do the merging for the same
> shard at the same time (which shouldn't really "corrupt" the counter value
> per se, but can result in an incorrect context).
> Synchronizing that part of the code would be very costly in terms of
> performance, so instead I propose to move the part of the shard merging done
> during replication to compaction. It's a better place anyway. The only
> downside is that compaction will sometimes send mutations to other nodes as
> a side effect, which doesn't feel very clean but is probably not a big deal
> either.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira