Re: counters still inconsistent after repair
Thanks Rob, this was helpful. More counters will be added soon, I'll let you know if those have any problems.
--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
counters still inconsistent after repair
Currently on 2.1.6 I'm seeing behavior like the following:

    cqlsh:walker> select * from counter_table where field = 'test';

     field | value
    -------+-------
      test |    30

    (1 rows)

    cqlsh:walker> select * from counter_table where field = 'test';

     field | value
    -------+-------
      test |    90

    (1 rows)

    cqlsh:walker> select * from counter_table where field = 'test';

     field | value
    -------+-------
      test |    30

    (1 rows)

Using tracing I can see that one node has wrong data. However, running repair on this table does not seem to have done anything; I still see the wrong value returned from this same node.

Potentially relevant facts:
- Recently upgraded to 2.1.6 from 2.0.14
- This table has ~million rows, low contention, and fairly high increment rate

Mainly wondering:
- Is this known or expected? I know Cassandra counters have had issues but thought by now it should be able to keep a consistent counter or at least repair it...
- Any way to reset this counter?
- Any other stuff I can check?
Re: counters still inconsistent after repair
On Mon, Jun 15, 2015 at 2:52 PM, Dan Kinder dkin...@turnitin.com wrote:

> Potentially relevant facts:
> - Recently upgraded to 2.1.6 from 2.0.14
> - This table has ~million rows, low contention, and fairly high increment rate

Can you repro on a counter that was created after the upgrade?

> Mainly wondering:
> - Is this known or expected? I know Cassandra counters have had issues but thought by now it should be able to keep a consistent counter or at least repair it...

All counters which haven't been written to after the upgrade to 2.1 "new counters" are still on disk as "old counters" and will remain that way until UPDATEd and then compacted together with all old shards. Old counters can exhibit this behavior.

> - Any way to reset this counter?

Per Aleksey (in IRC), you can turn a replica for an old counter into a new counter by UPDATEing it once. In order to do that without modifying the count, you can [1]:

    UPDATE tablename SET countercolumn = countercolumn + 0 WHERE id = 1;

The important caveat is that this must be done at least once per shard, with one shard per RF. The only way one can be sure that all shards have been UPDATEd is by contacting each replica node and doing the UPDATE + 0 there, because local writes are preferred.

To summarize, the optimal process to upgrade your pre-existing counters to 2.1-era new counters:

1) get a list of all counter keys
2) get a list of replicas per counter key
3) connect to each replica for each counter key and issue an UPDATE + 0 for that counter key
4) run a major compaction

As an aside, Aleksey suggests that the above process is so heavyweight that it may not be worth it. If you just leave them be, all counters you're actually using will become progressively more accurate over time.

=Rob

[1] Special thanks to Jeff Jirsa for verifying that this syntax works.
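
For anyone who wants to script steps 1 to 3 of the process above, here is a minimal Python sketch that only builds the per-replica list of no-op UPDATE statements; it does not connect to Cassandra. Everything in it beyond the UPDATE + 0 syntax is an assumption for illustration: the function name `build_upgrade_plan` is hypothetical, the node addresses are made up, keys are assumed to be text, and the key-to-replica mapping is taken as given (it could be collected with something like `nodetool getendpoints`, one call per key). The table and column names are borrowed from the original post.

```python
from collections import defaultdict


def build_upgrade_plan(replicas_by_key, keyspace, table, key_column, counter_column):
    """Group no-op counter UPDATEs by the replica node that should receive them.

    replicas_by_key maps each counter key (assumed to be text) to the list of
    replica addresses that own it. How that mapping is gathered is left open.
    """
    statements_by_node = defaultdict(list)
    for key, replicas in sorted(replicas_by_key.items()):
        # CQL escapes a single quote inside a string literal by doubling it.
        escaped = key.replace("'", "''")
        statement = (
            f"UPDATE {keyspace}.{table} "
            f"SET {counter_column} = {counter_column} + 0 "
            f"WHERE {key_column} = '{escaped}';"
        )
        # Every replica of the key gets the same statement, so that each
        # shard is touched at least once.
        for node in replicas:
            statements_by_node[node].append(statement)
    return dict(statements_by_node)


if __name__ == "__main__":
    # Hypothetical replica addresses for the single key from the original post.
    plan = build_upgrade_plan(
        {"test": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]},
        keyspace="walker",
        table="counter_table",
        key_column="field",
        counter_column="value",
    )
    for node, statements in plan.items():
        print(f"-- run with {node} as coordinator")
        for statement in statements:
            print(statement)
```

Each group of statements would then be run while connected directly to the named node, and step 4 (the major compaction) remains a separate manual step afterwards. Given Aleksey's remark about how heavyweight this is, a script like this probably only makes sense for a small set of counters that really matter.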