[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301096#comment-14301096 ]
Sylvain Lebresne commented on CASSANDRA-8712:
---------------------------------------------

I'm not familiar with django-cassandra-engine and I'm not sure other Cassandra devs are, so it would be much simpler to limit the layers used to reproduce (to limit the possibility that the problem actually comes from one of those layers).

> Out-of-sync secondary index
> ---------------------------
>
>                 Key: CASSANDRA-8712
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8712
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: 2.1.2
>            Reporter: mlowicki
>             Fix For: 2.1.3
>
>
> I have the following table with an index:
> {code}
> CREATE TABLE entity (
>     user_id text,
>     data_type_id int,
>     version bigint,
>     id text,
>     cache_guid text,
>     client_defined_unique_tag text,
>     ctime timestamp,
>     deleted boolean,
>     folder boolean,
>     mtime timestamp,
>     name text,
>     originator_client_item_id text,
>     parent_id text,
>     position blob,
>     server_defined_unique_tag text,
>     specifics blob,
>     PRIMARY KEY (user_id, data_type_id, version, id)
> ) WITH CLUSTERING ORDER BY (data_type_id ASC, version ASC, id ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = ''
>     AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99.0PERCENTILE';
> CREATE INDEX index_entity_parent_id ON entity (parent_id);
> {code}
> It turned out that the index became out of sync:
> {code}
> >>> Entity.objects.filter(user_id='255824802', parent_id=parent_id).consistency(6).count()
> 16
>
> >>> counter = 0
> >>> for e in Entity.objects.filter(user_id='255824802'):
> ...     if e.parent_id and e.parent_id == parent_id:
> ...         counter += 1
> ...
> >>> counter
> 10
> {code}
> After a couple of hours it was fine (at night), but then, probably when the
> user started to interact with the DB, we got the same problem. As a temporary
> solution we'll try to rebuild indexes from time to time, as suggested in
> http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/
> We launched a simple script to check for this anomaly: before rebuilding the
> index, 10378 of 4024856 folders had this issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
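
One possible way to reproduce the check above with fewer layers is to run the comparison on rows fetched with plain CQL rather than through the ORM. The following is only a sketch under assumptions: the `Row` tuple and the `compare_index_to_scan` helper are hypothetical names, not part of the report, and the rows are assumed to come from two queries against the `entity` table, one restricted on both `user_id` and `parent_id` (served by the index) and one restricted on `user_id` only (a plain partition read). Fetching the rows is left out so the comparison itself stays independent of any driver.

```python
from collections import namedtuple

# Hypothetical row shape: the clustering columns plus parent_id are enough
# to identify a row inside a single user_id partition.
Row = namedtuple("Row", ["data_type_id", "version", "id", "parent_id"])


def compare_index_to_scan(indexed_rows, partition_rows, parent_id):
    """Compare an index lookup with a full partition read for one user_id.

    indexed_rows:   rows returned when filtering on user_id AND parent_id
    partition_rows: rows returned when filtering on user_id only
    Returns (stale, missing), two sets of (data_type_id, version, id) keys:
      stale   - keys the index returned that the partition read does not confirm
      missing - keys the partition read found that the index did not return
    """
    def key(row):
        return (row.data_type_id, row.version, row.id)

    from_index = {key(r) for r in indexed_rows}
    from_scan = {key(r) for r in partition_rows if r.parent_id == parent_id}
    return from_index - from_scan, from_scan - from_index


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    scan = [Row(1, 10, "a", "p1"), Row(1, 11, "b", "p2")]
    index = [Row(1, 10, "a", "p1"), Row(1, 9, "z", "p1")]
    stale, missing = compare_index_to_scan(index, scan, "p1")
    print("stale index entries:", sorted(stale))
    print("rows missing from index:", sorted(missing))
```

Reporting the two sets separately, rather than only the two counts, would show whether the index returns extra entries, misses rows, or both.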