[jira] [Commented] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306835#comment-14306835 ]

mlowicki commented on CASSANDRA-8712:

[~slebresne] I don't have repro steps yet. What I've found on our production cluster, though, is that the index always (17340/17340 cases) returns a superset of what we get by reading the table directly, without the index. After reading www.datastax.com/dev/blog/improving-secondary-index-write-performance-in-1-2 I suspect there is a problem with removing stale items from the index. What do you think? Should {{rebuild_index}} help with such an issue, or does it only re-add missing items without removing old ones?

Out-of-sync secondary index
---
Key: CASSANDRA-8712
URL: https://issues.apache.org/jira/browse/CASSANDRA-8712
Project: Cassandra
Issue Type: Bug
Environment: 2.1.2
Reporter: mlowicki
Fix For: 2.1.3

I have the following table with an index:

{code}
CREATE TABLE entity (
    user_id text,
    data_type_id int,
    version bigint,
    id text,
    cache_guid text,
    client_defined_unique_tag text,
    ctime timestamp,
    deleted boolean,
    folder boolean,
    mtime timestamp,
    name text,
    originator_client_item_id text,
    parent_id text,
    position blob,
    server_defined_unique_tag text,
    specifics blob,
    PRIMARY KEY (user_id, data_type_id, version, id)
) WITH CLUSTERING ORDER BY (data_type_id ASC, version ASC, id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{keys:ALL, rows_per_partition:NONE}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

CREATE INDEX index_entity_parent_id ON entity (parent_id);
{code}

It turned out that the index became out of sync:

{code}
>>> Entity.objects.filter(user_id='255824802', parent_id=parent_id).consistency(6).count()
16
>>> counter = 0
>>> for e in Entity.objects.filter(user_id='255824802'):
...     if e.parent_id and e.parent_id == parent_id:
...         counter += 1
...
>>> counter
10
{code}

After a couple of hours (at night) it was fine, but then, probably once the user started to interact with the DB again, we got the same problem. As a temporary solution we'll try to rebuild the indexes from time to time, as suggested in http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/

I launched a simple script to check for this anomaly: before rebuilding the index, 10378 of 4024856 folders had this issue.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
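The "index returns a superset" observation above can be made precise by comparing the ids returned through the index with the ids found by a direct scan. The following is only a minimal sketch of that comparison in plain Python; the function name and the sample ids are hypothetical and are not taken from the reporter's script.

```python
def classify_index_mismatch(index_ids, table_ids):
    """Compare ids returned via the index with ids found by a direct table scan."""
    index_ids, table_ids = set(index_ids), set(table_ids)
    return {
        # Present in the index result only: candidates for stale index entries.
        "stale": sorted(index_ids - table_ids),
        # Present in the table only: rows the index failed to return.
        "missing": sorted(table_ids - index_ids),
        # True when every row found by the scan is also returned by the index.
        "index_is_superset": table_ids <= index_ids,
    }


# Made-up ids, used only to show the shape of the result.
result = classify_index_mismatch(["a", "b", "c"], ["a", "b"])
print(result)
```

A non-empty "stale" list together with an empty "missing" list would match the pattern described in the comment; a non-empty "missing" list would point at a different kind of problem.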
[jira] [Commented] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301096#comment-14301096 ]

Sylvain Lebresne commented on CASSANDRA-8712:

I'm not familiar with django-cassandra-engine and I'm not sure other Cassandra devs are, so it would be much simpler to limit the layers used to reproduce this (to limit the possibility that the problem actually comes from one of those layers).
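One way to take the ORM layers out of a reproduction is to run plain CQL in cqlsh. The sketch below only generates the statements as text, so it does not depend on any driver. It assumes the {{entity}} table from the issue description; the helper names and the sample values are hypothetical placeholders.

```python
def cql_literal(value):
    """Render a Python value as a CQL literal (text, int and boolean only)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def insert_entity(**columns):
    """Build an INSERT for the entity table from column=value pairs."""
    names = ", ".join(columns)
    values = ", ".join(cql_literal(v) for v in columns.values())
    return "INSERT INTO entity ({}) VALUES ({});".format(names, values)


def repro_statements(user_id, parent_id, children):
    """Statements that create one parent, add children, then read both ways."""
    stmts = [insert_entity(user_id=user_id, data_type_id=0, version=0, id=parent_id)]
    for i in range(children):
        stmts.append(insert_entity(user_id=user_id, data_type_id=0, version=0,
                                   id="a" + str(i), parent_id=parent_id, folder=False))
    # Read through the secondary index on parent_id.
    stmts.append("SELECT id FROM entity WHERE user_id = {} AND parent_id = {};".format(
        cql_literal(user_id), cql_literal(parent_id)))
    # Read the whole partition; parent_id is then compared on the client side.
    stmts.append("SELECT id, parent_id FROM entity WHERE user_id = {};".format(
        cql_literal(user_id)))
    return stmts


statements = repro_statements("foo", "-1", 2)
print("\n".join(statements))
```

The two SELECT statements at the end correspond to the two reads compared in the issue description: one served by the index, one without it.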
[jira] [Commented] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301074#comment-14301074 ]

mlowicki commented on CASSANDRA-8712:

1. Drop the keyspace:

{code}
cqlsh> use sync;
cqlsh:sync> drop keyspace sync;
cqlsh:sync>
{code}

2. Create the keyspace from scratch (I'm using sync_cassandra from django-cassandra-engine):

{code}
./bin/django sync_cassandra
Creating keyspace sync..
Syncing sync.api.models.Entity
Syncing sync.api.models.UserStore
{code}

3. Populate the database using Django's shell:

{code}
from sync.api.models import Entity, UserStore
user = UserStore.objects.create(user_id='foo')
root = Entity.objects.create(user_id='foo', data_type_id=0, version=0, id='-1')
{code}

4. Run the {{check_parent_index_consistency}} script:

{code}
./bin/django check_parent_index_consistency
{ folder: 1, user: 1 }
{code}

5. Add entities to the root folder:

{code}
for i in range(1):
    Entity.objects.create(user_id='foo', data_type_id=0, version=0, id='a' + str(i), parent_id='-1', folder=False)
{code}

6. While the inserts are running, run the {{check_parent_index_consistency}} script again:

{code}
./bin/django check_parent_index_consistency
{ folder: 1, inconsistent_folder: 1, user: 1 }
{code}

While the insert was running, 8918 entities were returned directly from {{entity}} but only 372 from the index. It seems to be related to the number of entities I'm adding. If less than 1 I couldn't reproduce the issue.

When I ran the {{check_parent_index_consistency}} script again after a couple of minutes it was completely fine, with no inconsistencies. I'm not sure this is the same issue, since the number of inconsistencies drops to zero after some time, but maybe it'll help.

{{check_parent_index_consistency}} is available at https://cpaste.org/p7zht9rli
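The {{check_parent_index_consistency}} script itself is not included in this thread, so the following is only a guess at the kind of summary it produces, written against in-memory data. The row layout, the rule that a folder is a row with a truthy {{folder}} value, and the {{index_lookup}} callable standing in for an indexed query are all assumptions made for illustration.

```python
def consistency_summary(rows, index_lookup):
    """Count users, folders, and folders whose indexed children differ from a scan.

    rows: iterable of dicts with user_id, id, parent_id and folder keys.
    index_lookup(user_id, parent_id): ids that a query using the index returns.
    """
    users = set()
    folders = set()
    children = {}  # (user_id, parent_id) -> child ids found by scanning all rows
    for row in rows:
        users.add(row["user_id"])
        if row.get("folder"):
            folders.add((row["user_id"], row["id"]))
        if row.get("parent_id"):
            key = (row["user_id"], row["parent_id"])
            children.setdefault(key, set()).add(row["id"])
    inconsistent = sum(
        1 for key in folders
        if set(index_lookup(*key)) != children.get(key, set())
    )
    return {"user": len(users), "folder": len(folders), "inconsistent_folder": inconsistent}


# Made-up data: one root folder with three children.
rows = [{"user_id": "foo", "id": "-1", "parent_id": None, "folder": True}]
rows += [{"user_id": "foo", "id": "a" + str(i), "parent_id": "-1", "folder": False}
         for i in range(3)]

# A stand-in index that lags behind the table and returns a single child.
summary = consistency_summary(rows, lambda user_id, parent_id: ["a0"])
print(summary)
```

With a stand-in index that returns all three children, the same call would report zero inconsistent folders, which is the state the comment describes after a couple of minutes.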
[jira] [Commented] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301161#comment-14301161 ]

mlowicki commented on CASSANDRA-8712:

I'll try to provide something soon. We've checked, and {{rebuild_index}} doesn't help at all.
[jira] [Commented] (CASSANDRA-8712) Out-of-sync secondary index
[ https://issues.apache.org/jira/browse/CASSANDRA-8712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300987#comment-14300987 ]

Sylvain Lebresne commented on CASSANDRA-8712:

I think we'd need some kind of reproduction steps/script to make any kind of progress on this.