[ https://issues.apache.org/jira/browse/CASSANDRA-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650762#comment-13650762 ]
Sam Tunnicliffe commented on CASSANDRA-5540:
--------------------------------------------

I don't think this is caused by the index updates in KeysSearcher. There, we only compare the values, and since this test always writes the same values, the index entry is never deemed stale, so we never write a tombstone. The test script does reproduce the issue completely reliably though, so I'll dig in and find the actual cause.

> Concurrent secondary index updates remove rows from the index
> -------------------------------------------------------------
>
> Key: CASSANDRA-5540
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5540
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.2.4
> Reporter: Alexei Bakanov
>
> Existing rows disappear from the secondary index when doing simultaneous updates
> of a row with the same secondary index value.
> Here is a little pycassa script that reproduces the bug. The script inserts 4
> rows with the same secondary index value, reads those rows back and checks that
> there are 4 of them.
> Please run two instances of the script simultaneously in two separate
> terminals in order to simulate concurrent updates:
> {code}
> -----script.py START-----
> import pycassa
> from pycassa.index import *
> pool = pycassa.ConnectionPool('ks123')
> cf = pycassa.ColumnFamily(pool, 'cf1')
> while True:
>     for rowKey in xrange(4):
>         cf.insert(str(rowKey), {'indexedColumn': 'indexedValue'})
>     index_expression = create_index_expression('indexedColumn', 'indexedValue')
>     index_clause = create_index_clause([index_expression])
>     rows = cf.get_indexed_slices(index_clause)
>     length = len(list(rows))
>     if length == 4:
>         pass
>     else:
>         print 'found just %d rows out of 4' % length
> pool.dispose()
> ---script.py FINISH---
> ---schema cli start---
> create keyspace ks123
>   with placement_strategy = 'NetworkTopologyStrategy'
>   and strategy_options = {datacenter1 : 1}
>   and durable_writes = true;
> use ks123;
> create column family cf1
>   with column_type = 'Standard'
>   and comparator = 'AsciiType'
>   and default_validation_class = 'AsciiType'
>   and key_validation_class = 'AsciiType'
>   and read_repair_chance = 0.1
>   and dclocal_read_repair_chance = 0.0
>   and populate_io_cache_on_flush = false
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and column_metadata = [
>     {column_name : 'indexedColumn',
>     validation_class : AsciiType,
>     index_name : 'INDEX1',
>     index_type : 0}]
>   and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
> ---schema cli finish---
> {code}
> Test cluster created with 'ccm create --cassandra-version 1.2.4 --nodes 1
> --start testUpdate'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
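
For illustration only, the value comparison described in the comment above can be modelled with a small Python 3 sketch. This is a toy model under stated assumptions, not Cassandra's KeysSearcher code: `is_stale`, `read_through_index` and the dict-based index are hypothetical names invented here. It only shows why rewriting an identical value should never cause an entry to be judged stale by a pure value comparison, which is the reason given for looking elsewhere for the cause.

```python
# Toy model only: NOT Cassandra's KeysSearcher implementation.
# All function names and data structures here are hypothetical.

def is_stale(indexed_value, current_value):
    """An index entry counts as stale only when the value it was indexed
    under differs from the value currently stored in the row."""
    return indexed_value != current_value


def read_through_index(index, rows, value):
    """Return (live_keys, tombstoned_keys) for entries indexed under `value`.

    index: dict mapping indexed value -> set of row keys
    rows:  dict mapping row key -> current column value
    """
    live, tombstoned = [], []
    for key in sorted(index.get(value, set())):
        if is_stale(value, rows.get(key)):
            tombstoned.append(key)  # a tombstone would be written here
        else:
            live.append(key)
    for key in tombstoned:
        index[value].discard(key)
    return live, tombstoned


# Mirror the test script: four rows, always the same indexed value.
rows = {str(k): 'indexedValue' for k in range(4)}
index = {'indexedValue': set(rows)}

# Rewriting the same value repeatedly never changes what the check compares.
for _ in range(3):
    for k in rows:
        rows[k] = 'indexedValue'

live, tombstoned = read_through_index(index, rows, 'indexedValue')
print(live, tombstoned)
```

In this model no entry is tombstoned as long as every write stores the same value; an entry is dropped only once a row's stored value actually differs from the indexed one.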