[ https://issues.apache.org/jira/browse/CASSANDRA-12962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943400#comment-15943400 ]
Alex Petrov commented on CASSANDRA-12962:
-----------------------------------------

Composed a simple patch that puts an empty file in place of an index that doesn't hold any values, and ensures that such an index neither participates in the query process nor is rebuilt on every restart.

|[patch|https://github.com/ifesdjeen/cassandra/tree/12962-trunk]|[utest|https://cassci.datastax.com/job/ifesdjeen-12962-trunk-testall/]|[dtest|https://cassci.datastax.com/job/ifesdjeen-12962-trunk-dtest/]|

> SASI: Index are rebuilt on restart
> ----------------------------------
>
> Key: CASSANDRA-12962
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12962
> Project: Cassandra
> Issue Type: Bug
> Components: sasi
> Reporter: Corentin Chary
> Assignee: Alex Petrov
> Priority: Minor
> Fix For: 3.11.x
>
> Attachments: screenshot-1.png
>
>
> Apparently, when Cassandra restarts, any index that does not index a value in
> *every* live SSTable gets rebuilt. The offending code can be found in the
> constructor of SASIIndex.
> You can easily reproduce it:
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
> CREATE TABLE test.test (
>     a text PRIMARY KEY,
>     b text,
>     c text
> ) WITH bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
> CREATE CUSTOM INDEX test_b_idx ON test.test (b) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> CREATE CUSTOM INDEX test_c_idx ON test.test (c) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> INSERT INTO test.test (a, b) VALUES ('a', 'b');
> {code}
> Log (I added additional traces):
> {code}
> INFO [main] 2016-11-28 15:32:21,191 ColumnFamilyStore.java:406 -
> Initializing test.test
> DEBUG [SSTableBatchOpen:1] 2016-11-28 15:32:21,192 SSTableReader.java:505 -
> Opening
> /mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big
> (0.034KiB)
> DEBUG [main] 2016-11-28 15:32:21,194 SASIIndex.java:118 - index:
> org.apache.cassandra.schema.IndexMetadata@2f661b1a[id=6b00489b-7010-396e-9348-9f32f5167f88,name=test_b_idx,kind=CUSTOM,options={class_name=org.a\
> pache.cassandra.index.sasi.SASIIndex, target=b}], base CFS(Keyspace='test',
> ColumnFamily='test'), tracker
> org.apache.cassandra.db.lifecycle.Tracker@15900b83
> INFO [main] 2016-11-28 15:32:21,194 DataTracker.java:152 -
> SSTableIndex.open(column: b, minTerm: value, maxTerm: value,
> minKey: key,
> maxKey: key, sstable: BigTableReader(path='/mnt/ssd/tmp/data/data/test/test\
> -229e6380b57711e68407158fde22e121/mc-1-big-Data.db'))
> DEBUG [main] 2016-11-28 15:32:21,195 SASIIndex.java:129 - Rebuilding SASI
> Indexes: {}
> DEBUG [main] 2016-11-28 15:32:21,195 ColumnFamilyStore.java:895 - Enqueuing
> flush of IndexInfo: 0.386KiB (0%) on-heap, 0.000KiB (0%) off-heap
> DEBUG [PerDiskMemtableFlushWriter_0:1] 2016-11-28 15:32:21,204
> Memtable.java:465 - Writing Memtable-IndexInfo@748981977(0.054KiB serialized
> bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223\
> 372036854775808), max(9223372036854775807)]
> DEBUG [PerDiskMemtableFlushWriter_0:1] 2016-11-28 15:32:21,204
> Memtable.java:494 - Completed flushing
> /mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4256-big-Data.db
> (0.035KiB) for\
> commitlog position CommitLogPosition(segmentId=1480343535479, position=15652)
> DEBUG [MemtableFlushWriter:1] 2016-11-28 15:32:21,224
> ColumnFamilyStore.java:1200 - Flushed to
> [BigTableReader(path='/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4256-big-Data.db\
> ')] (1 sstables, 4.838KiB), biggest 4.838KiB, smallest 4.838KiB
> DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:118 - index:
> org.apache.cassandra.schema.IndexMetadata@12f3d291[id=45fcb286-b87a-3d18-a04b-b899a9880c91,name=test_c_idx,kind=CUSTOM,options={class_name=org.a\
> pache.cassandra.index.sasi.SASIIndex, target=c}], base CFS(Keyspace='test',
> ColumnFamily='test'), tracker
> org.apache.cassandra.db.lifecycle.Tracker@15900b83
> DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:121 - to rebuild: index:
> BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db'),
> sstable: org.apache.cassa\
> ndra.index.sasi.conf.ColumnIndex@6cbb6b0e
> DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:129 - Rebuilding SASI
> Indexes:
> {BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db')={c=org.apache.cassa\
> ndra.index.sasi.conf.ColumnIndex@6cbb6b0e}}
> DEBUG [main] 2016-11-28 15:32:21,225 ColumnFamilyStore.java:895 - Enqueuing
> flush of IndexInfo: 0.386KiB (0%) on-heap, 0.000KiB (0%) off-heap
> DEBUG [PerDiskMemtableFlushWriter_0:2] 2016-11-28 15:32:21,235
> Memtable.java:465 - Writing Memtable-IndexInfo@951411443(0.054KiB serialized
> bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223\
> 372036854775808), max(9223372036854775807)]
> DEBUG [PerDiskMemtableFlushWriter_0:2] 2016-11-28 15:32:21,235
> Memtable.java:494 - Completed flushing
> /mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4257-big-Data.db
> (0.035KiB) for\
> commitlog position CommitLogPosition(segmentId=1480343535479, position=15720)
> DEBUG [MemtableFlushWriter:2] 2016-11-28 15:32:21,254
> ColumnFamilyStore.java:1200 - Flushed to
> [BigTableReader(path='/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4257-big-Data.db\
> ')] (1 sstables, 4.836KiB), biggest 4.836KiB, smallest 4.836KiB
> {code}
> I think a better behavior would be to ask users to explicitly rebuild indexes
> if they remove the files; that's fine as long as we correctly handle the case
> of new indexes.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)