[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anuj updated CASSANDRA-9146:
Attachments: sstables.txt, system-modified.log
Please find the logs attached:
1. system-modified.log = system logs
2. sstables.txt = listing of the sstables in the ks1cf1 column family in
the test_ks1 keyspace
Repair -pr was run on the node on 3 occasions, each time creating numerous
sstables every second (the presumed invocation is sketched after the
timestamps below):
2015-04-09 09:14:36 to 2015-04-09 12:07:28
2015-04-09 14:34 (stopped at 15:07)
2015-04-09 15:11
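For reference, a minimal sketch of how such a repair is typically invoked
and monitored; the keyspace/CF names come from this report, everything
else is an assumption:

    # Primary-range repair of the affected column family (names from this report)
    nodetool repair -pr test_ks1 ks1cf1

    # Watch the live sstable count for the CF and its indexes while repair runs
    nodetool cfstats | grep -A 20 'ks1cf1'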
While only 42 sstables exist for ks1cf1Idx3, as it was compacting
regularly, the other two indexes, ks1cf1Idx1 and ks1cf1Idx2, have 8932
sstables.
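A quick way to reproduce the counts in sstables.txt (a sketch only: the
data directory path is an assumption, and it assumes the 2.0-era layout
where index sstables sit in the base CF's directory with the index name
embedded in the filename):

    # Count Data.db files per index (path is an assumption; adjust per install)
    cd /var/lib/cassandra/data/test_ks1/ks1cf1
    for idx in ks1cf1Idx1 ks1cf1Idx2 ks1cf1Idx3; do
        printf '%s: ' "$idx"; ls | grep "$idx" | grep -c 'Data.db'
    done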
Ever Growing Secondary Index sstables after every Repair
Key: CASSANDRA-9146
URL: https://issues.apache.org/jira/browse/CASSANDRA-9146
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Anuj
Attachments: sstables.txt, system-modified.log
The cluster has reached a state where every repair -pr operation on the CF
results in numerous tiny sstables being flushed to disk. Most of the
sstables belong to secondary indexes. Due to the thousands of sstables,
reads have started timing out. Even though compaction begins for one of
the secondary indexes, the sstable count after repair remains very high
(thousands), and every repair adds thousands more.
Problems:
1. Why are bursts of tiny secondary index sstables flushed during repair?
What triggers the frequent/premature flushing of secondary index sstables
(more than a hundred in every burst)? At most we see one ParNew GC pause
of 200 ms.
2. Why is auto-compaction not compacting all the sstables? Is this related
to the coldness issue (CASSANDRA-8885), where compaction doesn't work even
though cold_reads_to_omit=0 by default?
If coldness is the issue, we are stuck in an infinite loop: reads would
trigger compaction, but reads time out because the sstable count is in the
thousands.
3. What's the way out if we face this issue in production? (A possible
stop-gap is sketched after this list.)
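A minimal sketch of the stop-gap referenced in item 3, assuming a manually
triggered major compaction actually picks up the index sstables (not
verified here; the data directory path is also an assumption):

    # Force a major compaction on the base CF, watch it progress, and
    # check whether the index sstable count drops afterwards
    nodetool compact test_ks1 ks1cf1
    nodetool compactionstats
    ls /var/lib/cassandra/data/test_ks1/ks1cf1 | grep -c 'Idx1.*Data.db'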
Is this issue fixed in the latest production release, 2.0.13? It looks
similar to CASSANDRA-8641, but that issue was fixed only in 2.1.3. I think
it should be fixed in the 2.0 branch too.
Configuration:
Compaction Strategy: STCS
memtable_flush_writers=4
memtable_flush_queue_size=4
in_memory_compaction_limit_in_mb=32
concurrent_compactors=12
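To confirm these settings on each node, something like the following works
(the cassandra.yaml path is an assumption; adjust per install):

    grep -E '^(memtable_flush_writers|memtable_flush_queue_size|in_memory_compaction_limit_in_mb|concurrent_compactors):' /etc/cassandra/cassandra.yaml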