[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567335#comment-14567335 ]

Anuj Wadehra commented on CASSANDRA-9146:

Yes. I think that when data is inconsistent, repair creates a huge number of tiny sstables. But how do we avoid this in a running system that writes at QUORUM? When repair runs on such a system, there will always be some data that is present on two nodes and absent on the third. From what I understand, Cassandra then writes tiny sstables to repair that data, one sstable per repaired range, and I think this is a problem. Why are memtables flushed so frequently during repair? How can we combine repaired data from multiple ranges into fewer sstables? I think https://issues.apache.org/jira/browse/CASSANDRA-9491 describes the same problem.

Moreover, I think that in our case the tiny sstables are not getting compacted because of the cold_reads_to_omit issue in 2.0.3 (https://issues.apache.org/jira/browse/CASSANDRA-8885). Is there a workaround to get rid of thousands of sstables on 2.0.3 until we upgrade? This is a critical issue for people until they upgrade.

Ever Growing Secondary Index sstables after every Repair

Key: CASSANDRA-9146
URL: https://issues.apache.org/jira/browse/CASSANDRA-9146
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Anuj Wadehra
Attachments: sstables.txt, system-modified.log

The cluster has reached a state where every repair -pr operation on the CF results in numerous tiny sstables being flushed to disk. Most of these sstables belong to secondary indexes. With thousands of sstables on disk, reads have started timing out. Even though compaction begins for one of the secondary indexes, the sstable count after repair remains very high (thousands), and every repair adds thousands more.

Problems:
1. Why is a burst of tiny secondary index sstables flushed during repair? What is triggering the frequent/premature flushing of secondary index sstables (more than a hundred in every burst)? At most we see one ParNew GC pause around 200ms.
2. Why is auto-compaction not compacting all the sstables? Is it related to the coldness issue (CASSANDRA-8885), where compaction doesn't work even though cold_reads_to_omit=0 by default? If coldness is the issue, we are stuck in an infinite loop: reads would trigger compaction, but reads time out because the sstable count is in the thousands.
3. What's the way out if we hit this issue in production? Is it fixed in the latest production release, 2.0.13? The issue looks similar to CASSANDRA-8641, but that fix went only into 2.1.3; I think it should be applied to the 2.0 branch too.

Configuration:
Compaction Strategy: STCS
memtable_flush_writers=4
memtable_flush_queue_size=4
in_memory_compaction_limit_in_mb=32
concurrent_compactors=12
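One possible mitigation on 2.0, sketched below under assumptions this thread does not confirm: the keyspace (myks), table (mycf), index (mycf_idx) and data path are placeholders, and rebuilding the index only rewrites the accumulated index data into fresh sstables; it does not address why repair flushes them in the first place.

    #!/bin/sh
    # Placeholder names; adjust keyspace/table/index and the data path to your cluster.
    KS=myks; CF=mycf; IDX=mycf_idx
    DATA_DIR=/var/lib/cassandra/data/$KS/$CF

    # In 2.0, secondary index sstables live in the base table's directory,
    # named <ks>-<cf>.<index>-<version>-<gen>-Data.db. Count them:
    ls "$DATA_DIR" | grep -c "\.$IDX-.*-Data\.db"

    # Rebuild the index so its data is rewritten into far fewer sstables.
    nodetool rebuild_index "$KS" "$CF" "$CF.$IDX"

    # Alternatively, force a major compaction of the base table (STCS).
    nodetool compact "$KS" "$CF"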
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511167#comment-14511167 ]

Marcus Eriksson commented on CASSANDRA-9146:

With vnodes you will get many sstables after repair. You should also upgrade to the latest 2.0 version and check whether this is fixed there.
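To put a rough number on the vnode effect (a back-of-envelope estimate, not from this ticket): with the default num_tokens: 256, a node owns around 256 token ranges, so a repair that flushes separately per out-of-sync range can easily produce hundreds of tiny sstables per table. The node address below is a placeholder:

    # Each token a node owns is printed as one line in the ring output,
    # so with vnodes this counts the node's tokens (~num_tokens).
    nodetool ring | grep -c 192.0.2.10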
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503307#comment-14503307 ]

Anuj commented on CASSANDRA-9146:

Yes, we use vnodes. We haven't changed cold_reads_to_omit.
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503070#comment-14503070 ]

Marcus Eriksson commented on CASSANDRA-9146:

Are you using vnodes? Have you changed cold_reads_to_omit?
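For anyone checking: the compaction options currently in effect for a table, including any cold_reads_to_omit override, show up in its schema. Keyspace and table names here are placeholders:

    # Print the table definition, including its compaction options (2.0 cqlsh).
    echo "DESCRIBE TABLE myks.mycf;" | cqlsh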
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503028#comment-14503028 ]

Anuj commented on CASSANDRA-9146:

Marcus, are you looking into this?
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487358#comment-14487358 ]

Philip Thompson commented on CASSANDRA-9146:

To clarify, you are running on 2.0.3?
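The running version is easy to confirm from the node itself rather than from packaging, e.g.:

    # Prints the ReleaseVersion of the node nodetool connects to.
    nodetool version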
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487372#comment-14487372 ]

Anuj commented on CASSANDRA-9146:

Yes, 2.0.3.
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487387#comment-14487387 ]

Philip Thompson commented on CASSANDRA-9146:

Okay. [~krummas] can confirm, but I believe CASSANDRA-8641 only affected 2.1. You will need to upgrade if the issue really is CASSANDRA-8885. 2.0.3 is old; let us know if you're able to reproduce after upgrading.
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487406#comment-14487406 ]

Anuj commented on CASSANDRA-9146:

Yes, please analyze, as we are facing this on 2.0.3. We may need the fix in the 2.0 branch. Upgrading will take some time, and the scenario is not easily reproducible, but once it occurs in a cluster the burst of sstables grows with every repair. We need to understand why this premature flushing is happening. What's the workaround until we upgrade? If it is the coldness issue that prevents auto-compaction, any advice on how to make all the sstables hot? (We tried with reads, but without success.) Range scans don't contribute to hotness; that's another open issue I logged some time back.
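One knob worth trying while stuck on 2.0.3, sketched here with placeholder names and no guarantee per CASSANDRA-8885 (where the problem is reported even at a value of 0): pin the STCS coldness option to 0 explicitly rather than relying on the default, so cold sstables are never omitted from compaction.

    # Explicitly disable omission of cold sstables from STCS compaction (2.0.x).
    echo "ALTER TABLE myks.mycf
          WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                             'cold_reads_to_omit': '0.0'};" | cqlsh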
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487411#comment-14487411 ]

Philip Thompson commented on CASSANDRA-9146:

Yes, range scans contributing to hotness is being discussed in CASSANDRA-8938. As far as how to mitigate the issue, you're likely to get more advice on the user mailing list than here.
[jira] [Commented] (CASSANDRA-9146) Ever Growing Secondary Index sstables after every Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487385#comment-14487385 ]

Anuj commented on CASSANDRA-9146:

Small correction: at most one GC pause greater than 200ms PER MINUTE.
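That rate can be read straight off the server log, where GCInspector records ParNew pauses; the path below is the package default and may differ per install:

    # Count ParNew collections logged by GCInspector; each line includes the pause time.
    grep -c "GC for ParNew" /var/log/cassandra/system.log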