[ https://issues.apache.org/jira/browse/CASSANDRA-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321703#comment-15321703 ]
Chen Shen edited comment on CASSANDRA-10862 at 6/9/16 5:46 PM:
---------------------------------------------------------------

[~pauloricardomg] I've done some investigation, and it might not be so easy to schedule a compaction of the L0 tables on reception. The only straightforward way to trigger a compaction is to submit a task via CompactionManager.submitBackground, and 1) as far as I know, the task is not guaranteed to be executed, and 2) submitBackground needs a `ColumnFamilyStore` as input, so we would need to either create a new CFS or split the compaction strategy out of CompactionManager, either of which might take a lot of work.

So instead I am trying a different, tricky approach: don't add the tables to the CFS until the number of L0 sstables is smaller than a threshold, and subscribe to `SSTableListChangedNotification` so that the `OnCompletionRunnable` can sleep and wait for the notification. Is this the right direction?

I have a commit here https://github.com/scv119/cassandra/commit/149d127c76f8f4e267524ed7f642d2ffdf6188e5 if you want to take a look. I'm also planning to apply this patch to our production tier to see if it helps.

was (Author: scv...@gmail.com):
[~pauloricardomg] I've done some investigation, and it might not be so easy to schedule a compaction of the L0 tables on reception. The only straightforward way to trigger a compaction is to submit a task via CompactionManager.submitBackground, and 1) as far as I know, the task is not guaranteed to be executed, and 2) submitBackground needs a `ColumnFamilyStore` as input, so we would need to either create a new CFS or split the compaction strategy out of CompactionManager, either of which might take a lot of work.

So instead I am trying a different, tricky approach: don't add the tables to the CFS until the number of L0 sstables is smaller than a threshold, and subscribe to `SSTableListChangedNotification` so that the `OnCompletionRunnable` can sleep and wait for the notification. Is this the right direction?

I have a commit here https://github.com/scv119/cassandra/commit/3b48c092a7381d3074086476b12570db9b16dc16 if you want to take a look. I'm also planning to apply this patch to our production tier to see if it helps.

> LCS repair: compact tables before making available in L0
> --------------------------------------------------------
>
>                 Key: CASSANDRA-10862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10862
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction, Streaming and Messaging
>            Reporter: Jeff Ferland
>            Assignee: Chen Shen
>
> When doing repair on a system with lots of mismatched ranges, the number of
> tables in L0 goes up dramatically, as correspondingly goes the number of
> tables referenced for a query. Latency increases dramatically in tandem.
> Eventually all the copied tables are compacted down in L0, then copied into
> L1 (which may be a very large copy), finally reducing the number of SSTables
> per query into the manageable range.
> It seems to me that the cleanest answer is to compact after streaming, then
> mark tables available rather than marking available when the file itself is
> complete.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
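
The gating idea in the comment above (hold streamed tables back until the L0 sstable count drops below a threshold, and wake up when the sstable list changes) could be sketched roughly as follows. This is not the code from the linked commits and not Cassandra's API: `L0AdmissionGate`, `onSSTableListChanged`, and `awaitCapacity` are hypothetical names, and the wiring from `SSTableListChangedNotification` to the gate is assumed rather than shown.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical sketch: blocks a caller (for example, a stream-completion
 * task) until the reported L0 sstable count is below a threshold.
 */
final class L0AdmissionGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition listChanged = lock.newCondition();
    private final int threshold;
    private int l0Count;

    L0AdmissionGate(int threshold, int initialL0Count) {
        this.threshold = threshold;
        this.l0Count = initialL0Count;
    }

    /**
     * Assumed to be called by whatever listens for sstable list changes,
     * passing the current number of L0 sstables.
     */
    void onSSTableListChanged(int newL0Count) {
        lock.lock();
        try {
            l0Count = newL0Count;
            listChanged.signalAll(); // wake every waiter so it re-checks the count
        } finally {
            lock.unlock();
        }
    }

    /**
     * Waits until the L0 count is below the threshold.
     * Returns true if capacity is available, false on timeout or interrupt.
     */
    boolean awaitCapacity(long timeout, TimeUnit unit) {
        long nanosLeft = unit.toNanos(timeout);
        lock.lock();
        try {
            while (l0Count >= threshold) {
                if (nanosLeft <= 0L) {
                    return false; // timed out while L0 is still too full
                }
                nanosLeft = listChanged.awaitNanos(nanosLeft);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch the completion task would call `awaitCapacity` before adding the received tables, and add them only when it returns true. The timeout is an added assumption so that a stalled compaction cannot block streaming indefinitely; what to do on timeout is left open.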