[ 
https://issues.apache.org/jira/browse/CASSANDRA-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321703#comment-15321703
 ] 

Chen Shen edited comment on CASSANDRA-10862 at 6/9/16 2:03 AM:
---------------------------------------------------------------

[~pauloricardomg]
I've done some investigation and found that it might not be so easy to 
schedule a compaction of the L0 sstables on reception. The only 
straightforward way to trigger a compaction is to submit a task through 
CompactionManager.submitBackground, and 1) as far as I know the task is not 
guaranteed to be executed, and 2) submitBackground needs a 
`ColumnFamilyStore` as input, so we would need to either create a new CFS or 
split the compaction strategy out of CompactionManager, either of which might 
need a lot of work.

So instead I am trying a different, somewhat tricky approach: don't add the 
received sstables to the CFS until the number of L0 sstables is smaller than 
a threshold. To do that, I subscribe to `SSTableListChangedNotification` so 
that the `OnCompletionRunnable` can sleep and wait for a notification. 
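
To make the waiting scheme concrete, here is a minimal, self-contained sketch 
of a threshold gate. It is not the code from the commit and it does not use 
any Cassandra classes: `L0Gate`, `onL0CountChanged`, and 
`awaitBelowThreshold` are hypothetical names, and the assumption is that some 
listener (for example one reacting to `SSTableListChangedNotification`) 
reports the current L0 sstable count to the gate.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper, not part of Cassandra: blocks callers while L0 is too full. */
final class L0Gate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private final int threshold;
    private int l0Count; // last reported number of L0 sstables

    L0Gate(int threshold) {
        this.threshold = threshold;
    }

    /** Assumed to be called by whatever listener observes sstable list changes. */
    void onL0CountChanged(int newCount) {
        lock.lock();
        try {
            l0Count = newCount;
            changed.signalAll(); // wake every waiter so it can re-check the count
        } finally {
            lock.unlock();
        }
    }

    /**
     * Waits until the L0 count is below the threshold or the timeout expires.
     * Returns true if the count is below the threshold, false on timeout or interrupt.
     */
    boolean awaitBelowThreshold(long timeoutMillis) {
        long remainingNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            // Loop guards against spurious wakeups and counts that are still too high.
            while (l0Count >= threshold) {
                if (remainingNanos <= 0) {
                    return false;
                }
                remainingNanos = changed.awaitNanos(remainingNanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch the completion task would call `awaitBelowThreshold` before 
adding the streamed sstables. The timeout is an addition of the sketch, so 
that a stalled compaction cannot block the stream completion forever.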

Is this the right direction? I have a commit here 
https://github.com/scv119/cassandra/commit/3b48c092a7381d3074086476b12570db9b16dc16
 if you want to take a look. I'm also planning to apply this patch to our 
production tier to see if it helps.
 


was (Author: scv...@gmail.com):
[~pauloricardomg]
I've done some investigation and found that it might not be so easy to 
schedule a compaction of the L0 sstables on reception. The only 
straightforward way to trigger a compaction is to submit a task through 
CompactionManager.submitBackground, and 1) as far as I know the task is not 
guaranteed to be executed, and 2) submitBackground needs a 
`ColumnFamilyStore` as input, so we would need to either create a new CFS or 
split the compaction strategy out of CompactionManager, either of which might 
need a lot of work.

So instead I am trying a different, somewhat tricky approach: don't add the 
received sstables to the CFS until the number of L0 sstables is smaller than 
a threshold. To do that, I subscribe to `SSTableListChangedNotification` so 
that the `OnCompletionRunnable` can sleep and wait for a notification. 

Is this the right direction? I have a commit here 
https://github.com/scv119/cassandra/commit/f49013897b1694e006e001df97c6f34399d016ae
 if you want to take a look. I'm also planning to apply this patch to our 
production tier to see if it helps.
 

> LCS repair: compact tables before making available in L0
> --------------------------------------------------------
>
>                 Key: CASSANDRA-10862
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10862
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction, Streaming and Messaging
>            Reporter: Jeff Ferland
>            Assignee: Chen Shen
>
> When doing repair on a system with lots of mismatched ranges, the number of 
> tables in L0 goes up dramatically, and so does the number of tables 
> referenced per query. Latency increases dramatically in tandem.
> Eventually all the copied tables are compacted down in L0, then copied into 
> L1 (which may be a very large copy), finally reducing the number of SSTables 
> per query into the manageable range.
> It seems to me that the cleanest answer is to compact after streaming, then 
> mark tables available rather than marking available when the file itself is 
> complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
