[ 
https://issues.apache.org/jira/browse/CASSANDRA-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2191:
--------------------------------

    Attachment:     (was: 0001-Add-a-compacting-set-to-DataTracker.txt)

> Multithread across compaction buckets
> -------------------------------------
>
>                 Key: CASSANDRA-2191
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2191
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>            Priority: Critical
>              Labels: compaction
>             Fix For: 0.8
>
>         Attachments: 0001-Add-a-compacting-set-to-DataTracker.txt, 
> 0002-Use-the-compacting-set-of-sstables-to-schedule-multith.txt, 
> 0003-Expose-multiple-compactions-via-JMX-and-a-concrete-ser.txt, 
> 0004-Allow-multithread-compaction-to-be-disabled.txt, 
> 0005-Add-a-harness-to-allow-compaction-tasks-that-need-to-a.txt
>
>
> This ticket overlaps with CASSANDRA-1876 to a degree, but the approaches and 
> reasoning are different enough to open a separate issue.
> The problem with compactions currently is that they compact the set of 
> sstables that existed the moment the compaction started. This means that for 
> longer running compactions (even when running as fast as possible on the 
> hardware), a very large number of new sstables might be created in the 
> meantime. We have observed this proliferation of sstables killing performance 
> during major/high-bucketed compactions.
> One approach would be to pause compactions in upper buckets (containing 
> larger files) when compactions in lower buckets become possible. While this 
> would likely solve the problem with read performance, it does not actually 
> help us perform compaction any faster, which is a reasonable requirement for 
> other situations.
> Instead, we need to be able to perform any compactions that are currently 
> required in parallel, independent of what bucket they might be in.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to