I believe the following condition within submitMinorIfNeeded(...) determines whether to continue, so it's not a hard loop; a rough sketch of that control flow is included after the quoted thread below.
// if (sstables.size() >= minThreshold) ...

On Thu, Jan 6, 2011 at 2:51 AM, shimi <shim...@gmail.com> wrote:
> According to the code it makes sense.
> submitMinorIfNeeded() calls doCompaction() which calls submitMinorIfNeeded().
> With minimumCompactionThreshold = 1, submitMinorIfNeeded() will always run compaction.
>
> Shimi
>
> On Thu, Jan 6, 2011 at 10:26 AM, shimi <shim...@gmail.com> wrote:
>>
>> On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>
>>> Pretty sure there's logic in there that says "don't bother compacting
>>> a single sstable."
>>
>> No. You can do it.
>> Based on the log I have a feeling that it triggers an infinite compaction loop.
>>
>>> On Wed, Jan 5, 2011 at 2:26 PM, shimi <shim...@gmail.com> wrote:
>>> > How is minor compaction triggered? Is it triggered only when a new
>>> > SSTable is added?
>>> >
>>> > I was wondering if triggering a compaction with minimumCompactionThreshold
>>> > set to 1 would be useful. If this can happen, I assume it will do compaction
>>> > on files with similar size and remove deleted rows on the rest.
>>> >
>>> > Shimi
>>> >
>>> > On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller
>>> > <peter.schul...@infidyne.com> wrote:
>>> >>
>>> >> > I don't have a problem with disk space. I have a problem with the data
>>> >> > size.
>>> >>
>>> >> [snip]
>>> >>
>>> >> > Bottom line is that I want to reduce the number of requests that go to
>>> >> > disk. Since there is enough data that is no longer valid, I can do it by
>>> >> > reclaiming the space. The only way to do it is by running major compaction.
>>> >> > I can wait and let Cassandra do it for me, but then the data size will get
>>> >> > even bigger and the response time will be worse. I can do it manually, but I
>>> >> > prefer it to happen in the background with less impact on the system.
>>> >>
>>> >> Ok - that makes perfect sense then. Sorry for misunderstanding :)
>>> >>
>>> >> So essentially, for workloads that are teetering on the edge of cache
>>> >> warmness and are subject to significant overwrites or removals, it may
>>> >> be beneficial to perform much more aggressive background compaction,
>>> >> even though it might waste lots of CPU, to keep the in-memory working
>>> >> set down.
>>> >>
>>> >> There was talk (I think in the compaction redesign ticket) about
>>> >> potentially improving the use of bloom filters such that obsolete data
>>> >> in sstables could be eliminated from the read set without
>>> >> necessitating actual compaction; that might help address cases like
>>> >> these too.
>>> >>
>>> >> I don't think there's a pre-existing silver bullet in a current
>>> >> release; you probably have to live with the need for
>>> >> greater-than-theoretically-optimal memory requirements to keep the
>>> >> working set in memory.
>>> >>
>>> >> --
>>> >> / Peter Schuller
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
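
For anyone following along, here is a minimal, self-contained sketch of the control flow discussed above. The method and setting names (submitMinorIfNeeded, doCompaction, minimumCompactionThreshold) come from the thread; the bucketing logic and size arithmetic are my own simplification for illustration, not the actual CompactionManager code:

import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the submitMinorIfNeeded()/doCompaction()
// cycle. This is NOT the real Cassandra source; it only illustrates how the
// minThreshold guard terminates the recursion.
public class MinorCompactionSketch {

    static int minThreshold = 4;                         // minimumCompactionThreshold
    static List<Long> sstableSizes = new ArrayList<>();  // one entry per sstable (bytes)

    // Submit a minor compaction only if enough similarly sized sstables exist.
    static void submitMinorIfNeeded() {
        List<Long> bucket = similarSizeBucket();
        // The guard discussed above: with fewer than minThreshold candidates,
        // nothing is submitted and the recursion stops.
        if (bucket.size() >= minThreshold) {
            doCompaction(bucket);
        }
    }

    // Merge the bucket into a single sstable, then check again.
    static void doCompaction(List<Long> bucket) {
        long merged = bucket.stream().mapToLong(Long::longValue).sum();
        sstableSizes.removeAll(bucket);
        sstableSizes.add(merged);
        System.out.println("compacted " + bucket.size() + " sstables -> " + merged + " bytes");
        submitMinorIfNeeded();  // the re-entry Shimi describes
    }

    // Toy bucketing: treat every sstable within 2x of the smallest as "similar".
    static List<Long> similarSizeBucket() {
        List<Long> bucket = new ArrayList<>();
        long smallest = sstableSizes.stream().mapToLong(Long::longValue).min().orElse(0);
        for (long size : sstableSizes) {
            if (size <= smallest * 2) {
                bucket.add(size);
            }
        }
        return bucket;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) sstableSizes.add(100L);
        submitMinorIfNeeded();  // terminates: the merged sstable no longer has enough peers
        // With minThreshold = 1, a lone sstable always satisfies the guard, so
        // doCompaction() -> submitMinorIfNeeded() would recurse without end --
        // the infinite compaction loop suspected earlier in the thread.
    }
}

With the default threshold, the second pass finds only the one freshly merged sstable in the bucket and stops; setting the threshold to 1 removes the only condition that ends the cycle, which is why that configuration looks risky in this model.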