I'd suggest describing your approach on
https://issues.apache.org/jira/browse/CASSANDRA-1608, and if it's
attractive, porting it to 0.8.  It's too late for us to make deep
changes in 0.6 and probably even 0.7 for the sake of stability.

On Mon, Jan 10, 2011 at 8:00 AM, shimi <shim...@gmail.com> wrote:
> I modified the code to limit the size of the SSTables.
> I will be glad if someone can take a look at it
> https://github.com/Shimi/cassandra/tree/cassandra-0.6
> Shimi
>
> On Fri, Jan 7, 2011 at 2:04 AM, Jonathan Shook <jsh...@gmail.com> wrote:
>>
>> I believe the following condition within submitMinorIfNeeded(...)
>> determines whether to continue, so it's not a hard loop.
>>
>> // if (sstables.size() >= minThreshold) ...
>>
>>
>>
>> On Thu, Jan 6, 2011 at 2:51 AM, shimi <shim...@gmail.com> wrote:
>> > According to the code, it makes sense.
>> > submitMinorIfNeeded() calls doCompaction(), which
>> > calls submitMinorIfNeeded().
>> > With minimumCompactionThreshold = 1, submitMinorIfNeeded() will always
>> > run compaction.
>> >
>> > Shimi
>> > On Thu, Jan 6, 2011 at 10:26 AM, shimi <shim...@gmail.com> wrote:
>> >>
>> >>
>> >> On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis <jbel...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Pretty sure there's logic in there that says "don't bother compacting
>> >>> a single sstable."
>> >>
>> >> No. You can do it.
>> >> Based on the log I have a feeling that it triggers an infinite
>> >> compaction loop.
>> >>
>> >>>
>> >>> On Wed, Jan 5, 2011 at 2:26 PM, shimi <shim...@gmail.com> wrote:
>> >>> > How is minor compaction triggered? Is it triggered only when a new
>> >>> > SSTable is added?
>> >>> >
>> >>> > I was wondering if triggering a compaction with
>> >>> > minimumCompactionThreshold set to 1 would be useful. If this can
>> >>> > happen, I assume it will compact files of similar size and remove
>> >>> > deleted rows from the rest.
>> >>> > Shimi
>> >>> > On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller
>> >>> > <peter.schul...@infidyne.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> > I don't have a problem with disk space. I have a problem with the
>> >>> >> > data size.
>> >>> >>
>> >>> >> [snip]
>> >>> >>
>> >>> >> > Bottom line is that I want to reduce the number of requests that
>> >>> >> > go to disk. Since there is enough data that is no longer valid, I
>> >>> >> > can do it by reclaiming the space. The only way to do it is by
>> >>> >> > running a major compaction. I can wait and let Cassandra do it for
>> >>> >> > me, but then the data size will get even bigger and the response
>> >>> >> > time will be worse. I can do it manually, but I prefer it to happen
>> >>> >> > in the background with less impact on the system.
>> >>> >>
>> >>> >> Ok - that makes perfect sense then. Sorry for misunderstanding :)
>> >>> >>
>> >>> >> So essentially, for workloads that are teetering on the edge of
>> >>> >> cache warmness and are subject to significant overwrites or
>> >>> >> removals, it may be beneficial to perform much more aggressive
>> >>> >> background compaction, even though it might waste lots of CPU, to
>> >>> >> keep the in-memory working set down.
>> >>> >>
>> >>> >> There was talk (I think in the compaction redesign ticket) about
>> >>> >> potentially improving the use of bloom filters such that obsolete
>> >>> >> data in sstables could be eliminated from the read set without
>> >>> >> necessitating actual compaction; that might help address cases like
>> >>> >> these too.
>> >>> >>
>> >>> >> I don't think there's a pre-existing silver bullet in a current
>> >>> >> release; you probably have to live with the need for
>> >>> >> greater-than-theoretically-optimal memory requirements to keep the
>> >>> >> working set in memory.
>> >>> >>
>> >>> >> --
>> >>> >> / Peter Schuller
>> >>> >
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Jonathan Ellis
>> >>> Project Chair, Apache Cassandra
>> >>> co-founder of Riptano, the source for professional Cassandra support
>> >>> http://riptano.com
>> >>
>> >
>> >
>
>
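For reference, the recursion discussed in the thread above can be sketched
as a toy model. This is not the actual Cassandra code: the class name, the
simulate() helper, the fixed sstable size of 100 and the maxRounds guard are
all assumptions made for illustration. Only the names submitMinorIfNeeded,
doCompaction and the sstables.size() >= minThreshold check come from the
thread. The sketch assumes a compaction merges every candidate sstable into
one and then resubmits itself.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; every name except submitMinorIfNeeded/doCompaction is hypothetical.
final class CompactionLoopSketch {
    private final List<Long> sstables = new ArrayList<>(); // sstable sizes
    private final int minThreshold;
    private final int maxRounds; // demo guard so the sketch always terminates
    private int rounds = 0;

    CompactionLoopSketch(int sstableCount, int minThreshold, int maxRounds) {
        for (int i = 0; i < sstableCount; i++) {
            sstables.add(100L); // arbitrary size, assumed equal for simplicity
        }
        this.minThreshold = minThreshold;
        this.maxRounds = maxRounds;
    }

    void submitMinorIfNeeded() {
        if (rounds >= maxRounds) {
            return; // not part of the model: stops the demo from recursing forever
        }
        // The check quoted in the thread: only compact when enough sstables exist.
        if (sstables.size() >= minThreshold) {
            doCompaction();
        }
    }

    void doCompaction() {
        long merged = 0;
        for (long size : sstables) {
            merged += size;
        }
        sstables.clear();
        sstables.add(merged); // all candidates collapse into one sstable
        rounds++;
        submitMinorIfNeeded(); // compaction resubmits itself when it finishes
    }

    // Returns how many compaction rounds ran before the model stopped.
    static int simulate(int sstableCount, int minThreshold, int maxRounds) {
        CompactionLoopSketch sketch =
            new CompactionLoopSketch(sstableCount, minThreshold, maxRounds);
        sketch.submitMinorIfNeeded();
        return sketch.rounds;
    }

    public static void main(String[] args) {
        System.out.println("threshold 4: " + simulate(4, 4, 10) + " round(s)");
        System.out.println("threshold 1: " + simulate(4, 1, 10) + " round(s)");
    }
}
```

In this model a threshold of 4 stops after a single round, because the one
merged sstable no longer satisfies the check. A threshold of 1 is always
satisfied by that single sstable, so only the demo guard ends the recursion.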



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
