I believe the following condition within submitMinorIfNeeded(...) determines whether to continue, so it's not a hard loop; a rough sketch of that control flow is included after the quoted thread below.
// if (sstables.size() >= minThreshold) ...

On Thu, Jan 6, 2011 at 2:51 AM, shimi <shim...@gmail.com> wrote:
> According to the code it makes sense.
> submitMinorIfNeeded() calls doCompaction() which calls submitMinorIfNeeded().
> With minimumCompactionThreshold = 1, submitMinorIfNeeded() will always run compaction.
>
> Shimi
>
> On Thu, Jan 6, 2011 at 10:26 AM, shimi <shim...@gmail.com> wrote:
>>
>> On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>
>>> Pretty sure there's logic in there that says "don't bother compacting
>>> a single sstable."
>>
>> No. You can do it.
>> Based on the log I have a feeling that it triggers an infinite compaction loop.
>>
>>> On Wed, Jan 5, 2011 at 2:26 PM, shimi <shim...@gmail.com> wrote:
>>> > How is minor compaction triggered? Is it triggered only when a new
>>> > SSTable is added?
>>> >
>>> > I was wondering if triggering a compaction with minimumCompactionThreshold
>>> > set to 1 would be useful. If this can happen, I assume it will do compaction
>>> > on files with similar size and remove deleted rows on the rest.
>>> >
>>> > Shimi
>>> >
>>> > On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller
>>> > <peter.schul...@infidyne.com> wrote:
>>> >>
>>> >> > I don't have a problem with disk space. I have a problem with the data
>>> >> > size.
>>> >>
>>> >> [snip]
>>> >>
>>> >> > Bottom line is that I want to reduce the number of requests that go to
>>> >> > disk. Since there is enough data that is no longer valid, I can do it by
>>> >> > reclaiming the space. The only way to do it is by running major compaction.
>>> >> > I can wait and let Cassandra do it for me, but then the data size will get
>>> >> > even bigger and the response time will be worse. I can do it manually, but I
>>> >> > prefer it to happen in the background with less impact on the system.
>>> >>
>>> >> Ok - that makes perfect sense then. Sorry for misunderstanding :)
>>> >>
>>> >> So essentially, for workloads that are teetering on the edge of cache
>>> >> warmness and are subject to significant overwrites or removals, it may
>>> >> be beneficial to perform much more aggressive background compaction,
>>> >> even though it might waste lots of CPU, to keep the in-memory working
>>> >> set down.
>>> >>
>>> >> There was talk (I think in the compaction redesign ticket) about
>>> >> potentially improving the use of bloom filters such that obsolete data
>>> >> in sstables could be eliminated from the read set without
>>> >> necessitating actual compaction; that might help address cases like
>>> >> these too.
>>> >>
>>> >> I don't think there's a pre-existing silver bullet in a current
>>> >> release; you probably have to live with the need for
>>> >> greater-than-theoretically-optimal memory requirements to keep the
>>> >> working set in memory.
>>> >>
>>> >> --
>>> >> / Peter Schuller
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
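
For anyone following along, here is a minimal, self-contained sketch of the control flow discussed above. The method and setting names (submitMinorIfNeeded, doCompaction, minimumCompactionThreshold) come from the thread; the bucketing logic and size arithmetic are my own simplification for illustration, not the actual CompactionManager code:

import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the submitMinorIfNeeded()/doCompaction()
// cycle. This is NOT the real Cassandra source; it only illustrates how the
// minThreshold guard terminates the recursion.
public class MinorCompactionSketch {

    static int minThreshold = 4;                         // minimumCompactionThreshold
    static List<Long> sstableSizes = new ArrayList<>();  // one entry per sstable (bytes)

    // Submit a minor compaction only if enough similarly sized sstables exist.
    static void submitMinorIfNeeded() {
        List<Long> bucket = similarSizeBucket();
        // The guard discussed above: with fewer than minThreshold candidates,
        // nothing is submitted and the recursion stops.
        if (bucket.size() >= minThreshold) {
            doCompaction(bucket);
        }
    }

    // Merge the bucket into a single sstable, then check again.
    static void doCompaction(List<Long> bucket) {
        long merged = bucket.stream().mapToLong(Long::longValue).sum();
        sstableSizes.removeAll(bucket);
        sstableSizes.add(merged);
        System.out.println("compacted " + bucket.size() + " sstables -> " + merged + " bytes");
        submitMinorIfNeeded();  // the re-entry Shimi describes
    }

    // Toy bucketing: treat every sstable within 2x of the smallest as "similar".
    static List<Long> similarSizeBucket() {
        List<Long> bucket = new ArrayList<>();
        long smallest = sstableSizes.stream().mapToLong(Long::longValue).min().orElse(0);
        for (long size : sstableSizes) {
            if (size <= smallest * 2) {
                bucket.add(size);
            }
        }
        return bucket;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) sstableSizes.add(100L);
        submitMinorIfNeeded();  // terminates: the merged sstable no longer has enough peers
        // With minThreshold = 1, a lone sstable always satisfies the guard, so
        // doCompaction() -> submitMinorIfNeeded() would recurse without end --
        // the infinite compaction loop suspected earlier in the thread.
    }
}

With the default threshold, the second pass finds only the one freshly merged sstable in the bucket and stops; setting the threshold to 1 removes the only condition that ends the cycle, which is why that configuration looks risky in this model.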