(I probably should have read downthread before writing my reply. Briefly: +1
to most of the thread's commentary regarding major compaction, but don't
listen to the FUD about it. Unless you have a really large amount of data,
you'll probably be fine.)

On Fri, Apr 11, 2014 at 7:05 AM, William Oberman
<ober...@civicscience.com> wrote:

> I'm wondering what will clear tombstoned rows?  nodetool cleanup, nodetool
> repair, or time (as in just wait)?
>

The only operation guaranteed to collect 100% of tombstones is major
compaction. gc_grace_seconds is also involved: a tombstone only becomes
eligible for collection once it is older than gc_grace_seconds, so be sure
you know what it is set to for this CF.
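
A rough sketch of the gc_grace_seconds check, as a simplified model and not
Cassandra's actual code. The function name and arguments are made up for
illustration, and the exact boundary comparison is an assumption. 864000
(10 days) is the default gc_grace_seconds.

```python
import time


def tombstone_is_purgeable(deletion_time, gc_grace_seconds, now=None):
    """Return True once the tombstone is older than gc_grace_seconds.

    deletion_time and now are seconds since the epoch. This is a
    hypothetical helper, not part of any Cassandra API.
    """
    if now is None:
        now = time.time()
    # Anything deleted before this cutoff is old enough to be collected.
    gc_before = now - gc_grace_seconds
    return deletion_time < gc_before


# Usage: with the default of 10 days, a delete issued 11 days ago is
# eligible, while one issued 1 day ago is not.
DAY = 86400
print(tombstone_is_purgeable(0, 10 * DAY, now=11 * DAY))
print(tombstone_is_purgeable(10 * DAY, 10 * DAY, now=11 * DAY))
```

Being old enough is necessary but not sufficient: the tombstone still has to
be seen by a compaction that is allowed to drop it.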


> I had a CF that was more or less storing session information.  After some
> time, we decided that one piece of this information was pointless to track
> (and was 90%+ of the columns, and in 99% of those cases was ALL columns for
> a row).   I wrote a process to remove all of those columns (which again in
> a vast majority of cases had the effect of removing the whole row).
>

https://issues.apache.org/jira/browse/CASSANDRA-1581

That ticket describes a tool which "filtered" sstables to remove rows. In a
future case like this one, you might want to consider this approach.


> It wasn't 100% clear to me what to poke to cause compactions to clear the
> tombstones.
>

In order to purge a tombstone, all fragments of the row must be in sstables
involved in the current compaction.

Some discussion here: https://issues.apache.org/jira/browse/CASSANDRA-1074
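
To illustrate the "all fragments" rule, here is a toy model (an assumption
for illustration, not how Cassandra is implemented internally). Each sstable
is represented as a set of row keys, and a tombstone for a row can only be
dropped if no sstable outside the compaction still holds a fragment of that
row.

```python
def can_drop_tombstone(row_key, compacting, all_sstables):
    """Toy check for whether a compaction may drop a row's tombstone.

    all_sstables: dict mapping sstable name -> set of row keys it contains.
    compacting:   set of sstable names taking part in this compaction.
    Both structures are hypothetical stand-ins for real sstables.
    """
    for name, keys in all_sstables.items():
        # A fragment left outside the compaction could be "resurrected"
        # if the tombstone shadowing it were thrown away.
        if name not in compacting and row_key in keys:
            return False
    return True


sstables = {
    "a": {"row1", "row2"},
    "b": {"row1"},
    "c": {"row3"},
}

# A minor compaction of a+c leaves a fragment of row1 behind in b.
print(can_drop_tombstone("row1", {"a", "c"}, sstables))
# A major compaction includes every sstable, so the check always passes.
print(can_drop_tombstone("row1", set(sstables), sstables))
```

In this model a major compaction trivially satisfies the condition for every
row, which is why it is the one operation that can collect everything.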


>  First I tried nodetool cleanup on a candidate node.  But, afterwards the
> disk usage was the same.
>

Cleanup writes out sstables 1:1, removing any data which belongs to a range
that the node cleaning up no longer owns. It is meant for use when ranges
are split, in order to "clean up" the data from the range being given up.
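
As a sketch of what cleanup does, under simplifying assumptions: an sstable
is modelled as a dict from token to row, and ranges are (start, end] pairs
that may wrap around the ring. The names here are hypothetical and this is
not Cassandra's code.

```python
def in_range(token, start, end):
    """True if token falls in the ring range (start, end]."""
    if start < end:
        return start < token <= end
    # Wrapping range, e.g. (80, 20] covers 81..max and min..20.
    return token > start or token <= end


def cleanup(sstable, owned_ranges):
    """Rewrite one sstable, keeping only rows in ranges the node owns.

    sstable:      dict mapping token -> row (tombstones included).
    owned_ranges: list of (start, end] tuples the node currently owns.
    """
    return {
        token: row
        for token, row in sstable.items()
        if any(in_range(token, s, e) for s, e in owned_ranges)
    }


sstable = {10: "live", 50: "tombstone", 90: "live"}

# The node gave up (50, 100], so the row at token 90 is dropped.
print(cleanup(sstable, [(0, 50)]))
# The node still owns everything: the output equals the input.
print(cleanup(sstable, [(0, 100)]) == sstable)
```

Note that the filter only looks at ownership. If the node's ranges have not
changed, nothing is removed, tombstones included, so disk usage stays the
same.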


>  Then I tried nodetool repair on that same node.  But again, disk usage is
> still the same.  The CF has no snapshots.
>

Repair is unrelated to the purging of tombstones.


> So, am I misunderstanding something?  Is there another operation to try?
>  Do I have to "just wait"?  I've only done cleanup/repair on one node.  Do
> I have to run one or the other over all nodes to clear tombstones?
>

If you are using size tiered compaction, run a major compaction ("nodetool
compact"). If you aren't, I believe that there is nothing you can do.

=Rob
