Boris,

We hit exactly the same issue, and you are correct: the newly created SSTables 
are the reason most of the column tombstones are not being purged.

There is an improvement in the 1.2 train where both the minimum and maximum 
timestamps for a row are now stored and used during compaction to determine 
whether a portion of the row can be purged.
However, this only appears to help major compaction, as the other restriction 
still remains: all the files encompassing the deleted rows must be part of the 
compaction for the row to be purged.

We have switched to column deletes rather than row deletes wherever practical. 
It is a little more work in the app, but a big improvement in reads due to much 
more efficient compaction.

Regards,
Jacques

From: Boris Yen <yulin...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, May 16, 2013 04:07
To: "user@cassandra.apache.org" <user@cassandra.apache.org>, 
"d...@cassandra.apache.org" <d...@cassandra.apache.org>
Subject: Major compaction does not seems to free the disk space a lot if wide 
rows are used.

Hi All,

Sorry for the wide distribution.

Our Cassandra cluster is running 1.0.10. Recently, we have been facing a weird 
situation. We have a column family containing wide rows (each row might have a 
few million columns). We delete columns on a daily basis and we also run a 
major compaction on it every day to free up disk space (gc_grace is set to 600 
seconds).

However, every time we run the major compaction, only 1 or 2 GB of disk space 
is freed. We tried deleting most of the data before running compaction; 
however, the result was pretty much the same.

So, we checked the source code. It seems that column tombstones can only be 
purged when the row key is not in any other SSTables. I know a major compaction 
should include all SSTables; however, in our use case, columns get inserted 
rapidly. This makes Cassandra flush the memtables to disk and create new 
SSTables. The newly created SSTables will have the same keys as the SSTables 
that are being compacted (the compaction takes 2 or 3 hours to finish). My 
question is: are these newly created SSTables the reason most of the column 
tombstones are not being purged?
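
To make the rule above concrete, here is a minimal sketch in Python. It is not 
Cassandra's actual code: the SSTable class, the can_purge_tombstone function, 
and the exact set lookup of row keys are all hypothetical simplifications of 
the behaviour described in this message (gc_grace must have elapsed, and the 
row key must not appear in any SSTable outside the compaction).

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Hypothetical stand-in for an SSTable: a name and the row keys it holds."""
    name: str
    row_keys: set = field(default_factory=set)


def can_purge_tombstone(row_key, deleted_at, now, gc_grace_seconds,
                        compacting, all_sstables):
    """Return True if a column tombstone for row_key may be dropped.

    Simplified model (an assumption, not the real implementation):
      1. gc_grace must have elapsed since the deletion, and
      2. no SSTable outside the compaction may contain the same row key.
    """
    if now - deleted_at < gc_grace_seconds:
        return False
    compacting_names = {s.name for s in compacting}
    for sstable in all_sstables:
        if sstable.name not in compacting_names and row_key in sstable.row_keys:
            return False
    return True


# Scenario 1: writes continue during the compaction, so a freshly flushed
# SSTable shares the wide row's key and blocks the purge.
old_a = SSTable("old-1", {"wide_row"})
old_b = SSTable("old-2", {"wide_row"})
flushed = SSTable("new-3", {"wide_row"})  # flushed after compaction started

blocked = can_purge_tombstone("wide_row", deleted_at=0, now=3600,
                              gc_grace_seconds=600,
                              compacting=[old_a, old_b],
                              all_sstables=[old_a, old_b, flushed])

# Scenario 2: writes are stopped, so every SSTable holding the key is
# part of the compaction and the tombstone can be dropped.
purged = can_purge_tombstone("wide_row", deleted_at=0, now=3600,
                             gc_grace_seconds=600,
                             compacting=[old_a, old_b],
                             all_sstables=[old_a, old_b])

print(blocked, purged)
```

Under these assumptions, the first scenario corresponds to our production case 
and the second to a compaction run with all writes stopped.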

P.S. We also did some other tests. We inserted data into the same CF with the 
same wide-row pattern and deleted most of the data. This time we stopped all 
writes to Cassandra and ran the compaction. The disk usage decreased 
dramatically.

Any suggestions, or is this a known issue?

Thanks and Regards,
Boris
