> In the first example, I am running compaction at step 7 through nodetool,
Sorry, I missed that.

>> insert a couple rows with ttl=5 (again, just a small number)
>> 

ExpiringColumns are only purged if their TTL has expired AND their absolute 
(node-local) expiry time occurred before the current "gcBefore" time. 
This may explain why the columns were not purged in the first 
compaction. 
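
To make that rule concrete, here is a rough sketch in Python. It is a simplified model and not the Cassandra source: the function names are made up for illustration, times are in whole seconds, gcBefore is assumed to be "now minus gc_grace", and the strict "<" comparison is also an assumption. The numbers are the ttl=5 and gc_grace=10 from your test.

```python
def gc_before(now: int, gc_grace_seconds: int) -> int:
    # Assumption: gcBefore is the current node-local time minus gc_grace.
    return now - gc_grace_seconds


def is_purgeable(written_at: int, ttl: int, gc_grace_seconds: int, now: int) -> bool:
    # Hypothetical helper modelling the two conditions described above.
    local_expiry = written_at + ttl      # absolute (node-local) expiry time
    ttl_expired = local_expiry <= now    # condition 1: the TTL has expired
    # Condition 2: the expiry time falls before the current gcBefore.
    return ttl_expired and local_expiry < gc_before(now, gc_grace_seconds)


if __name__ == "__main__":
    # Column written at t=0 with ttl=5 on a column family with gc_grace=10.
    for compaction_time in (6, 11, 15, 16):
        print(compaction_time, is_purgeable(0, 5, 10, compaction_time))
```

Under these assumptions a column written at t=0 only becomes purgeable for compactions that run after t=15 (ttl + gc_grace), so a compaction shortly after a 10 second sleep would still keep it.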

Can you try your first steps again? Then, for the second set of steps, add a 
new row, flush, and compact. The expired rows should be removed.

> I don't have to manually delete empty rows after the columns expire.

Rows are automatically purged when all columns are purged. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/10/2012, at 3:05 AM, Stephen Mullins <smull...@thebrighttag.com> wrote:

> Thanks Aaron, my reply is inline below:
> 
> On Tue, Oct 23, 2012 at 2:38 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> Performing these steps results in the rows still being present using 
>> cassandra-cli list. 
> I assume you are saying the row key is listed without any columns. aka a 
> ghost row. 
> Correct. 
> 
>>  What gets really odd is if I add these steps it works
> That's working as designed. 
> 
> gc_grace_seconds does not specify when tombstones must be purged, rather it 
> specifies the minimum duration the tombstone must be stored. It's really 
> saying "if you compact this column X seconds after the delete you can purge 
> the tombstone".
> 
> Minor / automatic compaction will kick in if there are (by default) 4 
> SSTables of the same size, and will only purge tombstones if all fragments of 
> the row exist in the SSTables being compacted. 
> 
> Major / manual compaction compacts all the sstables, and so purges the 
> tombstones IF gc_grace_seconds has expired. 
> 
> In your first example compaction had not run so the tombstones stayed on 
> disk. In the second the major compaction purged expired tombstones. 
> In the first example, I am running compaction at step 7 through nodetool, 
> after gc_grace_seconds has expired. Additionally, if I do not perform the 
> manual delete of the row in the second example, the ghost rows are not 
> cleaned up. I want to know that in our production environment, I don't have 
> to manually delete empty rows after the columns expire. But I can't get an 
> example working to that effect.
> 
> Hope that helps. 
>   
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 23/10/2012, at 2:49 PM, Stephen Mullins <smull...@thebrighttag.com> wrote:
> 
>> Hello, I'm seeing Cassandra behavior that I can't explain, on v1.0.12. I'm 
>> trying to test removing rows after all columns have expired. I've read the 
>> following:
>> http://wiki.apache.org/cassandra/DistributedDeletes
>> http://wiki.apache.org/cassandra/MemtableSSTable
>> https://issues.apache.org/jira/browse/CASSANDRA-2795
>> 
>> And came up with a test to demonstrate the empty row removal that does the 
>> following:
>> 1. create a keyspace
>> 2. create a column family with gc_seconds=10 (arbitrary small number)
>> 3. insert a couple rows with ttl=5 (again, just a small number)
>> 4. use nodetool to flush the column family
>> 5. sleep >10 seconds
>> 6. ensure the columns are removed with cassandra-cli list 
>> 7. use nodetool to compact the keyspace
>> Performing these steps results in the rows still being present using 
>> cassandra-cli list. What gets really odd is if I add these steps it works:
>> 8. sleep 5 seconds
>> 9. use cassandra-cli to del mycf[arow]
>> 10. use nodetool to flush the column family
>> 11. use nodetool to compact the keyspace
>> I don't understand why the first set of steps (1-7) don't work to remove the 
>> empty row, nor do I understand why the explicit row delete somehow makes 
>> this work. I have all this in a script that I could attach if that's 
>> appropriate. Is there something wrong with the steps that I have?
>> 
>> Thanks,
>> Stephen
> 
> 
