[ https://issues.apache.org/jira/browse/CASSANDRA-16769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17370368#comment-17370368 ]

Scott Carey commented on CASSANDRA-16769:
-----------------------------------------

Pull request: [https://github.com/apache/cassandra/pull/1087]

 

The PR above is based on the changes for
https://issues.apache.org/jira/browse/CASSANDRA-16767 and
https://issues.apache.org/jira/browse/CASSANDRA-16768 to avoid merge conflicts.
I can change that if need be.

 

Like the other two, I would have preferred to avoid locking all of the
SSTables, but that would be a much larger undertaking, so this behaves the
same as a normal garbagecollect in that regard: even if only half of the
SSTables are garbage-collected, all of them are locked for the duration.

I tested with the included unit test, and by manually running nodetool
garbagecollect on a local Cassandra 3.11.9 instance where I replaced the
Cassandra jar with one built from these changes.

> Add an option to nodetool garbagecollect that collects only a fraction of the 
> data
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16769
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16769
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Scott Carey
>            Assignee: Scott Carey
>            Priority: Normal
>
> nodetool garbagecollect can currently only run across an entire table.
> For a very large table, in many use cases, the SSTables most likely to be
> full of 'garbage' are the oldest ones. With both LCS and STCS, the SSTables
> with the lowest generation numbers are, under normal operation, going to hold
> the majority of the data that is masked by a tombstone or overwritten.
> In order to make 'nodetool garbagecollect' more useful for such large tables, 
> I propose that we add an option `--oldest-fraction` that takes a floating 
> point value between 0.00 and 1.00, and only runs 'garbagecollect' over the 
> oldest SSTables that cover at least that fraction of data.
> This would mean, for instance, that if you ran this with `--oldest-fraction
> 0.1` every week, no SSTable would be more than 10 weeks old, and no
> overwritten, TTL'd, or deleted data originally written more than 10 weeks ago
> would remain.
> In my use case, the oldest LCS SSTable is about 20 months old when the table
> operates in steady state on Cassandra 3.11.x, but only 5% of the data in
> SSTables that old has not been overwritten. This breaks some of the
> performance promise of LCS: if your last level is 50% filled with overwritten
> data, then your chance of finding data only in that level is significantly
> lower than advertised.
> 'nodetool compact' is extremely expensive, and not conducive to any sort of 
> incremental operation currently. But nodetool garbagecollect run on a 
> fraction of the oldest data would be.
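
For illustration, here is a rough, self-contained sketch (not code from the
PR) of the selection that the proposed `--oldest-fraction` option describes:
sort the SSTables from oldest to newest and take them until their combined
on-disk size covers at least the requested fraction of the table's total
on-disk size. The SSTableInfo class and its fields below are hypothetical
stand-ins for illustration only, not the real Cassandra SSTable API.

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified stand-in for an SSTable: only the fields the
// selection needs. This is not the real Cassandra SSTableReader API.
final class SSTableInfo
{
    final String name;
    final long minTimestamp; // oldest write in this sstable (generation would also work)
    final long onDiskBytes;

    SSTableInfo(String name, long minTimestamp, long onDiskBytes)
    {
        this.name = name;
        this.minTimestamp = minTimestamp;
        this.onDiskBytes = onDiskBytes;
    }
}

public class OldestFractionSelection
{
    // Pick the oldest sstables whose combined size covers at least `fraction`
    // of the table's total on-disk size. fraction must be in (0.0, 1.0].
    static List<SSTableInfo> selectOldestFraction(List<SSTableInfo> sstables, double fraction)
    {
        if (fraction <= 0.0 || fraction > 1.0)
            throw new IllegalArgumentException("fraction must be in (0.0, 1.0]: " + fraction);

        long totalBytes = sstables.stream().mapToLong(s -> s.onDiskBytes).sum();
        long targetBytes = (long) Math.ceil(totalBytes * fraction);

        List<SSTableInfo> byAge = new ArrayList<>(sstables);
        byAge.sort(Comparator.comparingLong((SSTableInfo s) -> s.minTimestamp));

        List<SSTableInfo> selected = new ArrayList<>();
        long selectedBytes = 0;
        for (SSTableInfo s : byAge)
        {
            if (selectedBytes >= targetBytes)
                break;
            selected.add(s);
            selectedBytes += s.onDiskBytes;
        }
        return selected;
    }

    public static void main(String[] args)
    {
        List<SSTableInfo> sstables = List.of(new SSTableInfo("sstable-1", 1_000L, 800),
                                             new SSTableInfo("sstable-2", 2_000L, 600),
                                             new SSTableInfo("sstable-3", 3_000L, 400),
                                             new SSTableInfo("sstable-4", 4_000L, 200));
        // Total size is 2000 bytes, so a fraction of 0.5 needs at least 1000
        // bytes: the two oldest sstables (800 + 600 bytes) are selected.
        for (SSTableInfo s : selectOldestFraction(sstables, 0.5))
            System.out.println(s.name);
    }
}
{code}

With the patch applied, the invocation would presumably look something like
`nodetool garbagecollect --oldest-fraction 0.1 <keyspace> <table>`.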


