[ 
https://issues.apache.org/jira/browse/CASSANDRA-16769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Carey updated CASSANDRA-16769:
------------------------------------
    Description: 
nodetool garbagecollect can currently only run across an entire table.

For a very large table, in many use cases the SSTables most likely to be full 
of 'garbage' are the oldest ones. With both LCS and STCS, the SSTables with the 
lowest generation numbers will, under normal operation, hold the majority of 
the data that is masked by a tombstone or has been overwritten.

In order to make 'nodetool garbagecollect' more useful for such large tables, I 
propose that we add an option `--oldest-fraction` that takes a floating-point 
value between 0.00 and 1.00 and runs 'garbagecollect' only over the oldest 
SSTables that together cover at least that fraction of the data.

This would mean, for instance, that if you ran this with `--oldest-fraction 
0.1` every week, no SSTable would be older than 10 weeks, and there would exist 
no overwritten, TTL'd, or deleted data that was originally written more than 10 
weeks ago.

In my use case, the oldest LCS SSTable is about 20 months old when the table 
operates at steady state on Cassandra 3.11.x, but only 5% of the data in 
SSTables of that age has not been overwritten. This breaks some of the 
performance promise of LCS – if your last level is 50% filled with overwritten 
data, then your chance of finding data only in that level is significantly 
lower than advertised.

'nodetool compact' is extremely expensive and currently not conducive to any 
sort of incremental operation. But 'nodetool garbagecollect' run over a 
fraction of the oldest data would be.
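The selection described above could be sketched roughly as follows. This is an 
illustrative sketch only, not the proposed patch: the field names 
`min_timestamp` and `size_bytes` are hypothetical stand-ins for per-SSTable 
metadata (oldest write timestamp and on-disk size), and the real implementation 
would live inside Cassandra's compaction code, not in Python.

```python
# Hypothetical per-SSTable metadata: oldest write timestamp and on-disk size.
sstables = [
    {"min_timestamp": 100, "size_bytes": 400},
    {"min_timestamp": 300, "size_bytes": 300},
    {"min_timestamp": 200, "size_bytes": 300},
]

def select_oldest_sstables(sstables, fraction):
    """Return the oldest SSTables that together cover at least `fraction`
    of the table's total on-disk size."""
    total = sum(s["size_bytes"] for s in sstables)
    target = total * fraction
    selected, covered = [], 0
    # Oldest first: order candidates by their minimum write timestamp.
    for s in sorted(sstables, key=lambda s: s["min_timestamp"]):
        if covered >= target:
            break
        selected.append(s)
        covered += s["size_bytes"]
    return selected

# With --oldest-fraction 0.5 the two oldest SSTables (covering 700 of
# 1000 bytes) would be garbage-collected; the newest is left alone.
chosen = select_oldest_sstables(sstables, 0.5)
```

Run weekly, the set of "oldest" SSTables rotates as garbagecollect rewrites 
them with fresh generation numbers, which is what bounds the age of surviving 
garbage.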



> Add an option to nodetool garbagecollect that collects only a fraction of the 
> data
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16769
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16769
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Scott Carey
>            Assignee: Scott Carey
>            Priority: Normal
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
