Hi Jeff,
Thanks a lot for all these details; they are really helpful. My
understanding is that the number of windows is a tradeoff between the
amount of data waiting for expiration and the number of sstables required
to satisfy a read request.

In my case the data model does have a timestamp component. What is your
recommendation for these cases?
* TTL = 21 days, typical read span <= 2 days
* TTL = 1300 days, typical read span 30 to 60 days
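
To make the tradeoff concrete, here is a rough sketch of the arithmetic in Python. It assumes a uniform write rate and that a window's sstable is dropped only once everything in it has expired. The function name and the candidate window sizes are hypothetical, chosen for illustration only, not as recommendations.

```python
import math

def twcs_estimate(ttl_days, window_days, read_span_days):
    """Back-of-the-envelope TWCS numbers for one table (uniform writes assumed)."""
    # Number of time windows holding unexpired data.
    windows = math.ceil(ttl_days / window_days)
    # Up to one full window of expired data can sit on disk waiting to be
    # dropped, e.g. 1 day / 30 days = about 3%.
    expired_fraction = window_days / ttl_days
    # Worst case: a read that is not aligned to window boundaries
    # straddles one extra window.
    windows_per_read = math.ceil(read_span_days / window_days) + 1
    return windows, expired_fraction, windows_per_read

# Hypothetical candidates: (ttl_days, window_days, read_span_days)
candidates = [(21, 1, 2), (1300, 30, 60), (1300, 60, 60)]
for ttl, window, span in candidates:
    w, frac, per_read = twcs_estimate(ttl, window, span)
    print(f"TTL={ttl}d window={window}d: {w} windows, "
          f"{frac:.1%} expired data waiting, up to {per_read} windows per read")
```

This counts windows rather than sstables, so it ignores the window that is still being compacted and any multiplication by data directories.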



On Wed, 28 Sep 2022 at 16:22, Jeff Jirsa <jji...@gmail.com> wrote:

> So when I wrote TWCS, I wrote it for a use case that had 24h TTLs and 30
> days of retention. In that application, we had tested 12h windows, 24h
> windows, and 7 day windows, and eventually settled on 24h windows because
> that balanced factors like sstable size, sstables-per-read, and expired
> data waiting to be dropped (about 3%, 1/30th, on any given day). That's
> where that recommendation came from - it was mostly around how much expired
> data will sit around waiting to be dropped. That doesn't change with
> multiple data directories.
>
> If you go with fewer windows, you'll expire larger chunks at a time, which
> means you'll retain larger chunks waiting on expiration.
> If you go with more windows, you'll potentially touch more sstables on
> read.
>
> Realistically, if you can model your data to align with chunks (so each
> read only touches one window), the actual number of sstables shouldn't
> really matter much - the timestamps and bloom filter will avoid touching
> most of them on the read path anyway. If your data model doesn't have a
> timestamp component to it and you're touching lots of sstables on read,
> even 30 sstables is probably going to hurt you, and 210 would be really,
> really bad.
>
>
>
>
>
> On Wed, Sep 28, 2022 at 7:00 AM Grzegorz Pietrusza <gpietru...@gmail.com>
> wrote:
>
>> Hi All!
>>
>> According to TWCS documentation (
>> https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/twcs.html)
>> the operator should select a compaction_window_unit and
>> compaction_window_size pair that produces approximately 20-30 windows.
>>
>> Where does this recommendation come from? Also, should the number
>> of windows be changed when more than one data directory is used? In my
>> example there are 7 data directories (partitions) and it seems that all of
>> them store 20-30 windows. Effectively this gives 140-210 sstables in total.
>> Is that an optimal configuration?
>>
>> Running on Cassandra 3.11
>>
>> Regards
>> Grzegorz
>>
>
