I think it's important to highlight a few practical elements of this work 
that may not be completely clear.

Sharding ranges allows compactions to run in parallel, taking advantage of 
additional CPUs on the server.  This makes it much easier for compaction to 
keep up.
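
To illustrate why sharding makes parallel compaction possible, here is a 
minimal sketch.  It is not the UCS code: the function names, the equal-width 
split of the token range, and the use of plain token lists in place of 
sstables are all assumptions for illustration.  A Python thread pool stands in 
for compaction threads and will not show a real CPU speed-up.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Assumed token range (signed 64-bit), used only for this illustration.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def shard_of(token, num_shards):
    """Map a token to one of num_shards equal-width slices of the range."""
    span = MAX_TOKEN - MIN_TOKEN + 1
    return min((token - MIN_TOKEN) * num_shards // span, num_shards - 1)


def merge_runs(runs):
    """Merge sorted runs of tokens, dropping duplicates (a toy 'compaction')."""
    merged = []
    for token in heapq.merge(*runs):
        if not merged or merged[-1] != token:
            merged.append(token)
    return merged


def compact_sharded(runs, num_shards, workers=4):
    """Split every run by shard, then merge each shard independently."""
    per_shard = [[] for _ in range(num_shards)]
    for run in runs:
        parts = [[] for _ in range(num_shards)]
        for token in run:  # run is sorted, so each part stays sorted
            parts[shard_of(token, num_shards)].append(token)
        for i, part in enumerate(parts):
            per_shard[i].append(part)
    # Shards cover disjoint token ranges, so the merges do not interact.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(merge_runs, per_shard))


if __name__ == "__main__":
    runs = [[-2**62, 5, 2**62], [5, 10]]
    for i, shard in enumerate(compact_sharded(runs, num_shards=2)):
        print(i, shard)
```

Because the shards cover disjoint token ranges, each merge reads and writes 
only its own slice of the data, which is what lets them run side by side.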

UCS also handles low disk space much more gracefully.  With the previous 
compaction strategies, the server just shuts down when it runs out of space, 
which forces system administrators to be paranoid about headroom.  UCS 
instead has a target overhead (default 20%).  First, because the ranges are 
sharded, it is less likely that there will be large sstables that need that 
much headroom to compact.  Second, if it detects that a compaction would 
violate the target overhead, it logs that and skips the compaction, which is 
a much more graceful way of handling it.
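
A rough sketch of that skip-instead-of-fail behaviour follows.  The names 
(Candidate, select_within_overhead) and the exact budget rule, where the 
inputs of running plus newly accepted compactions must fit within 
target_overhead times the data size, are assumptions for illustration and not 
the actual UCS implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    input_bytes: int  # total size of the sstables this compaction would rewrite


def select_within_overhead(candidates, data_bytes, running_bytes=0,
                           target_overhead=0.20):
    """Accept compactions while they fit the space budget; skip the rest.

    Assumed rule: a compaction temporarily needs about as much extra space
    as its inputs, and all in-flight compactions together must stay within
    target_overhead * data_bytes.
    """
    budget = data_bytes * target_overhead
    used = running_bytes
    accepted, skipped = [], []
    for c in candidates:
        if used + c.input_bytes <= budget:
            accepted.append(c)
            used += c.input_bytes
        else:
            # Log and move on instead of failing for lack of disk space.
            print(f"skipping {c.name}: would exceed target overhead")
            skipped.append(c)
    return accepted, skipped


if __name__ == "__main__":
    todo = [Candidate("A", 100), Candidate("B", 150), Candidate("C", 50)]
    ok, deferred = select_within_overhead(todo, data_bytes=1000)
    print([c.name for c in ok], [c.name for c in deferred])
```

A skipped compaction is not lost.  It can be picked up again on a later pass 
once running compactions finish and release their space.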

One other useful practical element is the leveling that Branimir mentions.  The 
scaling_parameters (previously static_scaling_factors) specify what to do at 
each level.  They control how and when to compact, and range from one extreme 
of compacting aggressively to reduce read amplification to the other extreme 
of leaving sstables alone to reduce write amplification.  If you set a single 
value, it applies to all levels.  However, making that value different for 
each level has some benefits.  For example, in lower levels you can have it 
compact more aggressively, and then as data levels up, you could have it 
gradually reduce compactions.
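
To make the per-level idea concrete, here is a small sketch of how a list of 
scaling parameters could be interpreted.  It follows my reading of the UCS 
markdown documentation linked in this thread: each entry is an integer W or 
the shorthand L<f>, T<f> or N, and the last entry applies to all higher 
levels.  Treat the mapping and the helper names as assumptions to check 
against that documentation, not as the strategy's actual code.

```python
def parse_scaling_parameter(text):
    """Turn 'L4', 'T8', 'N' or a plain integer into the integer W."""
    text = text.strip().upper()
    if text == "N":
        return 0
    if text[0] == "L":  # leveled with fan-out f: W = 2 - f
        return 2 - int(text[1:])
    if text[0] == "T":  # tiered with fan-out f: W = f - 2
        return int(text[1:]) - 2
    return int(text)


def fanout_and_threshold(w):
    """Return (fan-out, sstables needed to trigger a compaction) for W."""
    if w < 0:
        return 2 - w, 2  # leveled: compact eagerly, lower read amplification
    return 2 + w, 2 + w  # tiered (or W = 0): compact lazily, fewer rewrites


def parameter_for_level(params, level):
    """Pick the entry for a level; the last entry covers all higher levels."""
    return params[min(level, len(params) - 1)]


if __name__ == "__main__":
    # Aggressive at the low levels, progressively lazier higher up.
    params = [parse_scaling_parameter(p) for p in "L4, N, T4".split(",")]
    for level in range(5):
        w = parameter_for_level(params, level)
        print(level, w, fanout_and_threshold(w))
```

A single value such as "T4" is simply the one-entry case of the same list, 
which is why it ends up applying to every level.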

Another practical element is that if you change settings, or if the strategy 
automatically changes settings based on load (I believe that's where Branimir 
is heading with this), the strategy will start using the new settings without 
having to recompact existing data.

Finally, to Benedict's point about density: unified compaction, even with the 
default parameters, has been shown to scale to many TB of data per node while 
keeping latencies low.

I'm excited for this to get incorporated into the project and improved.

> On Mar 17, 2023, at 11:45 AM, Josh McKenzie <jmcken...@apache.org> wrote:
> 
> Could we get a JIRA for this too so we can get some reviewers collaborating 
> on this? Only see Lorina's ticket for documenting it in JIRA atm.
> 
> On Fri, Mar 17, 2023, at 9:53 AM, Branimir Lambov wrote:
>> The prototype of UCS can now be found in this pull request: 
>> https://github.com/apache/cassandra/pull/2228
>> 
>> Its description is given in the included markdown documentation: 
>> https://github.com/blambov/cassandra/blob/UCS-density/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
>> 
>> The latest code includes some new elements compared to the link Henrik 
>> posted, including density levelling, bucketing based solely on overlap, and 
>> output splitting by expected density. It goes a little further than what is 
>> described in the CEP-26 proposal as prototyping showed that we can make the 
>> selection of sstables to compact and the sharding decisions independent of 
>> each other. This makes the strategy more stable and better able to react to 
>> changes in configuration and environment.
>> 
>> Regards,
>> Branimir
>> 
>> On Wed, Dec 21, 2022 at 10:01 AM Benedict <bened...@apache.org> wrote:
>> 
>> I’m personally very excited by this work. Compaction could do with a spring 
>> clean and this feels to formalise things much more cleanly, but density 
>> tiering in particular is something I’ve wanted to incorporate for years now, 
>> as it should significantly improve STCS behaviour (most importantly reducing 
>> read amplification and the amount of disk space required, narrowing the 
>> performance delta to LCS in these important dimensions), and simplifies 
>> re-levelling of LCS, making large streams much less painful.
>> 
>> 
>>> On 21 Dec 2022, at 07:19, Henrik Ingo <henrik.i...@datastax.com> wrote:
>>> 
>>> I noticed the CEP doesn't link to this, so it should be worth mentioning 
>>> that the UCS documentation is available here: 
>>> https://github.com/datastax/cassandra/blob/ds-trunk/doc/unified_compaction.md
>>> 
>>> Both of the above seem to do a poor job referencing the literature we've 
>>> been inspired by. I will link to Mark Callaghan's blog on the subject:
>>> 
>>> http://smalldatum.blogspot.com/2018/07/tiered-or-leveled-compaction-why-not.html?m=1
>>> 
>>> ...and lazily will also borrow from Mark a post that references a bunch of 
>>> LSM (not just UCS related) academic papers: 
>>> http://smalldatum.blogspot.com/2018/08/name-that-compaction-algorithm.html?m=1
>>> 
>>> Finally, it's perhaps worth mentioning that UCS has been in production in 
>>> our Astra Serverless cloud service since it was launched in March 2021. The 
>>> version described by the CEP therefore already incorporates some 
>>> improvements based on observed production behaviour.
>>> 
>>> Henrik 
>>> 
>>> On Mon, 19 Dec 2022, 15:41 Branimir Lambov, <blam...@apache.org> wrote:
>>> Hello everyone,
>>> 
>>> I would like to open the discussion on our proposal for a unified 
>>> compaction strategy that aims to solve well-known problems with compaction 
>>> and improve parallelism to permit higher levels of sustained write 
>>> throughput.
>>> 
>>> The proposal is here: 
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-26%3A+Unified+Compaction+Strategy
>>> 
>>> The strategy is based on two main observations:
>>> - that tiered and levelled compaction can be generalized as the same thing 
>>> if one observes that both form exponentially-growing levels based on the 
>>> size of sstables (or non-overlapping sstable runs) and trigger a compaction 
>>> when more than a given number of sstables are present on one level;
>>> - that instead of "size" in the description above we can use "density", 
>>> i.e. the size of an sstable divided by the width of the token range it 
>>> covers, which permits sstables to be split at arbitrary points when the 
>>> output of a compaction is written and still produce a levelled hierarchy.
>>> 
>>> The latter allows us to shard the compaction space into progressively 
>>> higher numbers of shards as data moves to the higher levels of the 
>>> hierarchy, improving parallelism, space requirements and the duration of 
>>> compactions, and the former allows us to cover the existing strategies, as 
>>> well as hybrid mixtures that can prove more efficient for some workloads.
>>> 
>>> Thank you,
>>> Branimir