Vassil Lunchev created CASSANDRA-11997:
------------------------------------------

             Summary: Add a STCS compaction subproperty for DESC order bucketing
                 Key: CASSANDRA-11997
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11997
             Project: Cassandra
          Issue Type: Improvement
          Components: Compaction
            Reporter: Vassil Lunchev


This is about getBuckets() in SizeTieredCompactionStrategy.java.

This method is the only one using 3 of the 10 subproperties of STCS 
(bucket_high, bucket_low and min_sstable_size). It buckets the files by 
sorting them by size in ASC order and then grouping them using bucket_high 
and min_sstable_size.

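In practice getBuckets() doesn't use bucket_low at all. As long as it is 
between 0 and 1, the result doesn't depend on bucket_low: with ASC ordering 
each file is at least as large as the average of every existing bucket, so the 
lower bound derived from bucket_low is always satisfied. For example: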

{code:java}
  public static void main(String[] args) {
    // (label, size in bytes) for each sstable
    List<Pair<String, Long>> files = new ArrayList<>();
    files.add(new Pair<>("10.1G", 10944793422L));
    files.add(new Pair<>("9.4G", 10056333820L));
    files.add(new Pair<>("8.7G", 9266612562L));
    files.add(new Pair<>("4.0G", 4254518390L));
    files.add(new Pair<>("3.5G", 3729627496L));
    files.add(new Pair<>("2.5G", 2587912419L));
    files.add(new Pair<>("2.2G", 2304124647L));
    files.add(new Pair<>("1.4G", 1485000127L));
    files.add(new Pair<>("1.3G", 1340382610L));
    files.add(new Pair<>("456M", 477906537L));
    files.add(new Pair<>("451M", 472012692L));
    files.add(new Pair<>("53M", 54968524L));
    files.add(new Pair<>("18M", 18447540L));
    // arguments: files, bucketHigh, bucketLow, minSSTableSize (50 MB)
    List<List<String>> buckets = getBuckets(files, 1.5, 0.5, 50L*1024*1024);
    System.out.println(buckets);
  }
{code}

The result is:
{code}
[[451M, 456M], [8.7G, 9.4G, 10.1G], [53M], [1.3G, 1.4G], [18M], [3.5G, 4.0G], 
[2.2G, 2.5G]]
{code}

You can test it with any value for bucketLow between 0 and 1; the result will 
be the same. And it contains no buckets that can be compacted (none has at 
least 4 files, which is the default min_threshold).
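
A minimal, self-contained sketch for checking this claim outside Cassandra. 
It is a simplified re-implementation of the bucketing loop working on raw 
sizes only, not the actual Cassandra source; the class name BucketLowCheck and 
the method getBucketsAsc are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BucketLowCheck {
    // File sizes in bytes, copied from the example above.
    static final List<Long> SIZES = Arrays.asList(
        10944793422L, 10056333820L, 9266612562L, 4254518390L, 3729627496L,
        2587912419L, 2304124647L, 1485000127L, 1340382610L, 477906537L,
        472012692L, 54968524L, 18447540L);

    // Simplified bucketing: sort ASC, then put each file into the first bucket
    // whose average size is "similar", or start a new bucket.
    static List<List<Long>> getBucketsAsc(List<Long> sizes, double bucketHigh,
                                          double bucketLow, long minSSTableSize) {
        List<Long> sorted = new ArrayList<>(sizes);
        Collections.sort(sorted); // smallest file first
        // key = current average size of the bucket, value = sizes in the bucket
        Map<Long, List<Long>> buckets = new LinkedHashMap<>();
        outer:
        for (long size : sorted) {
            for (Map.Entry<Long, List<Long>> entry : buckets.entrySet()) {
                long avg = entry.getKey();
                List<Long> bucket = entry.getValue();
                boolean similar = size > avg * bucketLow && size < avg * bucketHigh;
                boolean bothSmall = size < minSSTableSize && avg < minSSTableSize;
                if (similar || bothSmall) {
                    // re-key the bucket under its new average, then move on
                    buckets.remove(avg);
                    long newAvg = (avg * bucket.size() + size) / (bucket.size() + 1);
                    bucket.add(size);
                    buckets.put(newAvg, bucket);
                    continue outer;
                }
            }
            // no similar bucket found: start a new one
            List<Long> bucket = new ArrayList<>();
            bucket.add(size);
            buckets.put(size, bucket);
        }
        return new ArrayList<>(buckets.values());
    }

    public static void main(String[] args) {
        long minSSTableSize = 50L * 1024 * 1024;
        List<List<Long>> reference = getBucketsAsc(SIZES, 1.5, 0.5, minSSTableSize);
        System.out.println("buckets with bucket_low=0.5: " + reference);
        for (double low : new double[] {0.01, 0.25, 0.75, 0.99}) {
            boolean same = getBucketsAsc(SIZES, 1.5, low, minSSTableSize).equals(reference);
            System.out.println("bucket_low=" + low + " gives the same buckets: " + same);
        }
    }
}
```

The sketch keys buckets by their average size, so two buckets that happen to 
end up with the same average would collide; that does not occur with these 
sizes.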

However, if you reverse the initial sorting order to DESC (look at the files 
from largest to smallest) you get a completely different bucketing:

{code:java}
  return p2.right.compareTo(p1.right);
{code} 

{code:txt}
  [[456M, 451M], [4.0G, 3.5G, 2.5G, 2.2G], [10.1G, 9.4G, 8.7G], [53M], [1.4G, 
1.3G], [18M]]
{code}

Now there is a bucket that can be compacted: [4.0G, 3.5G, 2.5G, 2.2G].
After that compaction, there will be one more bucket that can be compacted: 
[10.1G, 9.4G, 8.7G, <new>GB], where <new> is the output of the first 
compaction.

The sizes given here are real values from a production Cassandra deployment. 
We would like to have an aggressive STCS compaction that compacts as soon as 
reasonably possible. (I know about LCS, let's not include it in this ticket.) 
However, since the ordering in getBuckets is ASC, we cannot do much with 
configuration parameters. Specifically, using min_threshold = 3 is not 
helping; it all boils down to the ordering.

bucket_high = 2 is probably an option, but then why does Cassandra offer a 
property that doesn't change anything? With a fixed ASC ordering, bucket_low 
has no effect at all.

I would like to have the ability to configure DESC ordering. My suggestion is 
to add a new compaction subproperty for STCS, for example named 
bucket_iteration_order, which has ASC by default for backward compatibility.
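
A rough sketch of how such a subproperty could select the sort order used 
before bucketing. Everything here is hypothetical: bucket_iteration_order is 
only the name proposed in this ticket, and the IterationOrder enum, 
fromOptions and sortForBucketing are illustrative names, not existing 
Cassandra code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class BucketIterationOrderSketch {
    // Hypothetical option key, as proposed in this ticket.
    static final String OPTION = "bucket_iteration_order";

    enum IterationOrder {
        ASC(Comparator.<Long>naturalOrder()),   // smallest file first (current behaviour)
        DESC(Comparator.<Long>reverseOrder());  // largest file first (proposed)

        final Comparator<Long> bySize;

        IterationOrder(Comparator<Long> bySize) {
            this.bySize = bySize;
        }

        // A missing option falls back to ASC for backward compatibility.
        static IterationOrder fromOptions(Map<String, String> options) {
            String raw = options.get(OPTION);
            return raw == null ? ASC : valueOf(raw.trim().toUpperCase(Locale.ROOT));
        }
    }

    // Returns a sorted copy of the sizes in the configured iteration order.
    static List<Long> sortForBucketing(List<Long> sizes, Map<String, String> options) {
        List<Long> sorted = new ArrayList<>(sizes);
        sorted.sort(IterationOrder.fromOptions(options).bySize);
        return sorted;
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(4254518390L, 10944793422L, 18447540L);
        Map<String, String> options = new HashMap<>();
        System.out.println("default: " + sortForBucketing(sizes, options));
        options.put(OPTION, "DESC");
        System.out.println("DESC:    " + sortForBucketing(sizes, options));
    }
}
```

An unknown value makes valueOf throw IllegalArgumentException; a real 
implementation would presumably reject it during option validation instead.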



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
