Re: Size Tiered -> Leveled Compaction

Mike Sat, 16 Feb 2013 09:26:54 -0800

Another piece of information that would be useful is advice on how to properly set the SSTable size for your use case. I understand the default is 5MB, a lot of examples show the use of 10MB, and I've seen cases where people have set it as high as 200MB.


Any information is appreciated,
-Mike


On 2/14/2013 4:10 PM, Michael Theroux wrote:

BTW, when I say "major compaction", I mean running the "nodetool compact" command (which does a major compaction for Size Tiered Compaction). I didn't see the distribution of SSTables I expected until I ran that command, in the steps I described below.
-Mike

On Feb 14, 2013, at 3:51 PM, Wei Zhu wrote:
I haven't tried to switch compaction strategy. We started with LCS.
For us, after massive data imports (5000 w/seconds for 6 days), the first repair is painful since there is quite some data inconsistency. For 150G nodes, repair brought in about 30G and created thousands of pending compactions. It took almost a day to clear those. Just be prepared: LCS is really slow in 1.1.X. System performance degrades during that time since reads could go to more SSTables; we see 20 SSTable lookups for one read. (We tried everything we could and couldn't speed it up. I think it's single threaded, and it's not recommended to turn on multithreaded compaction. We even tried that; it didn't help.) There is parallel LCS in 1.2 which is supposed to alleviate the pain. Haven't upgraded yet, hope it works :)
http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
Since our cluster is not write intensive, only 100 w/seconds, I don't see any pending compactions during regular operation.
One thing worth mentioning is the size of the SSTable. The default is 5M, which is kind of small for a 200G (all in one CF) data set, and we are on SSD. It's more than 150K files in one directory. (200G/5M = 40K SSTables, and each SSTable creates 4 files on disk.) You might want to watch that and decide the SSTable size.
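
The file-count estimate above can be reproduced with a short calculation. This is a minimal sketch, not a Cassandra tool: the function name is hypothetical, the figure of 4 files per SSTable is taken from the message, and decimal units (1G = 1000M) are assumed so that the result matches the 200G/5M = 40K estimate.

```python
import math

# Component files per SSTable, as reported in the message above (not a universal constant).
FILES_PER_SSTABLE = 4


def estimate_sstables(data_gb, sstable_mb, files_per_sstable=FILES_PER_SSTABLE):
    """Return (sstable_count, file_count) for a data set of data_gb gigabytes."""
    data_mb = data_gb * 1000  # assumption: decimal units, matching 200G/5M = 40K
    sstables = math.ceil(data_mb / sstable_mb)
    return sstables, sstables * files_per_sstable


# Compare the SSTable sizes mentioned in this thread for a 200G column family.
for size_mb in (5, 10, 200):
    sstables, files = estimate_sstables(200, size_mb)
    print(f"{size_mb:>3} MB SSTables: ~{sstables} SSTables, ~{files} files")
```

With the 5M default this gives about 40,000 SSTables and 160,000 files, consistent with the "more than 150K files" observed; at 10M the counts halve.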
By the way, there is no concept of major compaction for LCS. Just for fun, you can look at a file called $CFName.json in your data directory; it tells you the SSTable distribution among the different levels.
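
A rough way to summarize that per-level distribution is sketched below. The thread does not show the contents of $CFName.json, so the layout used here (a "generations" list whose entries carry a "generation" number and a "members" list of SSTable ids) is an assumption, and the sample data and function name are hypothetical. Check the keys against an actual manifest before relying on this.

```python
import json

# Hypothetical sample; the real manifest layout is an assumption, not taken from the thread.
SAMPLE_MANIFEST = """
{
  "generations": [
    {"generation": 0, "members": [101, 102]},
    {"generation": 1, "members": [90, 91, 92]},
    {"generation": 2, "members": []}
  ]
}
"""


def sstables_per_level(manifest_text):
    """Map each level number to the count of SSTables listed for it."""
    manifest = json.loads(manifest_text)
    return {
        entry["generation"]: len(entry["members"])
        for entry in manifest["generations"]
    }


for level, count in sorted(sstables_per_level(SAMPLE_MANIFEST).items()):
    print(f"L{level}: {count} SSTables")
```

To use it on a real node, read the manifest file's text and pass it to sstables_per_level in place of the sample.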
-Wei

------------------------------------------------------------------------
*From:* Charles Brophy <cbro...@zulily.com <mailto:cbro...@zulily.com>>
*To:* user@cassandra.apache.org <mailto:user@cassandra.apache.org>
*Sent:* Thursday, February 14, 2013 8:29 AM
*Subject:* Re: Size Tiered -> Leveled Compaction
I second these questions: we've been looking into changing some of our CFs to use leveled compaction as well. If anybody here has the wisdom to answer them it would be of wonderful help.
Thanks
Charles
On Wed, Feb 13, 2013 at 7:50 AM, Mike <mthero...@yahoo.com <mailto:mthero...@yahoo.com>> wrote:
    Hello,

    I'm investigating the transition of some of our column families
    from Size Tiered -> Leveled Compaction.  I believe we have some
    high-read-load column families that would benefit tremendously.

    I've stood up a test DB node to investigate the transition.  I
    successfully altered the column family, and I immediately noticed a
    large number (1000+) of pending compaction tasks become available,
    but no compactions got executed.

    I tried running "nodetool upgradesstables" on the column family,
    and the compaction tasks didn't move.

    I also noticed no changes to the size and distribution of the
    existing SSTables.

    I then ran a major compaction on the column family.  All pending
    compaction tasks got run, and the SSTables had the distribution
    that I would expect from LeveledCompaction (lots and lots of 10MB
    files).

    Couple of questions:

    1) Is a major compaction required to transition from size-tiered
    to leveled compaction?
    2) Are major compactions as much of a concern for
    LeveledCompaction as they are for Size Tiered?

    All the documentation I found concerning transitioning from Size
    Tiered to Leveled Compaction discusses the alter table cql command,
    but I haven't found too much on what else needs to be done after
    the schema change.

    I did these tests with Cassandra 1.1.9.

    Thanks,
    -Mike
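
For reference, the schema change discussed in the original message can be sketched as a small helper that builds the ALTER TABLE statement. This is an illustration under assumptions: it uses the map-style `compaction = {...}` syntax of CQL 3 in Cassandra 1.2 and later, which differs from the syntax available in the 1.1.9 release tested above, and the helper and table names are hypothetical.

```python
def leveled_compaction_cql(table, sstable_size_mb=10):
    """Build an ALTER TABLE statement that switches a table to leveled compaction.

    Assumes the map-style compaction syntax of CQL 3 (Cassandra 1.2+).
    """
    # Accept only simple identifiers (optionally keyspace-qualified) to avoid
    # building a malformed statement.
    if not table.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        f"ALTER TABLE {table} WITH compaction = "
        f"{{'class': 'LeveledCompactionStrategy', "
        f"'sstable_size_in_mb': '{int(sstable_size_mb)}'}};"
    )


# Hypothetical keyspace and column family names.
print(leveled_compaction_cql("my_keyspace.my_cf", sstable_size_mb=10))
```

The statement only changes the schema. As the thread notes, existing SSTables on 1.1.x were not reorganized into levels until compaction actually ran.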
