[ 
https://issues.apache.org/jira/browse/CASSANDRA-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109727#comment-13109727
 ] 

Sylvain Lebresne commented on CASSANDRA-3234:
---------------------------------------------

ABSC.addAll does a merge. So if you do cf1.addAll(cf2) and cf1 and cf2 are 
comparable in size, that's likely as good as it gets. However, if cf2 has far 
fewer columns than cf1, then TMBSC.addAll (which adds the columns of cf2 one 
by one) will likely be faster.
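
To make the trade-off concrete, here is a minimal sketch of the two strategies. 
It is not the Cassandra code: ABSC and TMBSC presumably refer to the array-backed 
and tree-map-backed sorted column containers, and the class and method names 
below (AddAllSketch, mergeSorted, addOneByOne) are hypothetical. Columns are 
reduced to integer names, and on a name collision the incoming column simply 
wins, which stands in for the real reconciliation logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative only: models columns as sorted Integer names.
final class AddAllSketch {

    // Array-backed style: a single linear merge of two sorted lists.
    // Cost is O(n + m) regardless of how small the incoming side is.
    static List<Integer> mergeSorted(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i).compareTo(b.get(j));
            if (cmp < 0) {
                out.add(a.get(i++));
            } else if (cmp > 0) {
                out.add(b.get(j++));
            } else {
                // Same name on both sides: assume the incoming column wins.
                out.add(b.get(j++));
                i++;
            }
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    // Tree-backed style: insert each incoming column individually.
    // Cost is O(m log n), which is cheaper when m is much smaller than n.
    static void addOneByOne(TreeMap<Integer, Integer> target, List<Integer> incoming) {
        for (Integer name : incoming) {
            target.put(name, name);
        }
    }
}
```

With n columns in cf1 and m in cf2, the merge touches every column of both 
sides, while the one-by-one path pays roughly log n per incoming column, so it 
only wins when m is small relative to n.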

So I guess it all depends on how many sstables we are merging. If we have a lot 
of them, then adding to TMBSC will be a win. (My claim about LeveledCompaction 
was misdirected: I was thinking of it as having lots of sstables, but that 
doesn't mean we'll merge lots of sstables at once, unless L0 is behind.) 
Anyway, it may very well be that ABSC is better in the vast majority of cases, 
so I'm fine with using just that. 

> LeveledCompaction has several performance problems
> --------------------------------------------------
>
>                 Key: CASSANDRA-3234
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3234
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 1.0.0
>
>         Attachments: 0001-optimize-single-source-case-for-MergeIterator.txt, 
> 0002-add-TrivialOneToOne-optimization.txt, 
> 0003-fix-leveled-BF-size-calculation.txt, 
> 0004-avoid-calling-shouldPurge-unless-necessary.txt, 
> 0005-use-Array-and-Tree-backed-columns-in-compaction-v2.patch, 
> 0005-use-Array-and-Tree-backed-columns-in-compaction.txt
>
>
> Two main problems:
> - BF size calculation doesn't take into account LCS breaking the output apart 
> into "bite sized" sstables, so memory use is much higher than predicted
> - ManyToMany merging is slow.  At least part of this is from running the full 
> reducer machinery against single input sources, which can be optimized away.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
