Why so? What are the pluses and minuses? For my part, I am looking at the number of files in the directory. 700GB / 512MB * 5 (files per SSTable) = 7000 files, which is fine in my view. 700GB / 5MB * 5 = about 700000 files, which is too many for a single directory, uses too much memory for SSTable data, and makes the compaction queue too large (which leads to strange pauses, I suppose because the compactor is working out what to compact next),...
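
A rough sketch of that arithmetic in Python, in case anyone wants to plug in their own numbers. The function name is made up for illustration, 1GB = 1024MB is an assumption, and 5 files per SSTable is just the figure used above, not something read from Cassandra:

```python
# Back-of-the-envelope estimate of on-disk files per node.
# Assumptions: 1 GB = 1024 MB, and every SSTable is written as
# `files_per_sstable` component files (5 is the figure from this thread).

def estimated_files(data_gb, sstable_size_mb, files_per_sstable=5):
    """Return the approximate number of files for `data_gb` of data."""
    data_mb = data_gb * 1024
    # Ceiling division: a partially filled SSTable still costs a full set of files.
    sstables = -(-data_mb // sstable_size_mb)
    return sstables * files_per_sstable

if __name__ == "__main__":
    for size_mb in (512, 10, 5):
        print(size_mb, "MB per SSTable ->", estimated_files(700, size_mb), "files")
```

This only counts files; it says nothing about the memory or compaction-queue cost, which depend on the Cassandra version and workload.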

2012/9/23 Aaron Turner <synfina...@gmail.com>

> On Sun, Sep 23, 2012 at 8:18 PM, Віталій Тимчишин <tiv...@gmail.com> wrote:
> > If you think about space, use Leveled compaction! This won't only allow you
> > to fill more space, but also will shrink your data much faster in case of
> > updates. Size compaction can give you 3x-4x more space used than there is
> > live data. Consider the following (our simplified) scenario:
> > 1) The data is updated weekly
> > 2) Each week a large SSTable is written (say, 300GB) after full update
> > processing.
> > 3) In 3 weeks you will have 1.2TB of data in 3 large SSTables.
> > 4) Only after 4th week they all will be compacted into one 300GB SSTable.
> >
> > Leveled compaction has tamed space for us. Note that you should set
> > sstable_size_in_mb to a reasonably high value (it is 512 for us with ~700GB
> > per node) to prevent creating a lot of small files.
>
> 512MB per sstable? Wow, that's freaking huge. From my conversations
> with various developers 5-10MB seems far more reasonable. I guess it
> really depends on your usage patterns, but that seems excessive to me-
> especially as sstables are promoted.
>

--
Best regards,
Vitalii Tymchyshyn