If disk space is a concern, use Leveled compaction! It will not only let you fill more of the disk, but will also shrink your data much faster when rows are updated. Size-tiered compaction can use 3x-4x more space than there is live data. Consider the following (simplified) scenario of ours:
1) The data is updated weekly.
2) Each week a large SSTable is written (say, 300GB) after full update processing.
3) After 3 weeks you have 3 large SSTables (about 900GB), and the fourth weekly write brings that to about 1.2TB.
4) Only after the 4th week are they all compacted into one 300GB SSTable.
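To make the arithmetic concrete, here is a minimal Python sketch of the scenario above. It is only a toy model, not how Cassandra actually schedules compactions: it assumes size-tiered compaction waits for 4 similar-sized SSTables before merging them (the MIN_THRESHOLD value is my assumption), and it ignores the temporary space needed while a compaction is running. All names are made up for illustration.

```python
# Toy model: every week a full update writes one 300 GB SSTable, and
# size-tiered compaction merges SSTables only once MIN_THRESHOLD similar-sized
# ones have accumulated. All names and values here are illustrative.
WEEKLY_SSTABLE_GB = 300   # size of each weekly SSTable, from the scenario
LIVE_DATA_GB = 300        # live data left after a compaction, from the scenario
MIN_THRESHOLD = 4         # assumed: SSTables needed to trigger a compaction


def disk_usage_by_week(weeks):
    """Return (week, sstable_count, gb_on_disk) measured before compaction runs."""
    usage = []
    sstables = 0
    for week in range(1, weeks + 1):
        sstables += 1  # this week's full update lands as a new SSTable
        usage.append((week, sstables, sstables * WEEKLY_SSTABLE_GB))
        if sstables >= MIN_THRESHOLD:
            # All similar-sized SSTables merge; only the live data survives.
            sstables = 1
    return usage


for week, count, gb in disk_usage_by_week(8):
    print(f"week {week}: {count} SSTables, {gb} GB on disk "
          f"({gb / LIVE_DATA_GB:.0f}x live data)")
```

In this model the disk holds about 1.2TB (4x the live data) just before each merge, which is where the 3x-4x figure comes from.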
Leveled compaction has tamed disk usage for us. Note that you should set sstable_size_in_mb to a reasonably high value (we use 512 with ~700GB per node) to avoid creating a lot of small files.

Best regards, Vitalii Tymchyshyn.

2012/9/20 Hiller, Dean <dean.hil...@nrel.gov>

> While diskspace is cheap, nodes are not that cheap, and usually systems
> have a 1T limit on each node which means we would love to really not add
> more nodes until we hit 70% disk space instead of the normal 50% that we
> have read about due to compaction.
>
> Is there any way to use less disk space during compactions?
> Is there any work being done so that compactions take less space in the
> future meaning we can buy less nodes?
>
> Thanks,
> Dean