If disk space is your concern, use Leveled compaction! It not only lets you
fill more of the disk, but also shrinks your data much faster when rows are
updated. Size-tiered compaction can use 3x-4x more space than there is
live data. Consider the following scenario (a simplified version of ours):
1) The data is updated weekly
2) Each week a large SSTable is written (say, 300GB) after full update
processing.
3) After 3 weeks you have 900GB of data in 3 large SSTables.
4) Only after the 4th week, with 1.2TB on disk, are they all compacted into
one 300GB SSTable.
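
The scenario above can be sketched as a small simulation. This is only a rough
model, not how Cassandra itself does the accounting: it assumes every weekly
update overwrites all rows (so the merged result equals the live data), that
compaction starts once 4 similar-sized SSTables exist, and it ignores the
temporary space the compaction itself needs. The function name and its
parameters are made up for illustration.

```python
def size_tiered_disk_usage(weeks, sstable_gb=300, min_threshold=4):
    """Return (week, GB before compaction, GB after compaction) per week.

    Assumption: each weekly SSTable fully overwrites the previous data,
    so merging any number of them yields one SSTable of sstable_gb.
    """
    sstables = []
    history = []
    for week in range(1, weeks + 1):
        # The weekly full update writes one more large SSTable.
        sstables.append(sstable_gb)
        before = sum(sstables)
        # Assumed trigger: enough similar-sized SSTables have piled up.
        if len(sstables) >= min_threshold:
            sstables = [sstable_gb]
        history.append((week, before, sum(sstables)))
    return history


for week, before, after in size_tiered_disk_usage(4):
    print(f"week {week}: {before} GB on disk, {after} GB after compaction")
```

Under these assumptions the disk holds 1200 GB just before the week-4
compaction, which is 4x the 300 GB of live data.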

Leveled compaction has tamed disk usage for us. Note that you should set
sstable_size_in_mb to a reasonably high value (it is 512 for us, with ~700GB
per node) to avoid creating a lot of small files.
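
To see why the SSTable size matters, here is a back-of-the-envelope estimate
of how many data files a node ends up with. It is a sketch only: it assumes
every SSTable is filled to exactly sstable_size_in_mb and counts one file per
SSTable, while the 5 and 64 values are just illustrative alternatives to our
512, and the helper name is hypothetical.

```python
import math


def sstable_count(data_gb, sstable_size_mb):
    """Approximate number of SSTables needed to hold data_gb of data."""
    return math.ceil(data_gb * 1024 / sstable_size_mb)


# ~700GB per node, as in our setup; 5 and 64 are illustrative sizes only.
for size_mb in (5, 64, 512):
    print(f"sstable_size_in_mb={size_mb}: ~{sstable_count(700, size_mb)} SSTables")
```

With 512 this comes to about 1400 SSTables per node for 700GB.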

Best regards, Vitalii Tymchyshyn.

2012/9/20 Hiller, Dean <dean.hil...@nrel.gov>

> While diskspace is cheap, nodes are not that cheap, and usually systems
> have a 1T limit on each node which means we would love to really not add
> more nodes until we hit 70% disk space instead of the normal 50% that we
> have read about due to compaction.
>
> Is there any way to use less disk space during compactions?
> Is there any work being done so that compactions take less space in the
> future meaning we can buy less nodes?
>
> Thanks,
> Dean
>



