So, in our experience, the storage overhead is much higher. If you plan
on storing 120TB of data, you should expect it to take roughly 250TB on
disk after storage overhead. And since you have to leave 50% of your
storage space free for compaction, you're looking at needing about 500TB
of total storage space.
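
For anyone following along, here's the back-of-the-envelope math as a
small Python sketch. The 20TB/week, 2-week expiry, and RF=3 figures come
from Arun's question below; the ~2x on-disk overhead factor and the 50%
compaction headroom are the rules of thumb from this thread, not exact
Cassandra constants.

    # Rough Cassandra capacity estimate; factors are thread rules of thumb.
    raw_tb_per_week = 20        # new data written per week
    retention_weeks = 2         # TTL expires data after two weeks
    replication_factor = 3

    # Live replicated data: 20 * 2 * 3 = 120 TB
    live_tb = raw_tb_per_week * retention_weeks * replication_factor

    # Storage-format overhead: ~2x, per the 120TB -> 250TB estimate above
    overhead_factor = 250 / 120
    on_disk_tb = live_tb * overhead_factor   # ~250 TB

    # Keep ~50% of disk free for compaction, so double the provisioned space
    compaction_headroom = 2.0
    total_tb = on_disk_tb * compaction_headroom  # ~500 TB

    print(f"live data:         {live_tb:.0f} TB")
    print(f"on disk w/overhead: {on_disk_tb:.0f} TB")
    print(f"total provisioned:  {total_tb:.0f} TB")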


On Wed, Jun 29, 2011 at 9:17 AM, Ryan King <r...@twitter.com> wrote:

> On Wed, Jun 29, 2011 at 5:36 AM, Jacob, Arun <arun.ja...@disney.com>
> wrote:
> > if I'm planning to store 20TB of new data per week, and expire all data
> > every 2 weeks, with a replication factor of 3, do I only need
> > approximately 120 TB of disk? I'm going to use TTL in my column values
> > to automatically expire data. Or would I need more capacity to handle
> > sstable merges? Given this amount of data, would you recommend node
> > storage at 2TB per node or more? This application will have a heavy
> > write/moderate read use profile.
>
> You'll need extra space for both compaction and the overhead in the
> storage format.
>
> As to the amount of storage per node, that depends on your latency and
> throughput requirements.
>
> -ryan
>
