Hello Marcus,

> We want to store bigger amounts of data (> 30mio rows containing blobs)

This should be perfectly fine for a partition. We usually recommend up to
100 MB per partition, which is a soft limit, as some use cases work very
well with bigger partitions.

> which will be deleted depending on the type of data on a monthly basis
> Some data would survive for two months only, other data for 3-5 years.

Do you need to store all these distinct types of data in the same table? If
not, you could make very nice use of TWCS with a fixed TTL per table,
expiring whole SSTables on old buckets for example. A colleague wrote about
this topic and I believe it could be a good fit in this case:
http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html.
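
Just as a rough sketch (table and column names are made up here, and the
window size is something to tune so you end up with a reasonable number of
windows across the TTL), a table holding only the 2-month data could look
like:

    CREATE TABLE short_lived_blobs (
        id uuid PRIMARY KEY,
        payload blob
    ) WITH default_time_to_live = 5184000  -- 60 days, expressed in seconds
      AND compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'DAYS',
        'compaction_window_size': '2'
      };

Once a whole window is past its TTL, Cassandra can drop the fully expired
SSTables instead of having to compact the data away.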

> which will allow a single delete command, followed by a compaction

Be careful with deletes and tombstones in Cassandra; it is sadly not that
straightforward:
- Tombstones must be replicated to all replicas to prevent resurrecting
deleted data (I have heard people call them 'zombies' or 'ghosts').
- Cassandra waits for gc_grace_seconds after the data expires before a
compaction can actually remove the data.
- All the data shadowed by a tombstone, and the tombstone itself, must be
part of the same compaction for the data to actually be evicted.
- ...

It's a tricky topic. I wrote about it last year as there were a lot of
questions about tombstones:
thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html. I hope
it will be of some use to you.
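
For reference, gc_grace_seconds is a per-table setting that defaults to
864000 seconds (10 days), and repairs need to complete within that window
to be safe from resurrected data. If you ever tune it, it is a simple
ALTER (keyspace/table names here are hypothetical):

    -- Only lower this if repairs reliably finish within the new window,
    -- or deleted data may come back.
    ALTER TABLE my_keyspace.short_lived_blobs
        WITH gc_grace_seconds = 432000;  -- 5 days, just as an illustration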

> or alternatively to have multiple tables (one per month when the deletion
> process would just drop the table).
> The logic to retrieve that data is per record, so we know both the
> retention period and the id (uuid) of the addressed record,
> so multiple tables can be handled.

I would try to break things down by data workflow (or data type in your
case) rather than using one table per time bucket. TWCS might then be a
very good alternative, probably way easier to handle and to reason about.

Ideally, you want to make sure, now in the design phase, that no read
request can ever touch a tombstone, as much as possible. You can for
example make the day part of the partition key (not the day alone: it
should be mixed with the natural identifier to form a composite partition
key, otherwise all the writes for one entire day would typically go to
only RF nodes...).
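
As a rough sketch of that idea (names are invented, and the bucket
granularity is something to choose based on your read patterns):

    CREATE TABLE records_by_day (
        day date,     -- time bucket part of the partition key
        id uuid,      -- the natural identifier you already know at read time
        payload blob,
        PRIMARY KEY ((day, id))  -- composite partition key: spreads a whole
                                 -- day's writes across the cluster
    );

    -- Reads then need both parts of the partition key:
    SELECT payload FROM records_by_day WHERE day = '2018-02-01' AND id = ?;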

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com




2018-02-01 8:16 GMT+00:00 Marcus Haarmann <marcus.haarm...@midoco.de>:

> Hi experts,
>
> I have a design issue here:
> We want to store bigger amounts of data (> 30mio rows containing blobs)
> which will be deleted depending on the type
> of data on a monthly basis (not in the same order as the data entered the
> system).
> Some data would survive for two months only, other data for 3-5 years.
>
> The choice now is to have one table only with TTL per partition and
> partitions per deletion month (when the data should be deleted)
> which will allow a single delete command, followed by a compaction
> or alternatively to have multiple tables (one per month when the deletion
> process would just drop the table).
> The logic to retrieve that data is per record, so we know both the
> retention period and the id (uuid) of the addressed record,
> so multiple tables can be handled.
>
> Since it would be one table per deletion month, I do not expect more than
> 1000-2000 tables, depending on the
> retention period of the data.
>
> The benefit of creating multiple tables would be that there are no
> tombstones, while more tables take more memory in the nodes.
> The one table approach would make the compaction process take longer and
> produce more I/O activity because
> the compaction would rewrite multiple SSTables internally.
>
> Any thoughts on this ?
> We want to use 9 nodes, cassandra 3.11 on Linux, total data amount
> expected ~15-20 TB.
>
> Thank you very much,
>
> Marcus Haarmann
>
