Hi,

As DuyHai said, different TTLs could theoretically be set for different
cells of the same row. And one TTLed cell could be shadowing another cell
that has no TTL (say you forgot to set a TTL and set one afterwards by
performing an update), or vice versa.
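
For instance, here is a minimal cqlsh sketch of the "forgot the TTL, set it
later" case (reusing the items table from the example further down in this
thread; the column values are made up):

cqlsh> insert into items(a, b, c1) values ('AAA', 'BBB', 'no ttl yet');
cqlsh> update items using ttl 60 set c1 = 'ttled now' where a = 'AAA' and b = 'BBB';

Once the 60 seconds are up, the expired cell still has to keep shadowing the
older non-TTLed value of c1, which is exactly what the per-cell tombstone does.
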
A cell could also be missing from a node without Cassandra knowing it. So
turning an incomplete row that only has expired cells into a row tombstone
could lead to wrong results being returned at read time: that row tombstone
could potentially shadow a valid live cell from another replica.

Cassandra needs to retain each TTLed cell and send it to replicas during
reads to cover all possible cases.
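
As a sketch of the mixed-TTL case DuyHai mentioned (again on the items table
from below, with made-up values):

cqlsh> insert into items(a, b, c1, c2) values ('AAA', 'CCC', 'X', 'Y') using ttl 60;
cqlsh> update items using ttl 3600 set c2 = 'Z' where a = 'AAA' and b = 'CCC';

c1 expires after a minute while c2 lives for an hour, so a single row-level
tombstone could not describe this row correctly; each cell has to carry its own
expiration.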


On Fri, Jan 12, 2018 at 5:28 PM Python_Max <python....@gmail.com> wrote:

> Thank you for response.
>
> I know about the option of setting a TTL per column or even per item in a
> collection. However, in my example the entire row has expired; shouldn't
> Cassandra be able to detect this situation and spawn a single tombstone for
> the entire row instead of many?
> Is there any reason not to do this, other than that no one needs it? Is this
> suitable for a feature request or improvement?
>
> Thanks.
>
> On Wed, Jan 10, 2018 at 4:52 PM, DuyHai Doan <doanduy...@gmail.com> wrote:
>
>> "The question is why Cassandra creates a tombstone for every column
>> instead of single tombstone per row?"
>>
>> --> Simply because technically it is possible to set a different TTL value
>> on each column of a CQL row
>>
>> On Wed, Jan 10, 2018 at 2:59 PM, Python_Max <python....@gmail.com> wrote:
>>
>>> Hello, C* users and experts.
>>>
>>> I have (one more) question about tombstones.
>>>
>>> Consider the following example:
>>> cqlsh> create keyspace test_ttl with replication = {'class':
>>> 'SimpleStrategy', 'replication_factor': '1'}; use test_ttl;
>>> cqlsh> create table items(a text, b text, c1 text, c2 text, c3 text,
>>> primary key (a, b));
>>> cqlsh> insert into items(a,b,c1,c2,c3) values('AAA', 'BBB', 'C111',
>>> 'C222', 'C333') using ttl 60;
>>> bash$ nodetool flush
>>> bash$ sleep 60
>>> bash$ nodetool compact test_ttl items
>>> bash$ sstabledump mc-2-big-Data.db
>>>
>>> [
>>>   {
>>>     "partition" : {
>>>       "key" : [ "AAA" ],
>>>       "position" : 0
>>>     },
>>>     "rows" : [
>>>       {
>>>         "type" : "row",
>>>         "position" : 58,
>>>         "clustering" : [ "BBB" ],
>>>         "liveness_info" : { "tstamp" : "2018-01-10T13:29:25.777Z", "ttl"
>>> : 60, "expires_at" : "2018-01-10T13:30:25Z", "expired" : true },
>>>         "cells" : [
>>>           { "name" : "c1", "deletion_info" : { "local_delete_time" :
>>> "2018-01-10T13:29:25Z" }
>>>           },
>>>           { "name" : "c2", "deletion_info" : { "local_delete_time" :
>>> "2018-01-10T13:29:25Z" }
>>>           },
>>>           { "name" : "c3", "deletion_info" : { "local_delete_time" :
>>> "2018-01-10T13:29:25Z" }
>>>           }
>>>         ]
>>>       }
>>>     ]
>>>   }
>>> ]
>>>
>>> The question is why Cassandra creates a tombstone for every column
>>> instead of a single tombstone per row?
>>>
>>> In a production environment I have a table with ~30 columns, and it gives
>>> me a warning for 30k tombstones and 300 live rows. That is 30 times more
>>> than it could be.
>>> Can this behavior be tuned in some way?
>>>
>>> Thanks.
>>>
>>> --
>>> Best regards,
>>> Python_Max.
>>>
>>
>>
>
>
> --
> Best regards,
> Python_Max.
>


-- 
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
