Hi Ayub,
The way tombstones are counted is fairly involved. The JIRA tickets contain the discussions and the current state of things, and there are several tickets to study. Maybe start with CASSANDRA-14149: https://issues.apache.org/jira/browse/CASSANDRA-14149 and follow the discussions back to the earlier tickets. Enjoy!

From: Ayub M [mailto:hia...@gmail.com]
Sent: Saturday, February 23, 2019 4:36 AM
To: user@cassandra.apache.org
Subject: Re: tombstones threshold warning

Thanks Ken. Investigating further, I found that the tombstones I am seeing come from null values in the collection objects. Tombstones are also written when initial collection values are inserted, but it seems they are not counted towards the threshold warning and do not show up in tracing. I ran these tests on Cassandra 3.11.3.

Record inserted with values in collection objects - I see tombstones but they do not show up in tracing.

"partition" : {
  "key" : [ "e7cd5752-bc0d-4157-a80f-7523add8dbcd" ],
  "position" : 121
},
"rows" : [
  {
    "type" : "row",
    "position" : 528,
    "clustering" : [ "ELVIS" ],
    "liveness_info" : { "tstamp" : "2019-02-21T21:34:06.574848Z" },
    "cells" : [
      { "name" : "frozen_race", "value" : {"race_title": "Ronde van Gelderland", "race_date": "2015-04-19 00:00:00.000Z", "race_time": "03:22:23"} },
      { "name" : "basics_udt", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:06.574847Z", "local_delete_time" : "2019-02-21T21:34:06Z" } },
      { "name" : "basics_udt", "path" : [ "birthday" ], "value" : "1993-06-18 00:00:00.000Z" },
      { "name" : "basics_udt", "path" : [ "nationality" ], "value" : "New Zealand" },
      { "name" : "events_list", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:06.574847Z", "local_delete_time" : "2019-02-21T21:34:06Z" } },
      { "name" : "events_list", "path" : [ "6ab41176-3620-11e9-81ac-87caa6eca935" ], "value" : "list-element1" },
      { "name" : "events_list", "path" : [ "6ab41177-3620-11e9-81ac-87caa6eca935" ], "value" : "list-element2" },
      { "name" : "frozen_race_list", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:06.574847Z", "local_delete_time" : "2019-02-21T21:34:06Z" } },
      { "name" : "frozen_race_list", "path" : [ "6ab41178-3620-11e9-81ac-87caa6eca935" ], "value" : {"race_title": "Rabobank 7-Dorpenomloop Aalburg", "race_date": "2015-05-09 00:00:00.000Z", "race_time": "02:58:33"} },
      { "name" : "frozen_race_list", "path" : [ "6ab41179-3620-11e9-81ac-87caa6eca935" ], "value" : {"race_title": "Ronde van Gelderland", "race_date": "2015-04-19 00:00:00.000Z", "race_time": "03:22:23"} },
      { "name" : "teams_map", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:06.574847Z", "local_delete_time" : "2019-02-21T21:34:06Z" } },
      { "name" : "teams_map", "path" : [ "1" ], "value" : "map-value1" },
      { "name" : "teams_map", "path" : [ "2" ], "value" : "map-value2" },
      { "name" : "teams_set", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:06.574847Z", "local_delete_time" : "2019-02-21T21:34:06Z" } },
      { "name" : "teams_set", "path" : [ "set-element1" ], "value" : "" },
      { "name" : "teams_set", "path" : [ "set-element2" ], "value" : "" }

cassandra@cqlsh:dev_ticket> select * from collsndudt where id = e7cd5752-bc0d-4157-a80f-7523add8dbcd and lastname = 'ELVIS';

 id | lastname | basics_udt | events_list | frozen_race | frozen_race_list | teams_map | teams_set
----+----------+------------+-------------+-------------+------------------+-----------+-----------
 e7cd5752-bc0d-4157-a80f-7523add8dbcd | ELVIS | {birthday: '1993-06-18 00:00:00.000000+0000', nationality: 'New Zealand', height: null, weight: null} | ['list-element1', 'list-element2'] | {race_title: 'Ronde van Gelderland', race_date: '2015-04-19 00:00:00.000000+0000', race_time: '03:22:23'} | [{race_title: 'Rabobank 7-Dorpenomloop Aalburg', race_date: '2015-05-09 00:00:00.000000+0000', race_time: '02:58:33'}, {race_title: 'Ronde van Gelderland', race_date: '2015-04-19 00:00:00.000000+0000', race_time: '03:22:23'}] | {1: 'map-value1', 2: 'map-value2'} | {'set-element1', 'set-element2'}

(1 rows)

Tracing session: aa82f2c0-3621-11e9-81ac-87caa6eca935

 activity | timestamp | source | source_elapsed | client
----------+-----------+--------+----------------+--------
 Execute CQL3 query | 2019-02-21 21:43:03.404000 | 10.216.87.180 | 0 | 127.0.0.1
 Parsing select * from collsndudt where id = e7cd5752-bc0d-4157-a80f-7523add8dbcd and lastname = 'ELVIS'; [CoreThread-8] | 2019-02-21 21:43:03.404000 | 10.216.87.180 | 60 | 127.0.0.1
 Preparing statement [CoreThread-8] | 2019-02-21 21:43:03.404000 | 10.216.87.180 | 182 | 127.0.0.1
 Reading data from [/10.216.87.180] [CoreThread-8] | 2019-02-21 21:43:03.404000 | 10.216.87.180 | 328 | 127.0.0.1
 Executing single-partition query on collsndudt [CoreThread-3] | 2019-02-21 21:43:03.404001 | 10.216.87.180 | 401 | 127.0.0.1
 Acquiring sstable references [CoreThread-3] | 2019-02-21 21:43:03.404001 | 10.216.87.180 | 401 | 127.0.0.1
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [CoreThread-3] | 2019-02-21 21:43:03.404001 | 10.216.87.180 | 481 | 127.0.0.1
 Merged data from memtables and 1 sstables [CoreThread-3] | 2019-02-21 21:43:03.404001 | 10.216.87.180 | 617 | 127.0.0.1
 Read 1 live rows and 0 tombstone cells [CoreThread-3] | 2019-02-21 21:43:03.404001 | 10.216.87.180 | 617 | 127.0.0.1
 Request complete | 2019-02-21 21:43:03.404741 | 10.216.87.180 | 741 | 127.0.0.1

Here null values were inserted into the collection columns and they do show up in tracing -

"partition" : {
  "key" : [ "cb07baad-eac8-4f65-b28a-bddc06a0de23" ],
  "position" : 0
},
"rows" : [
  {
    "type" : "row",
    "position" : 76,
    "clustering" : [ "ADAMS" ],
    "liveness_info" : { "tstamp" : "2019-02-21T21:34:20.024606Z" },
    "cells" : [
      { "name" : "frozen_race", "deletion_info" : { "local_delete_time" : "2019-02-21T21:34:20Z" } },
      { "name" : "basics_udt", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:20.024605Z", "local_delete_time" : "2019-02-21T21:34:20Z" } },
      { "name" : "events_list", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:20.024605Z", "local_delete_time" : "2019-02-21T21:34:20Z" } },
      { "name" : "frozen_race_list", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:20.024605Z", "local_delete_time" : "2019-02-21T21:34:20Z" } },
      { "name" : "teams_map", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:20.024605Z", "local_delete_time" : "2019-02-21T21:34:20Z" } },
      { "name" : "teams_set", "deletion_info" : { "marked_deleted" : "2019-02-21T21:34:20.024605Z", "local_delete_time" : "2019-02-21T21:34:20Z" } }

cassandra@cqlsh:dev_ticket> select * from collsndudt where id = cb07baad-eac8-4f65-b28a-bddc06a0de23;

 id | lastname | basics_udt | events_list | frozen_race | frozen_race_list | teams_map | teams_set
----+----------+------------+-------------+-------------+------------------+-----------+-----------
 cb07baad-eac8-4f65-b28a-bddc06a0de23 | ADAMS | null | null | null | null | null | null

Tracing session: 63b75250-3621-11e9-81ac-87caa6eca935

 activity | timestamp | source | source_elapsed | client
----------+-----------+--------+----------------+--------
 Execute CQL3 query | 2019-02-21 21:41:04.629000 | 10.216.87.180 | 0 | 127.0.0.1
 Parsing select * from collsndudt where id = cb07baad-eac8-4f65-b28a-bddc06a0de23; [CoreThread-8] | 2019-02-21 21:41:04.629000 | 10.216.87.180 | 118 | 127.0.0.1
 Preparing statement [CoreThread-8] | 2019-02-21 21:41:04.629000 | 10.216.87.180 | 177 | 127.0.0.1
 Reading data from [/10.216.87.180] [CoreThread-8] | 2019-02-21 21:41:04.629000 | 10.216.87.180 | 318 | 127.0.0.1
 Executing single-partition query on collsndudt [CoreThread-6] | 2019-02-21 21:41:04.629001 | 10.216.87.180 | 460 | 127.0.0.1
 Acquiring sstable references [CoreThread-6] | 2019-02-21 21:41:04.629001 | 10.216.87.180 | 460 | 127.0.0.1
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [CoreThread-6] | 2019-02-21 21:41:04.630000 | 10.216.87.180 | 543 | 127.0.0.1
 Merged data from memtables and 1 sstables [CoreThread-6] | 2019-02-21 21:41:04.630000 | 10.216.87.180 | 611 | 127.0.0.1
 Read 1 live rows and 1 tombstone cells [CoreThread-6] | 2019-02-21 21:41:04.630000 | 10.216.87.180 | 611 | 127.0.0.1
 Request complete | 2019-02-21

When it reports 1 tombstone cell, does it mean 1 record? Otherwise it should have read more than one tombstone cell.

On Wed, Feb 20, 2019 at 1:30 AM Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:

There is another good article called Common Problems with Cassandra Tombstones by Alla Babkina at https://opencredo.com/blogs/cassandra-tombstones-common-issues/ . It says interesting stuff like:

1. You can get tombstones without deleting anything
2. Inserting null values causes tombstones
3. Inserting values into collection columns results in tombstones even if you never delete a value
4. Expiring data with TTL results in tombstones (of course)
5. The Invisible Column Ranges Tombstones - resolved in CASSANDRA-11166 though; I should have said that in the last email too, not CASSANDRA-8527. But it shouldn't be this since you are on 3.11.3.

I think number three above answers your questions based on your original post. See the article for the details.
It's really good.

Kenneth Brotman

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com]
Sent: Tuesday, February 19, 2019 10:12 PM
To: 'user@cassandra.apache.org'
Subject: RE: tombstones threshold warning

Hi Ayub,

Is everything flushing to SSTables? It has to be somewhere, right? So is it in the memtables?

Or is it that there are tombstones that are sometimes detected and sometimes not detected, as described in the very detailed article on The Last Pickle by Alex Dejanovski called Undetectable tombstones in Apache Cassandra: http://thelastpickle.com/blog/2018/07/05/undetectable-tombstones-in-apache-cassandra.html .

I thought that was resolved in 3.11.2 by CASSANDRA-8527; and you are running 3.11.3! Is there still an outstanding issue?

Kenneth Brotman

From: Ayub M [mailto:hia...@gmail.com]
Sent: Saturday, February 16, 2019 9:58 PM
To: user@cassandra.apache.org
Subject: tombstones threshold warning

In the logs I see the tombstone threshold warning:

Read 411 live rows and 1644 tombstone cells for query SELECT * FROM ks.tbl WHERE key = XYZ LIMIT 5000 (see tombstone_warn_threshold)

This is Cassandra 3.11.3. There are 2 sstables for this table and the partition XYZ exists in only one file. I dumped this sstable to JSON using sstabledump and extracted the data for only this partition, and I see only 411 rows in it. All of them are active/live records, so I do not understand where these tombstones are coming from.

This table has collection columns, and there are cell tombstones for the collection columns from when they were inserted. Do collection cell tombstones get counted as tombstone cells in the warning displayed?

I did a small test to see if collection tombstones are counted as tombstones, and it does not seem so. So I am wondering where those tombstones are coming from in my query above.

CREATE TABLE tbl (
    col1 text,
    col2 text,
    c1 int,
    col3 map<text, text>,
    PRIMARY KEY (col1, col2)
) WITH CLUSTERING ORDER BY (col2 ASC);

cassandra@cqlsh:dev_test> insert into tbl (col1 , col2 , c1, col3 ) values('3','3',3,{'key':'value'});
cassandra@cqlsh:dev_test> select * from tbl where col1 = '3';

 col1 | col2 | c1 | col3
------+------+----+------------------
    3 |    3 |  3 | {'key': 'value'}

(1 rows)

Tracing session: 4c2a1894-3151-11e9-838d-29ed5fcf59ee

 activity | timestamp | source | source_elapsed | client
----------+-----------+--------+----------------+--------
 Execute CQL3 query | 2019-02-15 18:41:25.145000 | 10.216.1.1 | 0 | 127.0.0.1
 Parsing select * from tbl where col1 = '3'; [CoreThread-3] | 2019-02-15 18:41:25.145000 | 10.216.1.1 | 177 | 127.0.0.1
 Preparing statement [CoreThread-3] | 2019-02-15 18:41:25.145001 | 10.216.1.1 | 295 | 127.0.0.1
 Reading data from [/10.216.1.1] [CoreThread-3] | 2019-02-15 18:41:25.146000 | 10.216.1.1 | 491 | 127.0.0.1
 Executing single-partition query on tbl [CoreThread-2] | 2019-02-15 18:41:25.146000 | 10.216.1.1 | 770 | 127.0.0.1
 Acquiring sstable references [CoreThread-2] | 2019-02-15 18:41:25.146000 | 10.216.1.1 | 897 | 127.0.0.1
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [CoreThread-2] | 2019-02-15 18:41:25.146000 | 10.216.1.1 | 1096 | 127.0.0.1
 Merged data from memtables and 1 sstables [CoreThread-2] | 2019-02-15 18:41:25.146000 | 10.216.1.1 | 1235 | 127.0.0.1
 Read 1 live rows and 0 tombstone cells [CoreThread-2] | 2019-02-15 18:41:25.146000 | 10.216.1.1 | 1317 | 127.0.0.1
 Request complete | 2019-02-15 18:41:25.146529 | 10.216.1.1 | 1529 | 127.0.0.1

[root@localhost tbl-8aaa6bc1315011e991e523330936276b]# sstabledump aa-1-bti-Data.db
[
  {
    "partition" : {
      "key" : [ "3" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 41,
        "clustering" : [ "3" ],
        "liveness_info" : { "tstamp" : "2019-02-15T18:36:16.838103Z" },
        "cells" : [
          { "name" : "c1", "value" : 3 },
          { "name" : "col3", "deletion_info" : { "marked_deleted" : "2019-02-15T18:36:16.838102Z", "local_delete_time" : "2019-02-15T18:36:17Z" } },
          { "name" : "col3", "path" : [ "key" ], "value" : "value" }
        ]
      }
    ]
  }
]

--
Regards,
Ayub