Re: Overwhelming tombstones with LCS

2015-07-12 Thread Roman Tkachenko
Hi Dan,

Thanks for the reply. We're on 2.0.13. In fact, I already solved this
exactly the way you described - by changing compaction strategy to STCS via
JMX and letting compactions collect tombstones.

Roman


On Fri, Jul 10, 2015 at 8:57 PM, Dan Kinder dkin...@turnitin.com wrote:


 On Sun, Jul 5, 2015 at 1:40 PM, Roman Tkachenko ro...@mailgunhq.com
 wrote:

 Hey guys,

 I have a table with RF=3 and LCS. Data model makes use of wide rows. A
 certain query run against this table times out and tracing reveals the
 following error on two out of three nodes:

 *Scanned over 10 tombstones; query aborted (see
 tombstone_failure_threshold)*

 This basically means every request with CL higher than one fails.

 I have two questions:

 * How could it happen that only two out of three nodes have overwhelming
 tombstones? For the third node tracing shows sensible *Read 815 live
 and 837 tombstoned cells* traces.


 One theory: before 2.1.6 compactions on wide rows with lots of tombstones
 could take forever or potentially never finish. What version of Cassandra
 are you on? It may be that you got lucky with one node that has been able
 to keep up but the others haven't been able to.



 * Anything I can do to fix those two nodes? I have already set gc_grace
 to 1 day and tried to make compaction strategy more aggressive
 (unchecked_tombstone_compaction - true, tombstone_threshold - 0.01) to no
 avail - a couple of days have already passed and it still gives the same
 error.


 You probably want major compaction, which is coming soon for LCS
 (https://issues.apache.org/jira/browse/CASSANDRA-7272) but is not here yet.

 The alternative is, if you have enough time and headroom (this is going to
 do some pretty serious compaction so be careful), alter your table to STCS,
 let it compact into one SSTable, then convert back to LCS. It's pretty
 heavy-handed but as long as your gc_grace is low enough it'll do the job.
 Definitely do NOT do this if you have many tombstones in single wide rows
 and are not on 2.1.6 or later.



 Thanks!

 Roman




 --
 Dan Kinder
 Senior Software Engineer
 Turnitin – www.turnitin.com
 dkin...@turnitin.com



Re: Overwhelming tombstones with LCS

2015-07-10 Thread Dan Kinder
On Sun, Jul 5, 2015 at 1:40 PM, Roman Tkachenko ro...@mailgunhq.com wrote:

 Hey guys,

 I have a table with RF=3 and LCS. Data model makes use of wide rows. A
 certain query run against this table times out and tracing reveals the
 following error on two out of three nodes:

 *Scanned over 10 tombstones; query aborted (see
 tombstone_failure_threshold)*

 This basically means every request with CL higher than one fails.
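
To illustrate the arithmetic behind that statement, here is a minimal sketch. It assumes the two nodes that abort the query cannot answer at all, and the function names are made up for this example (they are not part of any Cassandra driver).

```python
def required_replicas(consistency_level: str, replication_factor: int) -> int:
    """Number of replica responses a read needs at the given consistency level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }
    return levels[consistency_level]


def read_can_succeed(consistency_level: str, replication_factor: int,
                     healthy_replicas: int) -> bool:
    """True if enough replicas are able to answer; ignores which replica is picked."""
    return healthy_replicas >= required_replicas(consistency_level, replication_factor)


RF = 3       # replication factor from the post
HEALTHY = 1  # only one of the three nodes returns a result

for cl in ("ONE", "QUORUM", "ALL"):
    print(cl, read_can_succeed(cl, RF, HEALTHY))
```

With RF=3, QUORUM needs two responses, so a single healthy replica is only enough for CL ONE, and even then only when the read lands on that replica.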

 I have two questions:

 * How could it happen that only two out of three nodes have overwhelming
 tombstones? For the third node tracing shows sensible *Read 815 live and
 837 tombstoned cells* traces.


One theory: before 2.1.6 compactions on wide rows with lots of tombstones
could take forever or potentially never finish. What version of Cassandra
are you on? It may be that you got lucky with one node that has been able
to keep up but the others haven't been able to.



 * Anything I can do to fix those two nodes? I have already set gc_grace to
 1 day and tried to make compaction strategy more aggressive
 (unchecked_tombstone_compaction - true, tombstone_threshold - 0.01) to no
 avail - a couple of days have already passed and it still gives the same
 error.
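
For reference, the settings described above could be expressed as a single CQL statement. The sketch below only builds the statement text; the keyspace and table names are placeholders, and the helper function is hypothetical.

```python
def tombstone_tuning_cql(keyspace: str, table: str,
                         gc_grace_seconds: int,
                         tombstone_threshold: float) -> str:
    """Build an ALTER TABLE statement with the tombstone-related options from the post."""
    compaction = {
        "class": "LeveledCompactionStrategy",
        "unchecked_tombstone_compaction": "true",
        "tombstone_threshold": str(tombstone_threshold),
    }
    # Render the compaction options as a CQL map literal.
    options = ", ".join(f"'{key}': '{value}'" for key, value in compaction.items())
    return (f"ALTER TABLE {keyspace}.{table} "
            f"WITH gc_grace_seconds = {gc_grace_seconds} "
            f"AND compaction = {{{options}}};")


ONE_DAY_SECONDS = 24 * 60 * 60  # gc_grace of 1 day

# "my_ks" and "my_table" are placeholder names, not taken from the thread.
print(tombstone_tuning_cql("my_ks", "my_table", ONE_DAY_SECONDS, 0.01))
```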


You probably want major compaction, which is coming soon for LCS
(https://issues.apache.org/jira/browse/CASSANDRA-7272) but is not here yet.

The alternative is, if you have enough time and headroom (this is going to
do some pretty serious compaction so be careful), alter your table to STCS,
let it compact into one SSTable, then convert back to LCS. It's pretty
heavy-handed but as long as your gc_grace is low enough it'll do the job.
Definitely do NOT do this if you have many tombstones in single wide rows
and are not on 2.1.6 or later.
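
A rough sketch of that STCS round trip, written as a small helper that prints the CQL steps in order. The keyspace and table names are placeholders and the helper functions are hypothetical. Note that ALTER TABLE changes the schema cluster-wide, and that it replaces the whole compaction map, so any LCS sub-options you rely on would need to be added back in the last step.

```python
STCS = "SizeTieredCompactionStrategy"
LCS = "LeveledCompactionStrategy"


def set_strategy_cql(keyspace: str, table: str, strategy_class: str) -> str:
    """Build the ALTER TABLE statement that switches the compaction strategy."""
    return (f"ALTER TABLE {keyspace}.{table} "
            f"WITH compaction = {{'class': '{strategy_class}'}};")


def round_trip_plan(keyspace: str, table: str) -> list:
    """Ordered steps: switch to STCS, wait for compaction, switch back to LCS."""
    return [
        set_strategy_cql(keyspace, table, STCS),
        "-- wait for compactions to finish (e.g. watch `nodetool compactionstats`)",
        set_strategy_cql(keyspace, table, LCS),
    ]


# "my_ks" and "my_table" are placeholder names, not taken from the thread.
for step in round_trip_plan("my_ks", "my_table"):
    print(step)
```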



 Thanks!

 Roman




-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com