[ 
https://issues.apache.org/jira/browse/CASSANDRA-6563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993861#comment-13993861
 ] 

Paulo Ricardo Motta Gomes commented on CASSANDRA-6563:
------------------------------------------------------

You're right, dropping the checks increases necessary I/O but also unnecessary 
I/O. I made a comparison of the compacted bytes for 2 ranges (with and without 
patch). It's possible to see a huge increase of useful compactions, but also a 
relevant increase of useless compacted bytes (even though the compactions 
haven't stabilized yet).

Subtitle:

* compact+: compacted bytes from useful compactions (compaction ratio < 100%)
* tombstone+: compacted bytes from useful tombstone compactions (compaction 
ratio < 100%)
* compact-: compacted bytes from useless compactions (compaction ratio = 100%)
* tombstone-: compacted bytes from useless compactions (compaction ratio = 100%)

*Unpatched vs Patched compacted bytes* (range 1)

!unpatched1-compacted-bytes.png|align=left, height="50%, width="50%"! 
!patched1-compacted-bytes.png|align=right, height="50%, width="50%"!

*Unpatched vs Patched compacted bytes* (range 2)

!unpatched2-compacted-bytes.png|align=left, height="50%, width="50%"! 
!patched2-compacted-bytes.png|align=right, height="50%, width="50%"!

What we want here is to increase the number of useful tombstone compactions 
without increasing the number of useless ones. I will implement a new patch 
that only takes older sstables into account to check for overlap and compare 
the results with the previous patch to see if we can achieve good results. 
Since most of our tombstone data is time-series TTLed data, I believe that this 
will have a good effect.

> TTL histogram compactions not triggered at high "Estimated droppable 
> tombstones" rate
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6563
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6563
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 1.2.12ish
>            Reporter: Chris Burroughs
>            Assignee: Paulo Ricardo Motta Gomes
>             Fix For: 1.2.17, 2.0.8
>
>         Attachments: 1.2.16-CASSANDRA-6563.txt, 2.0.7-CASSANDRA-6563.txt, 
> patched-droppadble-ratio.png, patched-storage-load.png, 
> patched1-compacted-bytes.png, patched2-compacted-bytes.png, 
> unpatched-droppable-ratio.png, unpatched-storage-load.png, 
> unpatched1-compacted-bytes.png, unpatched2-compacted-bytes.png
>
>
> I have several column families in a largish cluster where virtually all 
> columns are written with a (usually the same) TTL.  My understanding of 
> CASSANDRA-3442 is that sstables that have a high ( > 20%) estimated 
> percentage of droppable tombstones should be individually compacted.  This 
> does not appear to be occurring with size tired compaction.
> Example from one node:
> {noformat}
> $ ll /data/sstables/data/ks/Cf/*Data.db
> -rw-rw-r-- 31 cassandra cassandra 26651211757 Nov 26 22:59 
> /data/sstables/data/ks/Cf/ks-Cf-ic-295562-Data.db
> -rw-rw-r-- 31 cassandra cassandra  6272641818 Nov 27 02:51 
> /data/sstables/data/ks/Cf/ks-Cf-ic-296121-Data.db
> -rw-rw-r-- 31 cassandra cassandra  1814691996 Dec  4 21:50 
> /data/sstables/data/ks/Cf/ks-Cf-ic-320449-Data.db
> -rw-rw-r-- 30 cassandra cassandra 10909061157 Dec 11 17:31 
> /data/sstables/data/ks/Cf/ks-Cf-ic-340318-Data.db
> -rw-rw-r-- 29 cassandra cassandra   459508942 Dec 12 10:37 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342259-Data.db
> -rw-rw-r--  1 cassandra cassandra      336908 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342307-Data.db
> -rw-rw-r--  1 cassandra cassandra     2063935 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342309-Data.db
> -rw-rw-r--  1 cassandra cassandra         409 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342314-Data.db
> -rw-rw-r--  1 cassandra cassandra    31180007 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342319-Data.db
> -rw-rw-r--  1 cassandra cassandra     2398345 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342322-Data.db
> -rw-rw-r--  1 cassandra cassandra       21095 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342331-Data.db
> -rw-rw-r--  1 cassandra cassandra       81454 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342335-Data.db
> -rw-rw-r--  1 cassandra cassandra     1063718 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342339-Data.db
> -rw-rw-r--  1 cassandra cassandra      127004 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342344-Data.db
> -rw-rw-r--  1 cassandra cassandra      146785 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342346-Data.db
> -rw-rw-r--  1 cassandra cassandra      697338 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342351-Data.db
> -rw-rw-r--  1 cassandra cassandra     3921428 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342367-Data.db
> -rw-rw-r--  1 cassandra cassandra      240332 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342370-Data.db
> -rw-rw-r--  1 cassandra cassandra       45669 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342374-Data.db
> -rw-rw-r--  1 cassandra cassandra    53127549 Dec 12 12:03 
> /data/sstables/data/ks/Cf/ks-Cf-ic-342375-Data.db
> -rw-rw-r-- 16 cassandra cassandra 12466853166 Dec 25 22:40 
> /data/sstables/data/ks/Cf/ks-Cf-ic-396473-Data.db
> -rw-rw-r-- 12 cassandra cassandra  3903237198 Dec 29 19:42 
> /data/sstables/data/ks/Cf/ks-Cf-ic-408926-Data.db
> -rw-rw-r--  7 cassandra cassandra  3692260987 Jan  3 08:25 
> /data/sstables/data/ks/Cf/ks-Cf-ic-427733-Data.db
> -rw-rw-r--  4 cassandra cassandra  3971403602 Jan  6 20:50 
> /data/sstables/data/ks/Cf/ks-Cf-ic-437537-Data.db
> -rw-rw-r--  3 cassandra cassandra  1007832224 Jan  7 15:19 
> /data/sstables/data/ks/Cf/ks-Cf-ic-440331-Data.db
> -rw-rw-r--  2 cassandra cassandra   896132537 Jan  8 11:05 
> /data/sstables/data/ks/Cf/ks-Cf-ic-447740-Data.db
> -rw-rw-r--  1 cassandra cassandra   963039096 Jan  9 04:59 
> /data/sstables/data/ks/Cf/ks-Cf-ic-449425-Data.db
> -rw-rw-r--  1 cassandra cassandra   232168351 Jan  9 10:14 
> /data/sstables/data/ks/Cf/ks-Cf-ic-450287-Data.db
> -rw-rw-r--  1 cassandra cassandra    73126319 Jan  9 11:28 
> /data/sstables/data/ks/Cf/ks-Cf-ic-450307-Data.db
> -rw-rw-r--  1 cassandra cassandra    40921916 Jan  9 12:08 
> /data/sstables/data/ks/Cf/ks-Cf-ic-450336-Data.db
> -rw-rw-r--  1 cassandra cassandra    60881193 Jan  9 12:23 
> /data/sstables/data/ks/Cf/ks-Cf-ic-450341-Data.db
> -rw-rw-r--  1 cassandra cassandra        4746 Jan  9 12:23 
> /data/sstables/data/ks/Cf/ks-Cf-ic-450350-Data.db
> -rw-rw-r--  1 cassandra cassandra        5769 Jan  9 12:23 
> /data/sstables/data/ks/Cf/ks-Cf-ic-450352-Data.db
> {noformat}
> {noformat}
> 295562: Estimated droppable tombstones: 0.899035828535183
> 296121: Estimated droppable tombstones: 0.9135080937806197
> 320449: Estimated droppable tombstones: 0.8916766879896414
> {noformat}
> I've checked in on this example node several times and compactionstats has 
> not shown any other activity that would be blocking the tombstone based 
> compaction.  The TTL is in the 15-20 day range so an sstable from November 
> should have had ample opportunities by January.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to