[ https://issues.apache.org/jira/browse/CASSANDRA-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484777#comment-13484777 ]
Sylvain Lebresne commented on CASSANDRA-4781:
---------------------------------------------

I believe you are right that this is a problem. But I think there is another problem in that computation (one that does not only affect small numbers of keys), namely in the estimation of the remaining columns:
{noformat}
long columns = sstable.getEstimatedColumnCount().percentile(remainingKeysRatio) * remainingKeys;
{noformat}
I think the use of percentile here is not correct. For instance, say the remainingKeysRatio is very high (say 99%), and say that your rows are such that you have many small rows and a handful (5%) of very big ones. In that case, percentile will give you the number of columns the very big rows have (it will give you a number such that 99% of the rows have fewer than this number of columns), and you'll end up with an estimate of columns that is way off (that is, you could end up with a number of remaining columns that is orders of magnitude bigger than the total number of columns). I believe we should simply use:
{noformat}
long columns = sstable.getEstimatedColumnCount().mean() * remainingKeys;
{noformat}
For the estimated key number, I'm fine with going with your solution, but an alternative would be to use a more conservative key estimate:
{noformat}
public long conservativeKeyEstimate()
{
    return indexSummary.getKeys().size() < 2 ? 1 : (indexSummary.getKeys().size() - 1) * DatabaseDescriptor.getIndexInterval();
}
{noformat}
The advantage is that this would always under-estimate the number of keys, while estimatedKeys() always over-estimates it. That seems a better option here because we don't have to choose a rather arbitrary minimum number of samples after which we consider the over-estimation "acceptable" in proportion.

But all this being said, and while we should definitely fix the things above, they will only make the estimate better; it is still an estimate.
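
To make the percentile concern concrete, here is a minimal, self-contained sketch with made-up numbers. It does not use Cassandra's histogram class: the nearest-rank `percentile` helper, the `skewedRows` distribution (95 rows of 10 columns and 5 rows of 10,000 columns), and the class name `ColumnEstimateSketch` are all assumptions for illustration only.

```java
import java.util.Arrays;

public class ColumnEstimateSketch {
    // Hypothetical nearest-rank percentile: the smallest column count such that
    // at least a fraction p of the rows have that many columns or fewer.
    // This is NOT Cassandra's bucketed histogram implementation.
    public static long percentile(long[] columnsPerRow, double p) {
        long[] sorted = columnsPerRow.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length);
        int index = Math.min(Math.max(rank - 1, 0), sorted.length - 1);
        return sorted[index];
    }

    // Plain arithmetic mean of the per-row column counts (expects a non-empty array).
    public static double mean(long[] columnsPerRow) {
        return (double) total(columnsPerRow) / columnsPerRow.length;
    }

    // True total number of columns across all rows.
    public static long total(long[] columnsPerRow) {
        long sum = 0;
        for (long c : columnsPerRow) {
            sum += c;
        }
        return sum;
    }

    // Made-up skewed distribution: many small rows and a handful (5%) of very big ones.
    public static long[] skewedRows() {
        long[] rows = new long[100];
        Arrays.fill(rows, 0, 95, 10L);
        Arrays.fill(rows, 95, 100, 10_000L);
        return rows;
    }

    public static void main(String[] args) {
        long[] rows = skewedRows();
        double remainingKeysRatio = 0.99;
        long remainingKeys = 99;

        long byPercentile = percentile(rows, remainingKeysRatio) * remainingKeys;
        long byMean = (long) (mean(rows) * remainingKeys);

        System.out.println("total columns:       " + total(rows));
        System.out.println("percentile estimate: " + byPercentile);
        System.out.println("mean estimate:       " + byMean);
    }
}
```

With this made-up distribution, the percentile-based figure comes out at 990,000 remaining columns against a true total of 50,950, while the mean-based figure is 50,440. The exact numbers depend entirely on the assumed distribution; the point is only that the percentile picks the size of the big rows and then multiplies it by nearly every key.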
So at least in theory, we could always end up in a case where the estimate thinks there are enough droppable tombstones, but in practice all the droppable tombstones are in overlapping ranges. That is why I'd suggest skipping the worthDroppingTombstones check for sstables that were compacted less than some time threshold ago (say gcGrace/4); using the creation time of the file is probably good enough to tell. After all, if an sstable has just been compacted and still has a high ratio of droppable tombstones, it is probably because those tombstones are in fact not droppable due to overlapping sstables.

> Sometimes Cassandra starts compacting system-shema_columns cf repeatedly until the node is killed
> -------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-4781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4781
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.2.0 beta 1
> Environment: Ubuntu 12.04, single-node Cassandra cluster
> Reporter: Aleksey Yeschenko
> Assignee: Yuki Morishita
> Fix For: 1.2.0 beta 2
>
> Attachments: 4781.txt
>
>
> Cassandra starts flushing system-schema_columns cf in a seemingly infinite loop:
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms.
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db')]
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,824 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.182486MB/s. Time: 20ms.
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,825 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db')]
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.096045MB/s. Time: 38ms.
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db')]
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.121657MB/s. Time: 30ms.
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db')]
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 239) Compacted to [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db,]. 3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s. Time: 18ms.
> INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db')]
> .........
> Don't know what's causing it. Don't know a way to predictably trigger this behaviour. It just happens sometimes.