[ 
https://issues.apache.org/jira/browse/CASSANDRA-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484777#comment-13484777
 ] 

Sylvain Lebresne commented on CASSANDRA-4781:
---------------------------------------------

I believe you are right that this is a problem. But I think there is another 
problem in that computation (one that does not only affect sstables with a 
small number of keys), namely in the estimation of remaining columns:
{noformat}
long columns = sstable.getEstimatedColumnCount().percentile(remainingKeysRatio) * remainingKeys;
{noformat}
I think the use of percentile here is not correct. For instance, say the 
remainingKeysRatio is very big (say 99%), and say that your rows are such that 
you have many small rows and a handful (5%) of very big ones. In that case, 
percentile will give you the number of columns the very big rows have (it will 
give you a number such that 99% of the rows have fewer than this number of 
columns), and you'll end up with an estimate of columns that is way off (that 
is, you could end up with a number of remaining columns that is orders of 
magnitude bigger than the total number of columns). I believe we should simply 
use:
{noformat}
long columns = sstable.getEstimatedColumnCount().mean() * remainingKeys;
{noformat}
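
To make the size of the error concrete, here is a minimal sketch of the two estimates on a skewed row distribution. It uses a plain sorted array as a stand-in for the histogram (not Cassandra's EstimatedHistogram), and the class name, helper names, and the numbers (1000 rows, 5% of them with 100,000 columns) are all assumptions chosen for illustration:

```java
import java.util.Arrays;

// Hypothetical stand-in for the sstable's column-count histogram: a plain
// sorted array of per-row column counts, not Cassandra's EstimatedHistogram.
final class ColumnEstimateSketch
{
    // Builds `rows` sorted counts: mostly `small`, with the last `bigRows` set to `big`.
    static long[] skewedRows(int rows, int bigRows, long small, long big)
    {
        long[] counts = new long[rows];
        Arrays.fill(counts, small);
        Arrays.fill(counts, rows - bigRows, rows, big);
        return counts;
    }

    static long total(long[] counts)
    {
        long sum = 0;
        for (long c : counts)
            sum += c;
        return sum;
    }

    // Nearest-rank percentile: a value such that about `ratio` of the rows
    // have at most that many columns. Expects the counts sorted ascending.
    static long percentile(long[] sortedCounts, double ratio)
    {
        int rank = (int) Math.ceil(ratio * sortedCounts.length);
        int index = Math.max(0, Math.min(sortedCounts.length - 1, rank - 1));
        return sortedCounts[index];
    }

    // Integer mean, truncated toward zero.
    static long mean(long[] counts)
    {
        return total(counts) / counts.length;
    }

    public static void main(String[] args)
    {
        // 950 rows with 10 columns, 50 rows (5%) with 100,000 columns.
        long[] counts = skewedRows(1000, 50, 10, 100_000);
        long remainingKeys = 990; // 99% of the keys remain

        long byPercentile = percentile(counts, 0.99) * remainingKeys;
        long byMean = mean(counts) * remainingKeys;

        System.out.println("total columns:       " + total(counts));
        System.out.println("percentile estimate: " + byPercentile);
        System.out.println("mean estimate:       " + byMean);
    }
}
```

With these made-up numbers the percentile-based estimate comes out at roughly twenty times the total number of columns in the sstable, while the mean-based one stays just below the total.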

For the estimated key number, I'm good with going with your solution, but an 
alternative one would be to use a more conservative estimated key number that 
would be:
{noformat}
public long conservativeKeyEstimate()
{
    return indexSummary.getKeys().size() < 2
         ? 1
         : (indexSummary.getKeys().size() - 1) * DatabaseDescriptor.getIndexInterval();
}
{noformat}
The advantage being that this would always under-estimate the number of keys, 
while estimatedKeys() always over-estimates it, which seems a better option here 
because we don't have to choose a rather arbitrary minimum number of samples 
after which we consider the over-estimation to be "acceptable" in proportion.
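
As a rough illustration of the two bounds, here is a self-contained sketch. It assumes that the index summary keeps one sample every indexInterval keys starting with the first key, and that the existing over-estimate is simply samples * indexInterval; both are assumptions on my part, as are the class name, the helper names, and the interval of 128 used in the example:

```java
// Sketch only: models the index summary as "one sample every indexInterval
// keys, starting with the first key". This sampling model is an assumption.
final class KeyEstimateSketch
{
    // Number of samples the assumed model would keep for `actualKeys` keys.
    static int sampleCount(long actualKeys, int indexInterval)
    {
        return (int) ((actualKeys + indexInterval - 1) / indexInterval);
    }

    // Assumed shape of the existing over-estimate: every sample is counted
    // as standing for a full interval of keys.
    static long overEstimate(int samples, int indexInterval)
    {
        return (long) samples * indexInterval;
    }

    // Same arithmetic as conservativeKeyEstimate() above: the last sample is
    // not counted as a full interval, so the result never exceeds the real count.
    static long conservativeEstimate(int samples, int indexInterval)
    {
        return samples < 2 ? 1 : (long) (samples - 1) * indexInterval;
    }

    public static void main(String[] args)
    {
        int indexInterval = 128; // assumed value for the example
        for (long actualKeys : new long[] { 3, 200, 1000 })
        {
            int samples = sampleCount(actualKeys, indexInterval);
            System.out.println(actualKeys + " keys: conservative="
                               + conservativeEstimate(samples, indexInterval)
                               + ", over=" + overEstimate(samples, indexInterval));
        }
    }
}
```

Under that model, a 3-key sstable like the one in this ticket has a single sample, so the over-estimate is a full interval while the conservative one is 1.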

But all this being said, and while we should definitely fix the things above, 
they will only make the estimation better; it is still an estimation. So at 
least in theory, we could always end up in a case where the estimate thinks 
there are enough droppable tombstones, but in practice all the droppable 
tombstones are in overlapping ranges. So I'd suggest skipping the 
worthDroppingTombstones check for sstables that were compacted less than some 
time threshold ago (say maybe gcGrace/4); using the creation time of the file 
to detect this is probably good enough. After all, if an sstable has just been 
compacted and still has a high ratio of droppable tombstones, it's probably 
because those are in fact not droppable due to overlapping sstables.
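
A minimal sketch of what such a guard could look like, assuming the sstable's creation time is available in milliseconds and gcGrace is expressed in seconds. The class, the method names, and the way the threshold is derived are hypothetical, not existing Cassandra code:

```java
// Hypothetical guard: all names here are made up for illustration.
final class RecentCompactionGuardSketch
{
    // True if the sstable was written less than gcGrace/4 ago.
    static boolean recentlyCompacted(long creationTimeMillis, long nowMillis, int gcGraceSeconds)
    {
        long thresholdMillis = gcGraceSeconds * 1000L / 4;
        return nowMillis - creationTimeMillis < thresholdMillis;
    }

    // The tombstone-ratio check would only be consulted when this returns true.
    static boolean shouldCheckDroppableTombstones(long creationTimeMillis, long nowMillis, int gcGraceSeconds)
    {
        return !recentlyCompacted(creationTimeMillis, nowMillis, gcGraceSeconds);
    }

    public static void main(String[] args)
    {
        int gcGraceSeconds = 864_000; // 10 days, used here as an example value
        long now = System.currentTimeMillis();
        long oneMinuteAgo = now - 60_000L;
        long threeDaysAgo = now - 3L * 24 * 60 * 60 * 1000;

        System.out.println("just compacted -> check? "
                           + shouldCheckDroppableTombstones(oneMinuteAgo, now, gcGraceSeconds));
        System.out.println("three days old -> check? "
                           + shouldCheckDroppableTombstones(threeDaysAgo, now, gcGraceSeconds));
    }
}
```

This would be enough to break the loop in the description, since each freshly written sstable would be left alone until the threshold has passed.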

                
> Sometimes Cassandra starts compacting system-shema_columns cf repeatedly 
> until the node is killed
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4781
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4781
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.0 beta 1
>         Environment: Ubuntu 12.04, single-node Cassandra cluster
>            Reporter: Aleksey Yeschenko
>            Assignee: Yuki Morishita
>             Fix For: 1.2.0 beta 2
>
>         Attachments: 4781.txt
>
>
> Cassandra starts flushing system-schema_columns cf in a seemingly infinite 
> loop:
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java 
> (line 239) Compacted to 
> [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db,].
>   3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s.  Time: 
> 18ms.
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,804 CompactionTask.java 
> (line 119) Compacting 
> [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32107-Data.db')]
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,824 CompactionTask.java 
> (line 239) Compacted to 
> [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db,].
>   3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.182486MB/s.  Time: 
> 20ms.
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,825 CompactionTask.java 
> (line 119) Compacting 
> [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32108-Data.db')]
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java 
> (line 239) Compacted to 
> [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db,].
>   3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.096045MB/s.  Time: 
> 38ms.
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,864 CompactionTask.java 
> (line 119) Compacting 
> [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32109-Data.db')]
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java 
> (line 239) Compacted to 
> [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db,].
>   3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.121657MB/s.  Time: 
> 30ms.
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,894 CompactionTask.java 
> (line 119) Compacting 
> [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32110-Data.db')]
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java 
> (line 239) Compacted to 
> [/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db,].
>   3,827 to 3,827 (~100% of original) bytes for 3 keys at 0.202762MB/s.  Time: 
> 18ms.
>  INFO [CompactionExecutor:7] 2012-10-09 17:55:46,914 CompactionTask.java 
> (line 119) Compacting 
> [SSTableReader(path='/var/lib/cassandra/data/system/schema_columns/system-schema_columns-ia-32111-Data.db')]
> .........
> Don't know what's causing it. Don't know a way to predictably trigger this 
> behaviour. It just happens sometimes.
