[ 
https://issues.apache.org/jira/browse/CASSANDRA-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269434#comment-14269434
 ] 

Sylvain Lebresne commented on CASSANDRA-8546:
---------------------------------------------

bq. Until then though the GapList change could be a minimal impact fix

Maybe, but how minimal can still be somewhat debated. I'll admit that I do feel 
like this is a bit of a edge case (tombstone heavy + reverse queries), and so 
while I'm 100% on board for fixing it eventually, my bar for risking other 
regressions is not terribly high. Typically, I had never heard of "brownies 
collections" and a very quick look at their change log shows that the 2 last 
versions fixed critical bugs in GapList. I mean, I'm sure it's good, but it's 
not entirely risk-free to add in a stable release either.

Please note that I'd have no problem with that patch for 3.0 as a stop-gap 
until we come up with an even better solution, but as it won't be so useful 
there, the question is wether we're fine with committing this to 2.1. 
Personally, I tend to be conservative when in doubt and so I'm leaning towards 
accepting that performance bottleneck for 2.1, waiting for 3.0 to fix it. Not a 
terribly strong opinion however.

> RangeTombstoneList becoming bottleneck on tombstone heavy tasks
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-8546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8546
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: 2.0.11 / 2.1
>            Reporter: Dominic Letz
>            Assignee: Joshua McKenzie
>             Fix For: 2.1.3
>
>         Attachments: cassandra-2.0.11-8546.txt, cassandra-2.1-8546.txt, 
> rangetombstonelist_compaction.png, rangetombstonelist_mutation.png, 
> rangetombstonelist_read.png, tombstone_test.tgz
>
>
> I would like to propose a change of the data structure used in the 
> RangeTombstoneList to store and insert tombstone ranges to something with at 
> least O(log N) insert in the middle and at near O(1) and start AND end. Here 
> is why:
> When having tombstone heavy work-loads the current implementation of 
> RangeTombstoneList becomes a bottleneck with slice queries.
> Scanning the number of tombstones up to the default maximum (100k) can take 
> up to 3 minutes of how addInternal() scales on insertion of middle and start 
> elements.
> The attached test shows that with 50k deletes from both sides of a range.
> INSERT 1...110000
> flush()
> DELETE 1...50000
> DELETE 110000...60000
> While one direction performs ok (~400ms on my notebook):
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp DESC LIMIT 1
> {code}
> The other direction underperforms (~7seconds on my notebook)
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp ASC LIMIT 1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to