[ 
https://issues.apache.org/jira/browse/CASSANDRA-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265568#comment-14265568
 ] 

Dominic Letz commented on CASSANDRA-8546:
-----------------------------------------

This makes all sense. From my reading it also appeared that the way the 
RangeTombstoneList is used in reads vs. in compaction and other places is 
actually so different that it could use  different use case optimized 
data-structures in different places rather than re-using the 
RangeTombstoneList. CASSANDRA-8099 might be an opportunity to change that.

Until then though the GapList change could be a minimal impact fix. It does not 
change the structure of the code and does not regress the read path while 
fixing the reverse read issue and giving linear speedups:
- Read in forward stays as is.
- Read in reverse moves from O(n) to O(1)
- grow() becomes 4x faster since there is only one array to grow
- insertFrom becomes at least 8x faster (only one array to shift instead of 
four , and in worst case only every second insert) 

Obviously I'm scratching here my own itch to improve reverse reads and 
compaction with tombstone heavy workloads - is there any way I can help make 
that happen for 2.1 ?

> RangeTombstoneList becoming bottleneck on tombstone heavy tasks
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-8546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8546
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: 2.0.11 / 2.1
>            Reporter: Dominic Letz
>            Assignee: Benedict
>             Fix For: 2.1.3
>
>         Attachments: cassandra-2.0.11-8546.txt, cassandra-2.1-8546.txt, 
> tombstone_test.tgz
>
>
> I would like to propose a change of the data structure used in the 
> RangeTombstoneList to store and insert tombstone ranges to something with at 
> least O(log N) insert in the middle and at near O(1) and start AND end. Here 
> is why:
> When having tombstone heavy work-loads the current implementation of 
> RangeTombstoneList becomes a bottleneck with slice queries.
> Scanning the number of tombstones up to the default maximum (100k) can take 
> up to 3 minutes of how addInternal() scales on insertion of middle and start 
> elements.
> The attached test shows that with 50k deletes from both sides of a range.
> INSERT 1...110000
> flush()
> DELETE 1...50000
> DELETE 110000...60000
> While one direction performs ok (~400ms on my notebook):
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp DESC LIMIT 1
> {code}
> The other direction underperforms (~7seconds on my notebook)
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp ASC LIMIT 1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to