[ 
https://issues.apache.org/jira/browse/CASSANDRA-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6446:
----------------------------------------

    Attachment:     (was: 6446-write-path-v3.txt)

> Faster range tombstones on wide partitions
> ------------------------------------------
>
>                 Key: CASSANDRA-6446
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6446
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Oleg Anastasyev
>            Assignee: Oleg Anastasyev
>             Fix For: 2.1
>
>         Attachments: RangeTombstonesReadOptimization.diff, 
> RangeTombstonesWriteOptimization.diff
>
>
> With wide partitions (~1M CQL rows in a single partition), and after 
> deleting some of those rows, we found inefficiencies in the handling of 
> range tombstones on both the write and read paths.
> I attached 2 patches here, one for the write path 
> (RangeTombstonesWriteOptimization.diff) and another for the read path 
> (RangeTombstonesReadOptimization.diff).
> On the write path, when you delete CQL rows by primary key, each deletion 
> is represented by a range tombstone. When this tombstone is put into the 
> memtable, the original code takes all of the partition's columns from the 
> memtable and checks DeletionInfo.isDeleted for each one in a brute-force 
> loop to decide whether the column should stay in the memtable or was 
> deleted by the new tombstone. The more columns a partition has, the slower 
> deletions become, burning CPU on brute-force range tombstone checks.
> For partitions with more than 10000 columns, the 
> RangeTombstonesWriteOptimization.diff patch loops over the tombstones 
> instead and checks for the existence of columns covered by each of them. It 
> also copies the whole memtable range tombstone list only if there are 
> changes to be made to it (the original code copies the range tombstone list 
> on every write).
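
To illustrate the write-path idea described above, here is a minimal sketch in
Java. It is not taken from the attached patch: the class RangeTombstoneSketch,
its Tombstone type, the integer column names, and both method names are
hypothetical simplifications, and the "timestamp <= markedForDeleteAt" rule is
an assumed stand-in for the real DeletionInfo.isDeleted check. It contrasts
visiting every cell with visiting only the cells inside each tombstone's range.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical, simplified model; these are not Cassandra's actual classes.
final class RangeTombstoneSketch {
    /** Covers cells named in [start, end] written at or before markedForDeleteAt. */
    static final class Tombstone {
        final int start;
        final int end;
        final long markedForDeleteAt;

        Tombstone(int start, int end, long markedForDeleteAt) {
            this.start = start;
            this.end = end;
            this.markedForDeleteAt = markedForDeleteAt;
        }

        boolean covers(int name, long timestamp) {
            return start <= name && name <= end && timestamp <= markedForDeleteAt;
        }
    }

    // Brute force: visit every cell and test it against every tombstone.
    static int purgeByCells(NavigableMap<Integer, Long> cells, List<Tombstone> tombstones) {
        int removed = 0;
        Iterator<Map.Entry<Integer, Long>> it = cells.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Long> cell = it.next();
            for (Tombstone t : tombstones) {
                if (t.covers(cell.getKey(), cell.getValue())) {
                    it.remove();
                    removed++;
                    break;
                }
            }
        }
        return removed;
    }

    // Tombstone-driven: for each tombstone, visit only the cells inside its range.
    static int purgeByTombstones(NavigableMap<Integer, Long> cells, List<Tombstone> tombstones) {
        int removed = 0;
        for (Tombstone t : tombstones) {
            if (t.start > t.end) {
                continue; // empty range; subMap would reject inverted bounds
            }
            // subMap returns a live view, so removals affect the backing map.
            Iterator<Map.Entry<Integer, Long>> it =
                cells.subMap(t.start, true, t.end, true).entrySet().iterator();
            while (it.hasNext()) {
                if (it.next().getValue() <= t.markedForDeleteAt) {
                    it.remove();
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

With a sorted map, the second method costs roughly one lookup per tombstone
plus the number of cells it actually covers, rather than a pass over every cell
in the partition.
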
> On the read path, the original code scans the whole range tombstone list of 
> a partition to match sstable columns to their range tombstones. The 
> RangeTombstonesReadOptimization.diff patch scans only the necessary range 
> of tombstones, according to the filter used for the read.
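
The read-path idea above can be sketched in a similar way. This is only an
illustration under stated assumptions, not the code from the attached patch:
the class TombstoneSliceSketch and its method are hypothetical, column names
are plain integers, and the tombstone ranges are assumed to be sorted and
non-overlapping with inclusive bounds. Given a read slice, it binary-searches
for the first tombstone that can overlap the slice and stops at the first one
that starts past its end.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; these are not Cassandra's actual classes.
final class TombstoneSliceSketch {
    /**
     * Returns the indexes of tombstones overlapping the slice [qStart, qEnd].
     * Assumes starts/ends are sorted ascending and the ranges do not overlap.
     */
    static List<Integer> overlapping(int[] starts, int[] ends, int qStart, int qEnd) {
        // Binary search for the first range whose end is >= qStart.
        int lo = 0;
        int hi = starts.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (ends[mid] < qStart) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // Walk forward only while ranges still start inside the slice.
        List<Integer> hits = new ArrayList<>();
        for (int i = lo; i < starts.length && starts[i] <= qEnd; i++) {
            hits.add(i);
        }
        return hits;
    }
}
```

For example, with ranges [0,5], [10,15], [20,25], [30,35] and a slice of
[12,22], only the second and third ranges are visited.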



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
