[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956701#comment-13956701
 ] 

Benedict commented on CASSANDRA-6477:
-------------------------------------

New suggestion:

Since we're performing read-before-write anyway with this suggestion, why not 
simply perform a _local only_ read-before-write on each of the nodes that owns 
the main record whilst writing the update? Instead of issuing a complex 
tombstone, we simply issue a delete for whichever value is older on reconcile.  
Since we always CAS local updates, we will never miss a delete. We will, 
however, issue redundant/duplicate deletes (RF many), but they should almost 
always be coalesced in the memtable, so the cost is network only. There are 
probably tricks we can do to mitigate this cost, though, e.g. having each node 
(deterministically) pick two of the possible owners of the 2i entry to send the 
deletes it encounters to. That minimises replication of effort while still 
ensuring message delivery to all nodes.
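
To make the idea concrete, here is a minimal single-threaded sketch in Python. 
It is not Cassandra code: Cell, IndexDelete, Replica and pick_delete_targets 
are hypothetical names, the timestamp tie-break and the "replica i sends to 
index owners i and i+1" scheme are assumptions made for illustration, and the 
CAS on the local update is only implied by the absence of concurrency.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cell:
    value: str       # the indexed column value
    timestamp: int   # write timestamp used for reconciliation


@dataclass(frozen=True)
class IndexDelete:
    indexed_value: str  # the stale value whose 2i entry should be removed
    primary_key: str
    timestamp: int      # assumption: the delete carries the loser's timestamp


class Replica:
    """One owner of the main record (hypothetical model, no concurrency)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[str, Cell] = {}

    def apply_update(self, key: str, incoming: Cell) -> Optional[IndexDelete]:
        # Local-only read-before-write: look at this node's copy, nothing else.
        existing = self.rows.get(key)
        if existing is None:
            self.rows[key] = incoming
            return None
        # Reconcile: newest timestamp wins; ties broken by value (assumption).
        if (incoming.timestamp, incoming.value) > (existing.timestamp, existing.value):
            winner, loser = incoming, existing
        else:
            winner, loser = existing, incoming
        self.rows[key] = winner
        if loser.value == winner.value:
            return None  # same indexed value, so the index entry is still valid
        # Issue a delete for whichever value is older on reconcile.
        return IndexDelete(loser.value, key, loser.timestamp)


def pick_delete_targets(replica: str, base_owners: List[str],
                        index_owners: List[str], fanout: int = 2) -> List[str]:
    """Deterministically pick `fanout` index owners for this replica's deletes."""
    i = base_owners.index(replica)
    n = len(index_owners)
    return [index_owners[(i + k) % n] for k in range(min(fanout, n))]
```

With three owners on each side, every replica sends its deletes to two index 
owners and every index owner is covered by two replicas, rather than each 
replica messaging all three. Note that an out-of-order (older) write also 
yields a delete, in that case for the incoming value rather than the stored 
one.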

The result is that we keep compaction logic exactly the same, and we retain 
approximately the same consistency guarantees we currently have.

> Partitioned indexes
> -------------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>             Fix For: 3.0
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



