[ 
https://issues.apache.org/jira/browse/CASSANDRA-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-16868:
---------------------------------------
    Reviewers: Benjamin Lerer, Benjamin Lerer  (was: Benjamin Lerer)
               Benjamin Lerer, Benjamin Lerer
       Status: Review In Progress  (was: Patch Available)

> Secondary indexes on primary key columns can miss some writes
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-16868
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16868
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/2i Index
>            Reporter: Andres de la Peña
>            Assignee: Andres de la Peña
>            Priority: Normal
>             Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x
>
>
> Secondary indexes on primary key columns can miss some writes. For example, 
> an update after a deletion won't create an index entry:
> {code:java}
> CREATE TABLE t (pk int, ck int, v int, PRIMARY KEY (pk, ck));
> CREATE INDEX ON t(ck);
> INSERT INTO t(pk, ck, v) VALUES (1, 2, 3); -- creates an index entry (right)
> DELETE FROM t WHERE pk = 1 AND ck = 2; -- deletes the previous index entry (right)
> UPDATE t SET v = 3 WHERE pk = 1 AND ck = 2; -- doesn't create a new index entry (wrong)
> SELECT * FROM t WHERE ck = 2; -- doesn't find the row (wrong)
> {code}
> This happens because the update uses the {{LivenessInfo}} of the previously 
> deleted row (see 
> [here|https://github.com/apache/cassandra/blob/cassandra-3.0.25/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L439]).
>  The same happens when updating an expired row:
> {code:java}
> CREATE TABLE t (pk int, ck int, v int, PRIMARY KEY (pk, ck));
> CREATE INDEX ON t(ck);
> UPDATE t USING TTL 1 SET v = 3 WHERE pk = 1 AND ck = 2; -- creates a non-expiring index entry (right)
> -- wait for the expiration of the above row
> SELECT * FROM t WHERE ck = 2; -- deletes the index entry (right)
> UPDATE t SET v = 3 WHERE pk = 1 AND ck = 2; -- doesn't create an index entry (wrong)
> SELECT * FROM t WHERE ck = 2; -- doesn't find the row (wrong)
> {code}
> I think the fix for this is simply to use {{getPrimaryKeyIndexLiveness}} in 
> {{updateRow}}, as {{insertRow}} already does.
> Another related problem is that {{getPrimaryKeyIndexLiveness}} uses [the most 
> recent TTL in the columns contained in the indexed row 
> fragment|https://github.com/apache/cassandra/blob/cassandra-3.0.25/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L519]
>  as the TTL of the index entry, producing an expiring index entry that 
> ignores the columns without TTL that are already present in flushed sstables. 
> This causes another error when setting a TTL over flushed indexed data:
> {code:java}
> CREATE TABLE t(k1 int, k2 int, v int, PRIMARY KEY ((k1, k2)));
> CREATE INDEX idx ON t(k1);
> INSERT INTO t (k1, k2, v) VALUES (1, 2, 3);
> -- flush
> UPDATE t USING TTL 1 SET v=0 WHERE k1=1 AND k2=2; -- creates an index entry with TTL (wrong)
> -- wait for TTL expiration
> SELECT TTL(v) FROM t WHERE k1=1; -- doesn't find the row (wrong)
> {code}
> The straightforward fix is to ignore the TTL of the columns for indexes 
> on primary key components, so we don't produce expiring index entries in that 
> case. The index entries will eventually be deleted during index reads, when 
> we are sure that they are not pointing to any live data.
>   
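
The TTL problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Cassandra's implementation: the class IndexLivenessSketch, its nested Liveness type and all method names are hypothetical, and timestamps are expressed in seconds for simplicity. It contrasts an index entry that inherits the TTL of the newest written column with one that ignores column TTLs, for the case where an older, non-expiring write is already flushed and therefore not visible to the write path.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Toy model only; none of these types are Cassandra's real classes. */
final class IndexLivenessSketch {
    /** Marker meaning "never expires" in this sketch. */
    static final int NO_TTL = 0;

    /** A write timestamp (seconds, by assumption) plus a TTL in seconds. */
    static final class Liveness {
        final long timestamp;
        final int ttl;

        Liveness(long timestamp, int ttl) {
            this.timestamp = timestamp;
            this.ttl = ttl;
        }
    }

    /** Returns the most recently written column of the row fragment being indexed. */
    static Liveness newest(List<Liveness> writtenColumns) {
        return Collections.max(writtenColumns,
                               Comparator.comparingLong((Liveness l) -> l.timestamp));
    }

    /** Problematic behaviour: the index entry inherits the TTL of the newest column. */
    static Liveness entryWithColumnTtl(List<Liveness> writtenColumns) {
        Liveness n = newest(writtenColumns);
        return new Liveness(n.timestamp, n.ttl);
    }

    /** Proposed behaviour: keep the newest timestamp but never expire the entry. */
    static Liveness entryWithoutTtl(List<Liveness> writtenColumns) {
        return new Liveness(newest(writtenColumns).timestamp, NO_TTL);
    }

    /** An entry is live if it has no TTL or its TTL has not elapsed yet. */
    static boolean isLiveAt(Liveness entry, long nowSeconds) {
        return entry.ttl == NO_TTL || nowSeconds < entry.timestamp + entry.ttl;
    }

    public static void main(String[] args) {
        // The earlier non-expiring INSERT is flushed, so the write path only
        // sees the columns of the later UPDATE ... USING TTL 1 (written at t=200).
        List<Liveness> update = Arrays.asList(new Liveness(200L, 1));

        System.out.println(isLiveAt(entryWithColumnTtl(update), 202L)); // false
        System.out.println(isLiveAt(entryWithoutTtl(update), 202L));    // true
    }
}
```

In this model the expiring entry is gone at t=202 even though the flushed row would still be live, whereas the non-expiring entry survives and is left to be cleaned up at read time, as the description proposes.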



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
