Andres de la Peña created CASSANDRA-16868:
---------------------------------------------

             Summary: Secondary indexes on primary key columns can miss some writes
                 Key: CASSANDRA-16868
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16868
             Project: Cassandra
          Issue Type: Bug
          Components: Feature/2i Index
            Reporter: Andres de la Peña
            Assignee: Andres de la Peña


Secondary indexes on primary key columns can miss some writes. For example, an 
update after a deletion won't create an index entry:
{code:java}
CREATE TABLE t (pk int, ck int, v int, PRIMARY KEY (pk, ck));
CREATE INDEX ON t(ck);
INSERT INTO t(pk, ck, v) VALUES (1, 2, 3); -- creates an index entry (right)
DELETE FROM t WHERE pk = 1 AND ck = 2; -- deletes the previous index entry (right)
UPDATE t SET v = 3 WHERE pk = 1 AND ck = 2; -- doesn't create a new index entry (wrong)
SELECT * FROM t WHERE ck = 2; -- doesn't find the row (wrong)
{code}
This happens because the update uses the {{LivenessInfo}} of the previously 
deleted row (see 
[here|https://github.com/apache/cassandra/blob/cassandra-3.0.25/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L439]).
The same happens when updating an expired row:
{code:java}
CREATE TABLE t (pk int, ck int, v int, PRIMARY KEY (pk, ck));
CREATE INDEX ON t(ck);
UPDATE t USING TTL 1 SET v = 3 WHERE pk = 1 AND ck = 2; -- creates a non-expiring index entry (right)
-- wait for the expiration of the above row
SELECT * FROM t WHERE ck = 2; -- deletes the index entry (right)
UPDATE t SET v = 3 WHERE pk = 1 AND ck = 2; -- doesn't create an index entry (wrong)
SELECT * FROM t WHERE ck = 2; -- doesn't find the row (wrong)
{code}
I think the fix for this is simply to use {{getPrimaryKeyIndexLiveness}} in 
{{updateRow}}, as {{insertRow}} already does.
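
To illustrate the proposed fix, here is a minimal toy model of the first scenario above. It is not Cassandra's code: {{Row}}, {{Index}}, {{rowLevelLiveness}} and {{cellAwareLiveness}} are hypothetical names, and the model assumes that an index entry is only written when the liveness passed to the index carries a timestamp, and that an UPDATE carries cell timestamps but no row-level liveness of its own.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: Row and Index are hypothetical stand-ins, not Cassandra classes.
class PkIndexUpdateSketch {
    static final long NO_TIMESTAMP = Long.MIN_VALUE;

    /** A written row fragment: optional row-level liveness plus one cell timestamp. */
    static final class Row {
        final int ck;
        final long rowLivenessTs; // NO_TIMESTAMP when the write has no row-level liveness
        final long cellTs;        // NO_TIMESTAMP when the write has no live cell

        Row(int ck, long rowLivenessTs, long cellTs) {
            this.ck = ck;
            this.rowLivenessTs = rowLivenessTs;
            this.cellTs = cellTs;
        }
    }

    /** Liveness taken from the row-level info only (models the problematic update path). */
    static long rowLevelLiveness(Row row) {
        return row.rowLivenessTs;
    }

    /** Liveness derived from row-level info and cells, newest timestamp wins. */
    static long cellAwareLiveness(Row row) {
        return Math.max(row.rowLivenessTs, row.cellTs);
    }

    static final class Index {
        private final Set<Integer> entries = new HashSet<>();
        private final boolean fixedUpdatePath;

        Index(boolean fixedUpdatePath) {
            this.fixedUpdatePath = fixedUpdatePath;
        }

        void insertRow(Row row) {
            write(row.ck, cellAwareLiveness(row));
        }

        void updateRow(Row newRow) {
            long liveness = fixedUpdatePath ? cellAwareLiveness(newRow) : rowLevelLiveness(newRow);
            write(newRow.ck, liveness);
        }

        void deleteRow(int ck) {
            entries.remove(ck);
        }

        boolean contains(int ck) {
            return entries.contains(ck);
        }

        // Model assumption: no timestamp means no index entry is written.
        private void write(int ck, long livenessTs) {
            if (livenessTs != NO_TIMESTAMP) {
                entries.add(ck);
            }
        }
    }

    /** Replays INSERT, DELETE, UPDATE and reports whether the index finds ck = 2. */
    static boolean replay(boolean fixedUpdatePath) {
        Index index = new Index(fixedUpdatePath);
        index.insertRow(new Row(2, 10L, 10L));          // INSERT: row liveness and cell
        index.deleteRow(2);                             // DELETE: entry removed
        index.updateRow(new Row(2, NO_TIMESTAMP, 20L)); // UPDATE: cell only
        return index.contains(2);
    }

    public static void main(String[] args) {
        System.out.println("current update path finds row: " + replay(false));
        System.out.println("cell-aware update path finds row: " + replay(true));
    }
}
```

In this model only the update path differs between the two runs. Deriving the liveness from the cells, as the insert path does, is enough for the entry to be written again.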

Another related problem is that {{getPrimaryKeyIndexLiveness}} uses [the most 
recent TTL in the columns contained in the indexed row 
fragment|https://github.com/apache/cassandra/blob/cassandra-3.0.25/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L519]
as the TTL of the index entry. This produces an expiring index entry that 
ignores the columns without TTL that are already present in flushed sstables. 
So we can hit this other error when setting a TTL over flushed indexed data:
{code:java}
CREATE TABLE t(k1 int, k2 int, v int, PRIMARY KEY ((k1, k2)));
CREATE INDEX idx ON t(k1);
INSERT INTO t (k1, k2, v) VALUES (1, 2, 3);
-- flush
UPDATE t USING TTL 1 SET v=0 WHERE k1=1 AND k2=2; -- creates an index entry with TTL (wrong)
-- wait for TTL expiration
SELECT TTL(v) FROM t WHERE k1=1; -- doesn't find the row (wrong)
{code}
The straightforward fix is to ignore the TTL of the columns for indexes on 
primary key components, so that we don't produce expiring index entries in that 
case. The index entries will eventually be deleted during index reads, once we 
are sure that they no longer point to any live data.
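
As a rough sketch of why ignoring the TTL helps, the toy model below replays the flushed-data scenario above. It is not Cassandra's code: {{Entry}}, {{buildEntry}} and {{indexFindsRow}} are hypothetical names, time is plain seconds, and the model assumes that the index entry written by the UPDATE shadows the one written by the earlier, flushed INSERT.

```java
// Toy model only: Entry and the helper names are hypothetical, not Cassandra classes.
class PkIndexTtlSketch {
    static final int NO_TTL = 0;

    /** An index entry with its write time and TTL in seconds (NO_TTL never expires). */
    static final class Entry {
        final int writtenAtSec;
        final int ttlSec;

        Entry(int writtenAtSec, int ttlSec) {
            this.writtenAtSec = writtenAtSec;
            this.ttlSec = ttlSec;
        }

        boolean isLive(int nowSec) {
            return ttlSec == NO_TTL || nowSec < writtenAtSec + ttlSec;
        }
    }

    /**
     * Builds the index entry for a write. With ignoreCellTtl = false the entry
     * inherits the TTL of the newest cell in the written fragment; with
     * ignoreCellTtl = true the entry never expires on its own.
     */
    static Entry buildEntry(int nowSec, int newestCellTtlSec, boolean ignoreCellTtl) {
        return new Entry(nowSec, ignoreCellTtl ? NO_TTL : newestCellTtlSec);
    }

    /** Replays INSERT, flush, UPDATE USING TTL 1, wait, SELECT by index. */
    static boolean indexFindsRow(boolean ignoreCellTtl) {
        // INSERT at t=0 without TTL, then flush.
        Entry entry = buildEntry(0, NO_TTL, ignoreCellTtl);
        // UPDATE USING TTL 1 at t=10: the write only sees its own fragment, and
        // the newer entry shadows the flushed one (model assumption).
        entry = buildEntry(10, 1, ignoreCellTtl);
        // SELECT at t=12, after the TTL has elapsed. The base row itself is
        // still live because the original INSERT had no TTL.
        return entry.isLive(12);
    }

    public static void main(String[] args) {
        System.out.println("entry inherits cell TTL, row found: " + indexFindsRow(false));
        System.out.println("entry ignores cell TTL, row found: " + indexFindsRow(true));
    }
}
```

The non-expiring entry can outlive the data it points to, which is why the cleanup during index reads mentioned above is still needed.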
  


