[ 
https://issues.apache.org/jira/browse/CASSANDRA-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958821#comment-15958821
 ] 

Benjamin Lerer commented on CASSANDRA-13147:
--------------------------------------------

It took me a bit of time to understand why the patches for 2.1 and 2.2 were not 
breaking the following unit test:
{code}
    @Test
    public void testIndexOnRegularColumnWithPartitionWithoutRows() throws Throwable
    {
        createTable("CREATE TABLE %s (pk int, c int, s int static, v int, PRIMARY KEY(pk, c))");
        createIndex("CREATE INDEX ON %s (v)");
        execute("INSERT INTO %s (pk, c, s, v) VALUES (?, ?, ?, ?)", 1, 1, 9, 1);
        execute("INSERT INTO %s (pk, c, s, v) VALUES (?, ?, ?, ?)", 1, 2, 9, 2);
        execute("INSERT INTO %s (pk, s) VALUES (?, ?)", 2, 9);
        execute("INSERT INTO %s (pk, c, s, v) VALUES (?, ?, ?, ?)", 3, 1, 9, 1);
        flush();
        execute("DELETE FROM %s WHERE pk = ? and c = ?", 3, 1);
        assertRows(execute("SELECT * FROM %s WHERE v = ?", 1),
                   row(1, 1, 9, 1));
    }
{code}

Basically, I would have expected the empty partition to arrive at 
{{SelectStatement:processColumnFamily}}.
This does not happen because the empty partition is removed on the replica side 
[here|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/db/index/composites/CompositesSearcher.java#L284].
This behavior seems wrong, as I believe it can result in stale data being 
returned: if one of the nodes has not yet received the deletion, C* might end 
up returning data that has already been deleted.
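
To make the stale-read concern concrete, here is a toy model of the scenario. It is not Cassandra code: the class, the methods and the string row keys are all made up for illustration, and it assumes a simple "newest timestamp wins" merge on the coordinator. Replica A has seen the delete of row (3, 1), replica B has not. If A drops the deleted row before answering, the coordinator only sees B's live copy and returns it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: every name here is hypothetical, none of this is Cassandra API.
public class StaleIndexReadSketch
{
    /** One replica's view of a row: live data or a tombstone, with a write timestamp. */
    static final class Version
    {
        final long timestamp;
        final boolean deleted;

        Version(long timestamp, boolean deleted)
        {
            this.timestamp = timestamp;
            this.deleted = deleted;
        }
    }

    /** What a replica sends back; dropTombstones models filtering deleted rows on the replica. */
    static Map<String, Version> replicaResponse(Map<String, Version> local, boolean dropTombstones)
    {
        Map<String, Version> response = new TreeMap<>();
        for (Map.Entry<String, Version> e : local.entrySet())
            if (!(dropTombstones && e.getValue().deleted))
                response.put(e.getKey(), e.getValue());
        return response;
    }

    /** Coordinator merge: the newest version of each row wins, then tombstoned rows are hidden. */
    static List<String> reconcile(List<Map<String, Version>> responses)
    {
        Map<String, Version> merged = new TreeMap<>();
        for (Map<String, Version> response : responses)
            for (Map.Entry<String, Version> e : response.entrySet())
                merged.merge(e.getKey(), e.getValue(), (a, b) -> a.timestamp >= b.timestamp ? a : b);

        List<String> live = new ArrayList<>();
        for (Map.Entry<String, Version> e : merged.entrySet())
            if (!e.getValue().deleted)
                live.add(e.getKey());
        return live;
    }

    static List<String> query(boolean dropTombstones)
    {
        // Replica A has seen the insert (ts=1) and the delete (ts=2) of row pk=3,c=1.
        Map<String, Version> replicaA = new TreeMap<>();
        replicaA.put("pk=1,c=1", new Version(1, false));
        replicaA.put("pk=3,c=1", new Version(2, true));

        // Replica B has not yet received the delete.
        Map<String, Version> replicaB = new TreeMap<>();
        replicaB.put("pk=1,c=1", new Version(1, false));
        replicaB.put("pk=3,c=1", new Version(1, false));

        List<Map<String, Version>> responses = new ArrayList<>();
        responses.add(replicaResponse(replicaA, dropTombstones));
        responses.add(replicaResponse(replicaB, dropTombstones));
        return reconcile(responses);
    }

    public static void main(String[] args)
    {
        System.out.println("deleted rows dropped on replica:  " + query(true));
        System.out.println("tombstones sent to coordinator:   " + query(false));
    }
}
```

In this model the first query returns both rows, including the deleted (3, 1), while the second returns only (1, 1), because the tombstone from replica A shadows the older live copy from replica B.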

The correct solution to that problem is, in my opinion, to remove rows from the 
index only when they have been garbage collected from the indexed table. That 
should probably be done in a separate ticket.

Regarding the 2.1 and 2.2 patches, I think that the solution should be similar 
to the 3.0 one, meaning that {{usesSecondaryIndexing}} should be replaced by 
{{hasClusteringColumnsRestriction()}}.


> Secondary index query on partition key columns might not return all the rows.
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13147
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13147
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Benjamin Lerer
>            Assignee: Andrés de la Peña
>             Fix For: 2.1.x, 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> A secondary index query on a partition key column will, apparently, not 
> return the empty partitions with static data.
> The following unit test can be used to reproduce the problem.
> {code}
>     public void testIndexOnPartitionKeyWithStaticColumnAndNoRows() throws Throwable
>     {
>         createTable("CREATE TABLE %s (pk1 int, pk2 int, c int, s int static, v int, PRIMARY KEY((pk1, pk2), c))");
>         createIndex("CREATE INDEX ON %s (pk2)");
>         execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", 1, 1, 1, 9, 1);
>         execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", 1, 1, 2, 9, 2);
>         execute("INSERT INTO %s (pk1, pk2, s) VALUES (?, ?, ?)", 2, 1, 9);
>         execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", 3, 1, 1, 9, 1);
>         assertRows(execute("SELECT * FROM %s WHERE pk2 = ?", 1),
>                    row(1, 1, 1, 9, 1),
>                    row(1, 1, 2, 9, 2),
>                    row(2, 1, null, 9, null), // <-- is not returned
>                    row(3, 1, 1, 9, 1));
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
