[ 
https://issues.apache.org/jira/browse/CASSANDRA-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15561315#comment-15561315
 ] 

Cameron Zemek edited comment on CASSANDRA-12765 at 10/10/16 5:22 AM:
---------------------------------------------------------------------

Traced the issue to

{code:title=CollationController.java|borderStyle=solid}
    private ColumnFamily collectAllData(boolean copyOnHeap)
    {
        // omitted for brevity
                if (!filter.shouldInclude(sstable))
                {
                    nonIntersectingSSTables++;
                    // sstable contains no tombstone if maxLocalDeletionTime == Integer.MAX_VALUE, so we can safely skip those entirely
                    if (sstable.getSSTableMetadata().maxLocalDeletionTime != Integer.MAX_VALUE)
                    {
                        if (skippedSSTables == null)
                            skippedSSTables = new ArrayList<>();
                        skippedSSTables.add(sstable);
                    }
                    continue;
                }
{code}

The sstable is excluded by the filter because:

{code:title=SliceQueryFilter.java|borderStyle=solid}
    public boolean shouldInclude(SSTableReader sstable)
    {
        List<ByteBuffer> minColumnNames = sstable.getSSTableMetadata().minColumnNames;
        List<ByteBuffer> maxColumnNames = sstable.getSSTableMetadata().maxColumnNames;
        CellNameType comparator = sstable.metadata.comparator;

        if (minColumnNames.isEmpty() || maxColumnNames.isEmpty())
            return true;

        for (ColumnSlice slice : slices)
            if (slice.intersects(minColumnNames, maxColumnNames, comparator, reversed))
                return true;

        return false;
    }
{code}

The other partition (e.g. 8772618c9009cf8f5a5e0c19) means minColumnNames and 
maxColumnNames are not empty, and because its clustering key is different (e.g. 
test2) the slice doesn't intersect either. So execution enters the 
if (!filter.shouldInclude(sstable)) branch.
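
To make the non-intersection concrete, here is a minimal sketch of the same 
check on a simplified model. It is not Cassandra's ColumnSlice: the class and 
method names are hypothetical, and clustering names are plain Strings compared 
lexicographically rather than composite ByteBuffers.

```java
import java.util.Collections;
import java.util.List;

// Simplified, hypothetical model of the min/max column name check.
// Names are plain Strings; real Cassandra compares composite ByteBuffers.
final class SliceIntersectionSketch
{
    // True if the queried range [start, finish] overlaps the sstable's [min, max] range.
    static boolean intersects(String start, String finish, String min, String max)
    {
        return start.compareTo(max) <= 0 && finish.compareTo(min) >= 0;
    }

    // Same shape as shouldInclude: empty metadata means the sstable cannot be ruled out.
    static boolean shouldInclude(String start, String finish, List<String> minNames, List<String> maxNames)
    {
        if (minNames.isEmpty() || maxNames.isEmpty())
            return true;
        return intersects(start, finish, minNames.get(0), maxNames.get(0));
    }

    public static void main(String[] args)
    {
        // Second sstable from the reproduction: its only cell has name = 'test2'.
        List<String> min = Collections.singletonList("test2");
        List<String> max = Collections.singletonList("test2");
        // Query from the reproduction: name = 'test'.
        System.out.println(shouldInclude("test", "test", min, max)); // prints false
    }
}
```

In this model the query for 'test' falls entirely below the sstable's only 
name, 'test2', so the sstable is reported as non-intersecting.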

The comment claiming that maxLocalDeletionTime == Integer.MAX_VALUE means the 
sstable contains no tombstones is wrong. As the steps to reproduce show, the 
sstable that contains both the row level deletion and another partition has 
maxLocalDeletionTime == Integer.MAX_VALUE in its metadata because of the live 
cell.

{code:title=ColumnFamily.java|borderStyle=solid}
    public ColumnStats getColumnStats()
    {
        // omitted for brevity
        for (Cell cell : this)
        {
            minTimestampTracker.update(cell.timestamp());
            maxTimestampTracker.update(cell.timestamp());
            maxDeletionTimeTracker.update(cell.getLocalDeletionTime());
{code}
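
The effect of taking a maximum over every cell can be sketched in isolation. 
This is a simplified model, not the real ColumnStats code: the class name and 
the tombstone timestamp are made up for illustration. The only fact it relies 
on is that a live cell reports Integer.MAX_VALUE as its local deletion time.

```java
// Simplified, hypothetical model of how a "max local deletion time" statistic is aggregated.
final class MaxDeletionTimeSketch
{
    // A live cell reports Integer.MAX_VALUE as its local deletion time.
    static final int LIVE = Integer.MAX_VALUE;

    // Maximum over all local deletion times seen while writing one sstable.
    static int maxLocalDeletionTime(int... localDeletionTimes)
    {
        int max = Integer.MIN_VALUE;
        for (int t : localDeletionTimes)
            max = Math.max(max, t);
        return max;
    }

    public static void main(String[] args)
    {
        int rowTombstone = 1476076920; // arbitrary deletion time in seconds, for illustration only
        int liveCell = LIVE;           // the live cell in the other partition
        int max = maxLocalDeletionTime(rowTombstone, liveCell);
        System.out.println(max == Integer.MAX_VALUE); // prints true
    }
}
```

A single live cell anywhere in the sstable is enough to push the maximum to 
Integer.MAX_VALUE, which hides the tombstone from any check that only looks at 
this statistic.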

With the patch, the sstable is added to skippedSSTables and is therefore 
included because of its tombstone.
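
The difference in behaviour can be sketched as follows. The attached 
12765.patch is not reproduced in this message, so this assumes the patch simply 
stops trusting the maxLocalDeletionTime check; the class, the Table type and 
the trustMaxDeletionTime flag are hypothetical names used only for 
illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the skip decision; Table stands in for an sstable's metadata.
final class SkipDecisionSketch
{
    static final class Table
    {
        final String name;
        final int maxLocalDeletionTime;

        Table(String name, int maxLocalDeletionTime)
        {
            this.name = name;
            this.maxLocalDeletionTime = maxLocalDeletionTime;
        }
    }

    // Returns the non-intersecting sstables that are remembered for a later tombstone check.
    // trustMaxDeletionTime = true models the current code; false models the ASSUMED patched behaviour.
    static List<String> rememberedForTombstoneCheck(List<Table> nonIntersecting, boolean trustMaxDeletionTime)
    {
        List<String> skipped = new ArrayList<>();
        for (Table t : nonIntersecting)
        {
            if (!trustMaxDeletionTime || t.maxLocalDeletionTime != Integer.MAX_VALUE)
                skipped.add(t.name);
        }
        return skipped;
    }

    public static void main(String[] args)
    {
        List<Table> tables = new ArrayList<>();
        // The second sstable from the reproduction: tombstone plus a live cell.
        tables.add(new Table("sstable-2", Integer.MAX_VALUE));
        System.out.println(rememberedForTombstoneCheck(tables, true));  // prints []
        System.out.println(rememberedForTombstoneCheck(tables, false)); // prints [sstable-2]
    }
}
```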

As far as I can tell this issue dates back to 
https://issues.apache.org/jira/browse/CASSANDRA-5514, but I haven't attempted 
to reproduce it in any version earlier than 2.1.15. It has been an issue on a 
cluster I am managing, which started on 2.1.13, so I have currently tagged this 
bug as "since 2.0 beta 1", since that corresponds to #5514.


> SSTable ignored incorrectly with row level tombstone
> ----------------------------------------------------
>
>                 Key: CASSANDRA-12765
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12765
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Cameron Zemek
>         Attachments: 12765.patch
>
>
> {noformat}
> CREATE TABLE test.payload(
>   bucket_id TEXT,
>   name TEXT,
>   data TEXT,
>   PRIMARY KEY (bucket_id, name)
> );
> insert into test.payload (bucket_id, name, data) values 
> ('8772618c9009cf8f5a5e0c18', 'test', 'hello');
> {noformat}
> Flush nodes (nodetool flush)
> {noformat}
> insert into test.payload (bucket_id, name, data) values 
> ('8772618c9009cf8f5a5e0c19', 'test2', 'hello');
> delete from test.payload where bucket_id = '8772618c9009cf8f5a5e0c18';
> {noformat}
> Flush nodes (nodetool flush)
> {noformat}
> select * from test.payload where bucket_id = '8772618c9009cf8f5a5e0c18' and 
> name = 'test';
> {noformat}
> Expected 0 rows but got 1 row back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
