On 19 Nov 2014, at 00:43, Robert Coli <rc...@eventbrite.com> wrote:
>
> @OP : can you repro if you run a major compaction between the deletion and
> the tombstone collection?
This happened in production and, AFAIK, for the first time in a system that has been running for 2 years. We upgraded the Cassandra version last month, so there is that difference, but the upgrade happened before the original delete of this column. I have found more examples of zombie columns like this (approx. 30k columns out of 1.2M total), and they are all in this same row of this CF. I should point out that we have a sister CF where we do similar inserts/deletes, but it uses STCS, and it doesn’t exhibit this problem. I don’t think I can reproduce this easily in a test environment.

> Basically, I am conjecturing that a compaction bug or one of the handful of
> "unmask previously deleted data" bugs are resulting in the unmasking of a
> non-tombstone row which is sitting in a SStable.
>
> OP could also support this conjecture by running sstablekeys on other
> SSTables on "3rd replica" and determining what masked values there are for
> the row prior to deletion. If the data is sitting in an old SStable, this is
> suggestive.

There are 3 sstables that have this row on the 3rd replica:

Disco-NamespaceFile2-ic-5337-Data.db.json - has the column tombstone
Disco-NamespaceFile2-ic-5719-Data.db.json - has no value for this column
Disco-NamespaceFile2-ic-5748-Data.db.json - has the original value

> One last question for OP would be whether the nodes were restarted during the
> time period this bug was observed. An assortment of the "unmask previously
> deleted data" bugs come from "dead" sstables in the data directory being
> marked "live" on a restart.

All the nodes were restarted on 21-23 October, for the upgrade (1.2.16 -> 1.2.19) I mentioned. The delete happened after. I should also point out that we were experiencing problems related to CASSANDRA-4206 and CASSANDRA-7808:

ERROR 15:01:51,885 Exception in thread Thread[CompactionExecutor:15172,1,main]
java.lang.AssertionError: originally calculated column size of 78041151 but now it is 78041303
        at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
        at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

We saw that line multiple times in the logs, and we could tell it was always the same row because the sizes 78041151 and 78041303 were always the same, even though the data seemed fine. Could that row be the one experiencing problems now? Maybe with the upgrade the new Cassandra correctly compacted this row and all hell broke loose? If so, is there an easy way to fix this? Shouldn’t repair also propagate this zombie column to the other nodes?

Thank you and best regards,
André Cruz
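PS: For reference, a minimal sketch of how one could scan the sstable2json dumps listed above for the column in question and print every occurrence (live value or tombstone) with its timestamp. The column name is a placeholder, and the dump-layout handling is an assumption: the exact JSON shape differs between Cassandra versions (a dict keyed by row key in older versions vs. a list of {"key": ..., "columns": ...} objects in 1.2), so the script accepts both.

#!/usr/bin/env python
# Hypothetical helper, not part of the original thread: scan the
# sstable2json dumps for a given column name and print each matching
# column entry so the tombstone/value timestamps can be compared.
import json

COLUMN_NAME = "zombie-column-name"  # placeholder for the column in question
DUMPS = [
    "Disco-NamespaceFile2-ic-5337-Data.db.json",
    "Disco-NamespaceFile2-ic-5719-Data.db.json",
    "Disco-NamespaceFile2-ic-5748-Data.db.json",
]

def rows(dump):
    # Yield (row key, column list) pairs from either dump layout.
    if isinstance(dump, list):
        for row in dump:
            yield row.get("key"), row.get("columns", [])
    elif isinstance(dump, dict):
        for key, columns in dump.items():
            yield key, columns

for path in DUMPS:
    with open(path) as f:
        dump = json.load(f)
    for key, columns in rows(dump):
        for col in columns:
            # Each column is [name, value, timestamp, ...]; a trailing
            # "d" flag marks a deleted column (tombstone).
            if col and col[0] == COLUMN_NAME:
                print("%s  row=%s  %s" % (path, key, col))

If the tombstone in ic-5337 carries a higher timestamp than the live value in ic-5748, compaction should mask the value, so seeing the old value win again would point at one of the "unmask previously deleted data" bugs discussed above.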