On 19 Nov 2014, at 00:43, Robert Coli <rc...@eventbrite.com> wrote:
>
> @OP : can you repro if you run a major compaction between the deletion and
> the tombstone collection?
This happened in production and, AFAIK, for the first time in a system that has been running for 2 years. We upgraded the Cassandra version last month, so there is that difference, but the upgrade happened before the original delete of this column. I have found more examples of zombie columns like this (approx. 30k columns out of 1.2M total), and they are all in this same row of this CF. I should point out that we have a sister CF where we do similar inserts/deletes, but it uses STCS, and it doesn’t exhibit this problem. I don’t think I can reproduce this easily in a test environment.

> Basically, I am conjecturing that a compaction bug or one of the handful of
> "unmask previously deleted data" bugs are resulting in the unmasking of a
> non-tombstone row which is sitting in a SStable.
>
> OP could also support this conjecture by running sstablekeys on other
> SSTables on "3rd replica" and determining what masked values there are for
> the row prior to deletion. If the data is sitting in an old SStable, this is
> suggestive.

There are 3 sstables that have this row on the 3rd replica:

Disco-NamespaceFile2-ic-5337-Data.db.json - has the column tombstone
Disco-NamespaceFile2-ic-5719-Data.db.json - has no value for this column
Disco-NamespaceFile2-ic-5748-Data.db.json - has the original value

> One last question for OP would be whether the nodes were restarted during the
> time period this bug was observed. An assortment of the "unmask previously
> deleted data" bugs come from "dead" sstables in the data directory being
> marked "live" on a restart.

All the nodes were restarted on 21-23 October, for the upgrade (1.2.16 -> 1.2.19) I mentioned. The delete happened after. I should also point out that we were experiencing problems related to CASSANDRA-4206 and CASSANDRA-7808:

ERROR 15:01:51,885 Exception in thread Thread[CompactionExecutor:15172,1,main]
java.lang.AssertionError: originally calculated column size of 78041151 but now it is 78041303
        at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
        at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:208)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

We saw that line multiple times in the logs, and we could tell it was always the same row because the sizes 78041151 and 78041303 were always the same, even though the data seemed fine. Could that row be the one experiencing problems now? Maybe with the upgrade the new Cassandra correctly compacted this row and all hell broke loose? If so, is there an easy way to fix this? Shouldn’t repair also propagate this zombie column to the other nodes?

Thank you and best regards,
André Cruz
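PS: For reference, a minimal sketch of how one could scan the sstable2json dumps listed above for the column in question and print every occurrence (live value or tombstone) with its timestamp. The column name is a placeholder, and the dump-layout handling is an assumption: the exact JSON shape differs between Cassandra versions (a dict keyed by row key in older versions vs. a list of {"key": ..., "columns": ...} objects in 1.2), so the script accepts both.

#!/usr/bin/env python
# Hypothetical helper, not part of the original thread: scan the
# sstable2json dumps for a given column name and print each matching
# column entry so the tombstone/value timestamps can be compared.
import json

COLUMN_NAME = "zombie-column-name"  # placeholder for the column in question
DUMPS = [
    "Disco-NamespaceFile2-ic-5337-Data.db.json",
    "Disco-NamespaceFile2-ic-5719-Data.db.json",
    "Disco-NamespaceFile2-ic-5748-Data.db.json",
]

def rows(dump):
    # Yield (row key, column list) pairs from either dump layout.
    if isinstance(dump, list):
        for row in dump:
            yield row.get("key"), row.get("columns", [])
    elif isinstance(dump, dict):
        for key, columns in dump.items():
            yield key, columns

for path in DUMPS:
    with open(path) as f:
        dump = json.load(f)
    for key, columns in rows(dump):
        for col in columns:
            # Each column is [name, value, timestamp, ...]; a trailing
            # "d" flag marks a deleted column (tombstone).
            if col and col[0] == COLUMN_NAME:
                print("%s  row=%s  %s" % (path, key, col))

If the tombstone in ic-5337 carries a higher timestamp than the live value in ic-5748, compaction should mask the value, so seeing the old value win again would point at one of the "unmask previously deleted data" bugs discussed above.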