The SSTables of one table were corrupted on all nodes, yet repairs were failing 
for all the tables in the keyspaces.

So I took the cluster down and ran an offline sstablescrub. That fixed the 
corrupt table, but some data was lost. I did not get a chance to try the 
consistency_all option; I will keep that in mind next time.
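
Since the scrub lost data, it may be worth copying the table's data directory aside before scrubbing next time. The following is only a minimal sketch of that idea, not the procedure I actually ran. The helper names backup_table_dir and scrub_command are made up for illustration, and it assumes the node is already stopped and that sstablescrub is on the PATH.

```python
import shutil
from pathlib import Path
from typing import List


def backup_table_dir(table_dir: Path, backup_root: Path) -> Path:
    """Copy a table's data directory aside before it is scrubbed.

    Hypothetical helper: run it only while the node is stopped, so the
    SSTable files are not changing underneath the copy.
    """
    dest = backup_root / table_dir.name
    # copytree creates missing parent directories and raises if dest exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(table_dir, dest)
    return dest


def scrub_command(keyspace: str, table: str) -> List[str]:
    """Build the offline scrub invocation: sstablescrub <keyspace> <table>."""
    return ["sstablescrub", keyspace, table]
```

With a copy in place, the list returned by scrub_command could be handed to subprocess.run(..., check=True) on each node, and the backup kept until the scrubbed table has been verified.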

Interestingly enough, all the repairs succeeded after this one table was fixed.

I am concerned about why this occurred. We can live with the loss of data on 
TEST, but what would we do if this were PROD?

Below is the full error message.

WARN  [SharedPool-Worker-1] 2020-05-18 10:22:29,152 
AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
Thread[SharedPool-Worker-1,5,main]: {}
org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
/opt/app/dir1/dir2/data/keypace/table1-f21d1180f5c211e58c9c31653d0c0f4e/mb-2334-big-Data.db
        at 
org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.hasNext(AbstractSSTableIterator.java:379)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.columniterator.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:241)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:94)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:26)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:279)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:112) 
~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:133)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:89)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:294)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:134)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:127)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) 
~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:292) 
~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:50)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-3.7.jar:3.7]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_202]
        at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
 [apache-cassandra-3.7.jar:3.7]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-3.7.jar:3.7]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_202]
Caused by: java.io.IOException: Corrupt flags value for clustering prefix 
(isStatic flag set): 128
        at 
org.apache.cassandra.db.ClusteringPrefix$Deserializer.prepare(ClusteringPrefix.java:453)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.UnfilteredDeserializer$CurrentDeserializer.prepareNext(UnfilteredDeserializer.java:172)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.UnfilteredDeserializer$CurrentDeserializer.hasNext(UnfilteredDeserializer.java:153)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.columniterator.SSTableIterator$ForwardReader.computeNext(SSTableIterator.java:124)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.columniterator.SSTableIterator$ForwardReader.hasNextInternal(SSTableIterator.java:151)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.hasNext(AbstractSSTableIterator.java:366)
 ~[apache-cassandra-3.7.jar:3.7]
        ... 29 common frames omitted
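
To find which tables are affected on each node, the log can be scanned for the "CorruptSSTableException: Corrupted: <path>" message shown above. This is a rough sketch only. The function name find_corrupt_sstables is made up, and it assumes the data directory layout seen in the path above (.../<keyspace>/<table>-<id>/<file>-Data.db).

```python
import re
from pathlib import PurePosixPath
from typing import List, Tuple

# The path may be wrapped onto the line after "Corrupted:", so allow any
# whitespace (including a newline) between the message and the path.
CORRUPT_RE = re.compile(r"CorruptSSTableException: Corrupted:\s*(\S+-Data\.db)")


def find_corrupt_sstables(log_text: str) -> List[Tuple[str, str, str]]:
    """Return unique (keyspace, table, data_file) tuples found in log_text.

    Assumes the layout .../<keyspace>/<table>-<id>/<file>-Data.db, as in
    the exception message above.
    """
    found: List[Tuple[str, str, str]] = []
    for match in CORRUPT_RE.finditer(log_text):
        path = PurePosixPath(match.group(1))
        keyspace = path.parent.parent.name
        # Strip the trailing "-<id>" suffix from the table directory name.
        table = path.parent.name.rsplit("-", 1)[0]
        entry = (keyspace, table, str(path))
        if entry not in found:
            found.append(entry)
    return found
```

Running this over system.log on every node would give the list of keyspace/table pairs that need a scrub, without reading through each stack trace by hand.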



Caused by: org.apache.cassandra.exceptions.RepairException: [repair 
#c32dc7b2-98a3-11ea-b93d-f345d47facfb on keyspace/table1, 
[(-3266124868560049442,-3251552824316949469], 
(5683133382198630448,5688368998796388173], 
(886449428849207017,899877315370686604], 
(7835850211360099716,7853093351026001765], 
(9181506254204950293,9196770820218105732], 
(7303733912491060251,7310514754976525475], 
(7983100824611509538,8010145149611900244], 
(-4494765488985252308,-4470040367916429026], 
(7344767327624475276,7370404351491350941], 
(-9026859152892296811,-9016742046517309237], 
(-2263067294860217298,-2237628835328202538], 
(-7824423893469802055,-7816247712398815159], 
(6359062182289931431,6364230670801234141], 
(9029939988883257412,9032431332467445858], 
(9127001484708311300,9128308858887344690], 
(2934458282870699739,2944345063648658632], 
(-7108213395727334509,-7094434462847445512], 
(3557197829015128860,3561157952072849179], 
(-2420446024332743681,-2419528875877977116], 
(7312974154140628517,7320148430556679762], 
(160279996743650335,163503492742427536], 
(-8429853287104485862,-8426133242237835299], 
(5668272797600548483,5683133382198630448], 
(5846679225324588588,5848133609140162441], 
(-1646246698436771947,-1634431477099107273], 
(-7916439993124382716,-7912352211961754753], 
(5696331314842134786,5730512583080757171], 
(446283421464271035,451860272008147717], 
(-2568460047854322610,-2556187375120095396], 
(-7801573326186225122,-7800004785817370761], 
(810166861072513220,827571856812752619], 
(-2552509086040841865,-2544243996378095445], 
(4973246240836920555,4977826837452131395], 
(6404255276115495718,6424503693665581540], 
(7236259739011635365,7241043279188802979], 
(-3312100566292414114,-3266124868560049442], 
(3190991659583976414,3200827419377301167], 
(-8426133242237835299,-8425854668912806656]]] Validation failed in 
zlt23376.vci.att.com/135.198.127.89
        at 
org.apache.cassandra.repair.ValidationTask.treesReceived(ValidationTask.java:68)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:183)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:439)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:169)
 ~[apache-cassandra-3.7.jar:3.7]
        at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) 
~[apache-cassandra-3.7.jar:3.7]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_202]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_202]
        ... 3 common frames omitted



________________________________
From: Leena Ghatpande <lghatpa...@hotmail.com>
Sent: Monday, May 18, 2020 12:54 PM
To: cassandra cassandra <user@cassandra.apache.org>
Subject: TEST Cluster corrupt after removenode. how to restore

We are running Cassandra 3.7.
Our TEST cluster has 6 nodes, 3 in each data center.

The replication factor is 2 for the keyspaces.

We added 1 new node in each data center for testing, making it an 8-node cluster.

We decided to remove the 2 new nodes from the cluster, but instead of running 
a decommission, the admin deleted the data folder by mistake.

So we ran nodetool removenode for the 2 nodes, then ran a cleanup and a full 
repair on all remaining nodes.

But now the whole cluster is corrupt. Queries return inconsistent results and 
we are getting corrupt SSTable errors.

Is there a way to cleanly recover the data? We do not have an old snapshot.
