Even when only one table is corrupt, your repairs will fail. To handle this
case without data loss you could replace the complete node (the safest and
surest option).
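
A minimal sketch of the usual replace procedure, in case it helps (the IP is
illustrative; please check the steps against the docs for your exact version
before running it):

    # On a fresh replacement node (or the same host with its data, commitlog
    # and saved_caches directories wiped), point cassandra-env.sh at the dead
    # node's address:
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.20.30.40"

    # Then start Cassandra; it bootstraps and streams that node's ranges from
    # the surviving replicas. Once it shows UN in nodetool status, remove the
    # flag so it is not applied on later restarts.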


On 19 May 2020 20:20, "Leena Ghatpande" <lghatpa...@hotmail.com> wrote:

> One of the tables' SSTables got corrupted on all nodes, but repairs were
> failing for all the tables in the keyspaces.
>
> So I took the cluster down and ran an offline sstablescrub, which fixed
> the corrupt table, but with some loss of data. I did not get a chance to
> try the consistency_all option; I will keep that in mind next time.
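>
> For reference, the offline scrub was essentially the following, run per
> node while Cassandra was stopped (names here are just the ones from the
> error below, so treat them as illustrative):
>
>     sstablescrub keyspace table1
>
> and if the consistency_all suggestion meant reading at CONSISTENCY ALL
> from cqlsh, it would have looked something like:
>
>     cqlsh> CONSISTENCY ALL;
>     cqlsh> SELECT * FROM keyspace.table1 LIMIT 1;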
>
> Interestingly enough, all the repairs succeeded after this one table was
> fixed.
>
> I am concerned about why this occurred. We could live with loss of data on
> TEST, but what would we do if this were PROD?
>
> Below is the full error message.
>
> WARN  [SharedPool-Worker-1] 2020-05-18 10:22:29,152 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /opt/app/dir1/dir2/data/keypace/table1-f21d1180f5c211e58c9c31653d0c0f4e/mb-2334-big-Data.db
>         at org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.hasNext(AbstractSSTableIterator.java:379) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:241) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:94) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:26) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:279) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:112) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:133) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:89) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:294) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:134) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:127) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:292) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:50) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.7.jar:3.7]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_202]
>         at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.7.jar:3.7]
>         at java.lang.Thread.run(Thread.java:748) [na:1.8.0_202]
> Caused by: java.io.IOException: Corrupt flags value for clustering prefix (isStatic flag set): 128
>         at org.apache.cassandra.db.ClusteringPrefix$Deserializer.prepare(ClusteringPrefix.java:453) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.UnfilteredDeserializer$CurrentDeserializer.prepareNext(UnfilteredDeserializer.java:172) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.UnfilteredDeserializer$CurrentDeserializer.hasNext(UnfilteredDeserializer.java:153) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.columniterator.SSTableIterator$ForwardReader.computeNext(SSTableIterator.java:124) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.columniterator.SSTableIterator$ForwardReader.hasNextInternal(SSTableIterator.java:151) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.hasNext(AbstractSSTableIterator.java:366) ~[apache-cassandra-3.7.jar:3.7]
>         ... 29 common frames omitted
>
>
>
> Caused by: org.apache.cassandra.exceptions.RepairException: [repair
> #c32dc7b2-98a3-11ea-b93d-f345d47facfb on keyspace/table1,
> [(-3266124868560049442,-3251552824316949469], 
> (5683133382198630448,5688368998796388173],
> (886449428849207017,899877315370686604], 
> (7835850211360099716,7853093351026001765],
> (9181506254204950293,9196770820218105732], 
> (7303733912491060251,7310514754976525475],
> (7983100824611509538,8010145149611900244], 
> (-4494765488985252308,-4470040367916429026],
> (7344767327624475276,7370404351491350941], 
> (-9026859152892296811,-9016742046517309237],
> (-2263067294860217298,-2237628835328202538], 
> (-7824423893469802055,-7816247712398815159],
> (6359062182289931431,6364230670801234141], 
> (9029939988883257412,9032431332467445858],
> (9127001484708311300,9128308858887344690], 
> (2934458282870699739,2944345063648658632],
> (-7108213395727334509,-7094434462847445512], 
> (3557197829015128860,3561157952072849179],
> (-2420446024332743681,-2419528875877977116], 
> (7312974154140628517,7320148430556679762],
> (160279996743650335,163503492742427536], 
> (-8429853287104485862,-8426133242237835299],
> (5668272797600548483,5683133382198630448], 
> (5846679225324588588,5848133609140162441],
> (-1646246698436771947,-1634431477099107273], 
> (-7916439993124382716,-7912352211961754753],
> (5696331314842134786,5730512583080757171], 
> (446283421464271035,451860272008147717],
> (-2568460047854322610,-2556187375120095396], 
> (-7801573326186225122,-7800004785817370761],
> (810166861072513220,827571856812752619], 
> (-2552509086040841865,-2544243996378095445],
> (4973246240836920555,4977826837452131395], 
> (6404255276115495718,6424503693665581540],
> (7236259739011635365,7241043279188802979], 
> (-3312100566292414114,-3266124868560049442],
> (3190991659583976414,3200827419377301167], 
> (-8426133242237835299,-8425854668912806656]]]
> Validation failed in zlt23376.vci.att.com/135.198.127.89
>         at org.apache.cassandra.repair.ValidationTask.treesReceived(ValidationTask.java:68) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:183) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:439) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:169) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.7.jar:3.7]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_202]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_202]
>         ... 3 common frames omitted
>
>
>
> ------------------------------
> *From:* Leena Ghatpande <lghatpa...@hotmail.com>
> *Sent:* Monday, May 18, 2020 12:54 PM
> *To:* cassandra cassandra <user@cassandra.apache.org>
> *Subject:* TEST Cluster corrupt after removenode. how to restore
>
> Running Cassandra 3.7.
> Our TEST cluster has 6 nodes, 3 in each data center.
>
> Replication factor 2 for the keyspaces.
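>
> That is, something along these lines (DC names are illustrative, and
> NetworkTopologyStrategy with 2 replicas per data center is an assumption
> here):
>
>     ALTER KEYSPACE keyspace WITH replication =
>         {'class': 'NetworkTopologyStrategy', 'dc1': 2, 'dc2': 2};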
>
> We added 1 new node in each data center for testing, making it an 8-node
> cluster.
>
> We decided to remove the 2 new nodes from the cluster, but instead of
> decommissioning them, the admin just deleted the data folders by mistake.
>
> So we ran nodetool removenode for the 2 nodes, then ran a cleanup and a
> full repair on all remaining nodes.
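>
> Roughly, the commands were (the host ID is shown as a placeholder; the
> real ones came from nodetool status):
>
>     nodetool removenode <host-id>   # once per deleted node, from a live node
>     nodetool cleanup                # on each remaining node
>     nodetool repair -full           # on each remaining node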
>
> But now the whole cluster is corrupt: queries return inconsistent results,
> and we are getting corrupt SSTable errors.
>
> Is there a way to cleanly recover the data? We do not have an old snapshot.
>
