Evan Prothro created CASSANDRA-12582:
----------------------------------------

             Summary: ReadFailure from CorruptSSTableException after static column removed
                 Key: CASSANDRA-12582
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12582
             Project: Cassandra
          Issue Type: Bug
         Environment: Cassandra 3.0.8
            Reporter: Evan Prothro
             Fix For: 3.0.9
         Attachments: 309issue.cdl, 309issue_reproduce.sh

We ran into an issue in production where reads began to fail for certain 
queries, depending on the range specified in the query's relation. The 
Cassandra system log showed an unhandled {{CorruptSSTableException}}.

CQL read failure:
{code}
ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
{code}

Cassandra exception:
{code}
WARN  [SharedPool-Worker-2] 2016-08-31 12:49:27,979 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-2,5,main]: {}
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /usr/local/apache-cassandra-3.0.8/data/data/issue309/apples_by_tree-006748a06fa311e6a7f8ef8b642e977b/mb-1-big-Data.db
  at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_72]
  at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.8.jar:3.0.8]
  at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /usr/local/apache-cassandra-3.0.8/data/data/issue309/apples_by_tree-006748a06fa311e6a7f8ef8b642e977b/mb-1-big-Data.db
  at org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:343) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:65) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:66) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:62) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:134) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:127) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:289) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449) ~[apache-cassandra-3.0.8.jar:3.0.8]
  ... 5 common frames omitted
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /usr/local/apache-cassandra-3.0.8/data/data/issue309/apples_by_tree-006748a06fa311e6a7f8ef8b642e977b/mb-1-big-Data.db
  at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:130) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:46) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:69) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:338) ~[apache-cassandra-3.0.8.jar:3.0.8]
  ... 19 common frames omitted
Caused by: java.io.IOException: Corrupt (negative) value length encountered
  at org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:399) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.rows.BufferCell$Serializer.deserialize(BufferCell.java:302) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:462) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:440) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeStaticRow(UnfilteredSerializer.java:381) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.readStaticRow(AbstractSSTableIterator.java:179) ~[apache-cassandra-3.0.8.jar:3.0.8]
  at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:103) ~[apache-cassandra-3.0.8.jar:3.0.8]
  ... 22 common frames omitted
{code}

After debugging, it appears that a static column dropped weeks earlier 
triggered the issue. As a workaround, we added the column back and restarted 
all Cassandra processes in the cluster, after which the read error and the 
corruption exception went away.

Attached is a script to reproduce with a simple schema.

Also noteworthy (and shown in the script): while in this state, compaction 
exited with status 0 but silently failed to remove the dropped static columns 
from the "corrupted" sstable.
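
For readers without the attachment, the sequence described above could look 
roughly like the sketch below. This is not the attached 309issue_reproduce.sh. 
The keyspace and table names are taken from the SSTable path in the trace; the 
column names (tree_id, apple_id, tree_age, weight), the inserted values, the 
flush before the drop, and the RUN_AGAINST_CLUSTER switch are all assumptions 
made for illustration. A plain SELECT may not trigger the failure for every 
data shape.

```shell
#!/bin/sh
# Hypothetical reproduction sketch; NOT the attached 309issue_reproduce.sh.
# Keyspace/table names come from the SSTable path in the trace; column names
# and values are invented for illustration.

SETUP_CQL=$(cat <<'EOF'
CREATE KEYSPACE IF NOT EXISTS issue309
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE IF NOT EXISTS issue309.apples_by_tree (
  tree_id int,
  apple_id int,
  tree_age int static,
  weight int,
  PRIMARY KEY (tree_id, apple_id)
);
INSERT INTO issue309.apples_by_tree (tree_id, apple_id, tree_age, weight)
  VALUES (1, 1, 7, 150);
EOF
)
# tree_age is the (hypothetical) static column that gets dropped.
DROP_CQL='ALTER TABLE issue309.apples_by_tree DROP tree_age;'
READ_CQL='SELECT * FROM issue309.apples_by_tree;'

if [ "${RUN_AGAINST_CLUSTER:-0}" = "1" ]; then
  # Only touches a cluster when explicitly requested.
  cqlsh -e "$SETUP_CQL"
  # Assumption: the static cell must reach an SSTable before the drop.
  nodetool flush issue309 apples_by_tree
  cqlsh -e "$DROP_CQL"
  # On an affected node this read is where the ReadFailure would surface.
  cqlsh -e "$READ_CQL" || echo "read failed"
  # The report notes compaction exits 0 without cleaning up the sstable.
  nodetool compact issue309 apples_by_tree
  echo "compact exit code: $?"
else
  # Dry run: print the statements that would be executed.
  printf '%s\n' "$SETUP_CQL" "$DROP_CQL" "$READ_CQL"
fi
```

Run without RUN_AGAINST_CLUSTER set, the script only prints the CQL it would 
execute; setting RUN_AGAINST_CLUSTER=1 runs it against whatever node cqlsh and 
nodetool reach by default, so it should only be pointed at a disposable test 
instance.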




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
