Matt Jurik created CASSANDRA-6366:
-------------------------------------

             Summary: Corrupt SSTables
                 Key: CASSANDRA-6366
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6366
             Project: Cassandra
          Issue Type: Bug
         Environment: 1.2.10
            Reporter: Matt Jurik


We ran into some corrupt sstables on one of our 8-node clusters running 1.2.10 
(since upgraded to 1.2.11). Initially, we saw one corrupt sstable on a single 
node. After running "nodetool scrub" and then "nodetool repair -pr" across the 
cluster, we were left with 3 corrupt sstables spread across 2 nodes.

All nodes appear healthy; fsck and our raid controllers report no issues. The 
sstables were written out during normal operation; there were no machine 
restarts or failures anywhere near the sstable file timestamps.

Curiously, the patch below lets me read all 3 of our corrupt sstables, though I 
have no idea why it works. With it, I appear to be able to read every 
OnDiskAtom declared in each row header, so the data seems intact.

{code}
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java b/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java
index 381fdb9..8fce5f7 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java
@@ -180,6 +180,11 @@ public class SSTableIdentityIterator implements Comparable<SSTableIdentityIterat
 
     public boolean hasNext()
     {
+         /*
+         * For each row where corruption is reported, it is the case that we read more data from the preceding row
+         * than specified by dataSize. That is, this iterator will terminate with:
+         *     inputWithTracker.getBytesRead() > dataSize
+         */
         return inputWithTracker.getBytesRead() < dataSize;
     }
 
diff --git a/src/java/org/apache/cassandra/io/sstable/SSTableScanner.java b/src/java/org/apache/cassandra/io/sstable/SSTableScanner.java
index 1df5842..718324c 100644
--- a/src/java/org/apache/cassandra/io/sstable/SSTableScanner.java
+++ b/src/java/org/apache/cassandra/io/sstable/SSTableScanner.java
@@ -167,8 +167,9 @@ public class SSTableScanner implements ICompactionScanner
         {
             try
             {
-                if (row != null)
-                    dfile.seek(finishedAt);
+                // Magically read corrupt sstables...
+                // if (row != null)
+                //     dfile.seek(finishedAt);
                 assert !dfile.isEOF();
 
                 // Read data header
diff --git a/src/java/org/apache/cassandra/tools/SSTableExport.java b/src/java/org/apache/cassandra/tools/SSTableExport.java
index 05fe9f6..ed61010 100644
--- a/src/java/org/apache/cassandra/tools/SSTableExport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableExport.java
@@ -432,7 +432,7 @@ public class SSTableExport
      */
     public static void export(Descriptor desc, String[] excludes) throws IOException
     {
-        export(desc, System.out, excludes);
+        export(desc, new PrintStream("json"), excludes);
     }
 
     /**
{code}
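
One way to picture what the SSTableScanner change does is the toy model below. It is only a sketch of a hypothesis consistent with the comment in the patch (a row whose header understates its size), not a diagnosis. The row layout (a long dataSize, an int atomCount, then 4-byte atoms) and the names RowSeekSketch, writeRow, buildFile and scan are invented for illustration; none of this is Cassandra's on-disk format or API.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class RowSeekSketch {
    // Toy row layout (NOT Cassandra's): long dataSize, int atomCount, then atomCount ints.
    static void writeRow(DataOutputStream out, long declaredSize, int... atoms) throws IOException {
        out.writeLong(declaredSize);
        out.writeInt(atoms.length);
        for (int atom : atoms) out.writeInt(atom);
    }

    // Two rows; the first header declares 12 bytes although the row body is 16 bytes long.
    static byte[] buildFile() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        writeRow(out, 12, 1, 2, 3); // true size is 4 + 3 * 4 = 16
        writeRow(out, 8, 7);        // true size is 4 + 1 * 4 = 8
        return bytes.toByteArray();
    }

    // Returns the number of rows read. When seekToDeclaredEnd is true, the reader jumps to
    // dataStart + dataSize after each row (analogous to dfile.seek(finishedAt)); otherwise
    // it simply continues from wherever the atom reader stopped.
    static int scan(byte[] file, boolean seekToDeclaredEnd) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(file);
        int rows = 0;
        while (buf.remaining() >= 12) {
            long dataSize = buf.getLong();
            int dataStart = buf.position();
            if (dataSize < 0 || dataSize > file.length - dataStart)
                throw new IOException("dataSize of " + dataSize + " starting at " + dataStart
                        + " would be larger than file length " + file.length);
            int atomCount = buf.getInt();
            for (int i = 0; i < atomCount; i++) buf.getInt();
            rows++;
            if (seekToDeclaredEnd) buf.position(dataStart + (int) dataSize);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = buildFile();
        System.out.println("sequential read: " + scan(file, false) + " rows");
        try {
            scan(file, true);
        } catch (IOException e) {
            System.out.println("seek to declared end: " + e.getMessage());
        }
    }
}
```

In this model the sequential reader sees both rows, while the seeking reader lands inside the first row's last atom and interprets those bytes as the next row's dataSize. Whether the real sstables are damaged in this way is an open question.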

Without this patch, SSTableExport fails with a stack trace such as:

{code}
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 72339146324312065 starting at 80476328 would be larger than file /Users/exabytes18/development/yay/corrupt-sstables/corrupt-files3/my_keyspace-my_table-ic-40693-Data.db length 109073657
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:176)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:84)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:70)
    at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:203)
    at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:157)
    at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:144)
    at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:391)
    at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:422)
    at org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:435)
    at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:517)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: java.io.IOException: dataSize of 72339146324312065 starting at 80476328 would be larger than file /Users/exabytes18/development/yay/corrupt-sstables/corrupt-files3/my_keyspace-my_table-ic-40693-Data.db length 109073657
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:132)
    ... 14 more
{code}
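
For reference, the reported dataSize can be inspected byte by byte. The sketch below prints the value from the trace in hex and redoes the bounds comparison from the exception message using the offsets it reports. The class and method names (DataSizeSketch, toHex, fitsInFile) are hypothetical helpers written for this note, not Cassandra code.

```java
public class DataSizeSketch {
    // Render a long as 16 hex digits so the individual bytes are visible.
    static String toHex(long value) {
        return String.format("%016x", value);
    }

    // Same shape of check as in the exception message: does a row of dataSize bytes
    // starting at dataStart fit inside a file of fileLength bytes?
    static boolean fitsInFile(long dataSize, long dataStart, long fileLength) {
        return dataSize >= 0 && dataSize <= fileLength - dataStart;
    }

    public static void main(String[] args) {
        long dataSize = 72339146324312065L; // value reported in the stack trace
        long dataStart = 80476328L;         // "starting at" offset from the trace
        long fileLength = 109073657L;       // file length from the trace

        System.out.println(toHex(dataSize));                            // prints 0101001200040001
        System.out.println(fitsInFile(dataSize, dataStart, fileLength)); // prints false
    }
}
```

The hex form, 0x0101001200040001, reads as four small 16-bit values (0x0101, 0x0012, 0x0004, 0x0001) rather than a plausible length. That may suggest the eight bytes were read from a position that is not actually a row header, though this is only an observation about the number itself.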

Any help on the matter is appreciated.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
