Hi,

We had some hard-disk issues this week that corrupted some data files, which
compaction then reported. My approach to fixing this was to delete the
corrupted files and run repair. That sounded easy at first, but unfortunately
C* 1.1.11 sometimes does not show which data file is causing the exception.

How do you handle such cases? Do you delete the entire CF, or do you look up
the compaction-started message and delete the files involved?
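
To make the second option concrete, here is a rough Python sketch of how the lookup could be automated. It assumes that the node writes an INFO line containing "Compacting [SSTableReader(path='...'), ...]" on the same executor thread before the ERROR appears; that line shape and the function name candidate_sstables are my assumptions, so please check them against your own system.log. Validation compactions (the ValidationExecutor trace below) may not log such a start message at all, in which case the sketch simply reports nothing for them.

```python
import re

# Assumed shape of the compaction-start line (verify against your own log):
#   INFO [<thread>] <timestamp> ... Compacting [SSTableReader(path='<file>'), ...]
START_RE = re.compile(r"^\s*INFO \[(?P<thread>[^\]]+)\].*Compacting \[(?P<files>.*)\]")
# Any ERROR line; the thread name is in the first pair of square brackets.
ERROR_RE = re.compile(r"^\s*ERROR \[(?P<thread>[^\]]+)\]")
PATH_RE = re.compile(r"path='([^']+)'")


def candidate_sstables(log_lines):
    """Pair each ERROR with the files of the last compaction started on the same thread.

    Returns a list of (thread_name, [sstable_paths]) tuples, in log order.
    The corrupt file is only *among* the returned paths; this does not
    identify which one it is.
    """
    last_started = {}  # thread name -> paths from its most recent start message
    suspects = []
    for line in log_lines:
        started = START_RE.match(line)
        if started:
            last_started[started.group("thread")] = PATH_RE.findall(started.group("files"))
            continue
        failed = ERROR_RE.match(line)
        if failed and failed.group("thread") in last_started:
            thread = failed.group("thread")
            suspects.append((thread, last_started[thread]))
    return suspects
```

It could be called as candidate_sstables(open(path_to_system_log)), where path_to_system_log is wherever your node writes its log. The result only narrows the search to the SSTables that took part in the failed compaction, so deleting all of them and running repair removes more data than strictly necessary.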

In my opinion the stack trace should always show the name of the file that
could not be read. Does anybody know whether the logging has changed since
1.1.11? CASSANDRA-2261 <https://issues.apache.org/jira/browse/CASSANDRA-2261>
does not seem to have fixed the exception-handling part. Were there perhaps
changes in 1.2 with the new disk-failure handling?

cheers,
Christian

PS: Here are some examples I found in my logs:

*Bad behaviour:*
ERROR [ValidationExecutor:1] 2013-05-29 13:26:09,121
AbstractCassandraDaemon.java (line 132) Exception in thread
Thread[ValidationExecutor:1,1,main]
java.io.IOError: java.io.IOException: FAILED_TO_UNCOMPRESS(5)
        at
org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
        at
org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:99)
        at
org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:176)
        at
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:83)
        at
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:68)
        at
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:118)
        at
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:101)
        at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at
com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:726)
        at
org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:69)
        at
org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:457)
        at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: FAILED_TO_UNCOMPRESS(5)
        at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:78)
        at org.xerial.snappy.SnappyNative.rawUncompress(Native Method)
        at org.xerial.snappy.Snappy.rawUncompress(Snappy.java:391)
        at
org.apache.cassandra.io.compress.SnappyCompressor.uncompress(SnappyCompressor.java:94)
        at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:90)
        at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:71)
        at
org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:302)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
        at
org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
        at
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:401)
        at
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
        at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:114)
        at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
        at
org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:144)
        at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
        at
org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:112)
        ... 19 more

*Also bad behaviour:*
ERROR [CompactionExecutor:1] 2013-05-29 13:12:58,896
AbstractCassandraDaemon.java (line 132) Exception in thread
Thread[CompactionExecutor:1,1,main]
java.io.IOError: java.io.IOException: java.util.zip.DataFormatException:
incomplete dynamic bit lengths tree
        at
org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
        at
org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:99)
        at
org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:176)
        at
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:83)
        at
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:68)
        at
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:118)
        at
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:101)
        at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at
com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at
org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:174)
        at
org.apache.cassandra.db.compaction.CompactionManager$2.runMayThrow(CompactionManager.java:164)
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.util.zip.DataFormatException:
incomplete dynamic bit lengths tree
        at
org.apache.cassandra.io.compress.DeflateCompressor.uncompress(DeflateCompressor.java:114)
        at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:90)
        at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:71)
        at
org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:302)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
        at
org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
        at
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:401)
        at
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
        at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:114)
        at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
        at
org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:144)
        at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
        at
org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:112)
        ... 20 more
Caused by: java.util.zip.DataFormatException: incomplete dynamic bit
lengths tree
        at java.util.zip.Inflater.inflateBytes(Native Method)
        at java.util.zip.Inflater.inflate(Inflater.java:238)
        at
org.apache.cassandra.io.compress.DeflateCompressor.uncompress(DeflateCompressor.java:110)
        ... 33 more


*This corruption is logged correctly (the file name is shown):*
ERROR [CompactionExecutor:3095] 2013-05-27 08:25:06,777
AbstractCassandraDaemon.java (line 132) Exception in thread
Thread[CompactionExecutor:3095,1,main]
java.io.IOError: org.apache.cassandra.io.compress.CorruptedBlockException:
(/var/lib/cassandra/data/Monitoring/cfLargeData/Monitoring-cfLargeData-hf-21873-Data.db):
corruption detected, chunk at 977039 of length 237504.
        at
org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
        at
org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:99)
        at
org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:176)
        at
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:83)
        at
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:68)
        at
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:118)
        at
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:101)
        at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at
com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at
org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:174)
        at
org.apache.cassandra.db.compaction.CompactionManager$2.runMayThrow(CompactionManager.java:164)
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.cassandra.io.compress.CorruptedBlockException:
(/var/lib/cassandra/data/Monitoring/cfLargeData/Monitoring-cfLargeData-hf-21873-Data.db):
corruption detected, chunk at 977039 of length 237504.
        at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:97)
        at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:71)
        at
org.apache.cassandra.io.util.RandomAccessReader.read(RandomAccessReader.java:302)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:397)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
        at
org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
        at
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:401)
        at
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
        at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:114)
        at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
        at
org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:144)
        at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
        at
org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:112)
        ... 20 more
