[ https://issues.apache.org/jira/browse/CASSANDRA-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sharvanath Pathak updated CASSANDRA-10479:
------------------------------------------
    Description:
Currently a power loss can potentially require manual intervention to bring Cassandra back up. Essentially, these partially written SStables are considered as corrupt, and we see the following trace quite often on hard reboots:

{noformat}
INFO  [SSTableBatchOpen:1] 2015-09-29 19:24:39,170 SSTableReader.java:478 - Opening /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-13368 (79 bytes)
ERROR [SSTableBatchOpen:1] 2015-09-29 19:24:39,177 FileUtils.java:447 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:168) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:752) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:703) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:491) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:387) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:534) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_80]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: java.io.EOFException: null
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_80]
    at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_80]
    at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_80]
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[apache-cassandra-2.1.9.jar:2.1.9]
    ... 14 common frames omitted
{noformat}

Deleting partially written SStables might be a perfectly valid thing to do (given that the data is present in commitlogs).

  was:
Currently a power loss can potentially require manual intervention to bring Cassandra back up.
Essentially, these partially written SStables are considered as corrupt, and we see the following trace quite often on hard reboots:

{noformat}
INFO  [SSTableBatchOpen:1] 2015-09-29 19:24:39,170 SSTableReader.java:478 - Opening /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-13368 (79 bytes)
ERROR [SSTableBatchOpen:1] 2015-09-29 19:24:39,177 FileUtils.java:447 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:168) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:752) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:703) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:491) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:387) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:534) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_80]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: java.io.EOFException: null
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_80]
    at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_80]
    at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_80]
    at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[apache-cassandra-2.1.9.jar:2.1.9]
    ... 14 common frames omitted
{noformat}

Deleting partially written sstables might be a perfectly valid thing to do (given that the data is present in commitlogs).


> Handling partially written sstables on node crashes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-10479
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10479
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sharvanath Pathak
>
> Currently a power loss can potentially require manual intervention to bring
> Cassandra back up.
> Essentially, these partially written SStables are
> considered as corrupt, and we see the following trace quite often on hard
> reboots:
> {noformat}
> INFO  [SSTableBatchOpen:1] 2015-09-29 19:24:39,170 SSTableReader.java:478 -
> Opening
> /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-13368
> (79 bytes)
> ERROR [SSTableBatchOpen:1] 2015-09-29 19:24:39,177 FileUtils.java:447 -
> Exiting forcefully due to file system exception on startup, disk failure
> policy "stop"
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>     at
> org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:168)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:752)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:703)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:491)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:387)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:534)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> [na:1.7.0_80]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> [na:1.7.0_80]
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> [na:1.7.0_80]
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_80]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> Caused by: java.io.EOFException: null
>     at
> java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
> ~[na:1.7.0_80]
>     at java.io.DataInputStream.readUTF(DataInputStream.java:589)
> ~[na:1.7.0_80]
>     at java.io.DataInputStream.readUTF(DataInputStream.java:564)
> ~[na:1.7.0_80]
>     at
> org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
>     ... 14 common frames omitted
> {noformat}
> Deleting partially written SStables might be a perfectly valid thing to do
> (given that the data is present in commitlogs).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
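
For reference, the "Caused by" part of the trace can be reproduced outside Cassandra: DataInputStream.readUTF throws EOFException when a file ends before the 2-byte length prefix or the string bytes are complete. The sketch below is only an illustration of how a startup check might tell such a truncated file apart from a complete one. The class name TruncatedHeaderCheck, the method hasReadableHeader, and the header string "ExampleCompressor" are hypothetical and are not Cassandra code; whether a file flagged this way is safe to delete still depends on the data being present in the commitlogs, as the description says.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Cassandra.
public class TruncatedHeaderCheck {

    // Returns true if the file starts with a complete writeUTF-encoded string,
    // false if the file ends before that string is complete.
    static boolean hasReadableHeader(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            in.readUTF(); // reads a 2-byte length, then that many bytes
            return true;
        } catch (EOFException e) {
            // Same exception type as the "Caused by" in the trace above.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // A file whose header was written completely.
        Path complete = Files.createTempFile("complete", ".db");
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(complete))) {
            out.writeUTF("ExampleCompressor"); // illustrative value only
        }

        // A file cut off after one byte, i.e. in the middle of the length prefix.
        Path truncated = Files.createTempFile("truncated", ".db");
        Files.write(truncated, new byte[] {0});

        System.out.println(hasReadableHeader(complete));  // prints true
        System.out.println(hasReadableHeader(truncated)); // prints false

        Files.delete(complete);
        Files.delete(truncated);
    }
}
```

Other I/O failures (for example a malformed string, which surfaces as UTFDataFormatException) are deliberately left to propagate, so that only a short read is treated as "partially written".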