[jira] [Commented] (CASSANDRA-5930) Offline scrubs can choke on broken files

2014-01-03 Thread Jonathan Ellis (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862046#comment-13862046 ]

Jonathan Ellis commented on CASSANDRA-5930:
---

It would be nice to be able to tell people how to fix it (realistically: what 
their options are) rather than just "sorry, scrub can't help you."  But I'm not 
sure what those options are. :)  /cc [~slebresne] [~iamaleksey]

 Offline scrubs can choke on broken files

         Key: CASSANDRA-5930
         URL: https://issues.apache.org/jira/browse/CASSANDRA-5930
     Project: Cassandra
  Issue Type: Bug
    Reporter: Jeremiah Jordan
    Assignee: Tyler Hobbs
    Priority: Minor
 Attachments: 5930-v1.patch


 There are cases where offline scrub can hit an exception and die, like:
 {noformat}
 WARNING: Non-fatal error reading row (stacktrace follows)
 Exception in thread "main" java.io.IOError: java.io.IOError: java.io.EOFException
   at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:242)
   at org.apache.cassandra.tools.StandaloneScrubber.main(StandaloneScrubber.java:121)
 Caused by: java.io.IOError: java.io.EOFException
   at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:116)
   at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:99)
   at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:176)
   at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:182)
   at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:171)
   ... 1 more
 Caused by: java.io.EOFException
   at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399)
   at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377)
   at org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:95)
   at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:401)
   at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
   at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:120)
   at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:37)
   at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:144)
   at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
   at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:112)
   ... 5 more
 {noformat}
 Since the purpose of offline scrub is to fix broken stuff, it should be more 
 resilient to broken stuff...
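
 For readers unfamiliar with this failure mode, the innermost EOFException in
 the trace above is what a length-prefixed read produces when the prefix
 promises more bytes than the file still holds. The following is a minimal
 sketch of that mechanism only, not Cassandra code: the class
 TruncatedReadDemo and its methods are hypothetical, and it uses
 DataInputStream over an in-memory array where the trace shows
 RandomAccessFile.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical demo class; not part of Cassandra.
class TruncatedReadDemo {
    // Reads an int length prefix followed by that many bytes, similar in
    // spirit to a length-prefixed column value.
    static byte[] readWithLength(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] value = new byte[length];
        // readFully throws EOFException if fewer than `length` bytes remain.
        in.readFully(value);
        return value;
    }

    // Returns true when the read fails with EOFException.
    static boolean failsWithEof(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            readWithLength(in);
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] intact = {0, 0, 0, 2, 'o', 'k'};
        // The prefix claims 8 bytes, but only 2 follow.
        byte[] truncated = {0, 0, 0, 8, 'o', 'k'};
        System.out.println("intact fails: " + failsWithEof(intact));
        System.out.println("truncated fails: " + failsWithEof(truncated));
    }
}
```

 A truncated or corrupted SSTable can put the reader in the same position:
 the length field is read successfully, and the failure only surfaces once
 the payload read runs off the end of the file.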



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5930) Offline scrubs can choke on broken files

2014-01-03 Thread Aleksey Yeschenko (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862052#comment-13862052 ]

Aleksey Yeschenko commented on CASSANDRA-5930:
--

[~jbellis] There are no options. That said, we should probably allow users to 
override this behavior, if they prefer losing some counter history to 
not scrubbing at all.

Also, once CASSANDRA-6504 is in, this becomes a non-issue (for the newly written 
'global' 2.1 shards, at least - we *can* repair those after the scrub).
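
One way to picture the override suggested above is a scrub loop that aborts
on an unreadable counter row unless the user has explicitly opted in to
skipping it. This is only a sketch under stated assumptions: ScrubPolicyDemo,
Row, and the skipCorruptCounters flag are hypothetical names, not Cassandra's
API, and the sketch assumes that unreadable rows in non-counter tables are
simply skipped.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names are illustrative and not Cassandra's API.
class ScrubPolicyDemo {
    // Stand-in for a row that may fail to deserialize.
    interface Row {
        String read() throws IOException;
    }

    // Keeps readable rows. On an unreadable row, a counter table aborts
    // unless skipCorruptCounters is set; any other table skips the row.
    static List<String> scrub(List<Row> rows, boolean isCounterTable,
                              boolean skipCorruptCounters) {
        List<String> good = new ArrayList<>();
        for (Row row : rows) {
            try {
                good.add(row.read());
            } catch (IOException e) {
                if (isCounterTable && !skipCorruptCounters) {
                    throw new IllegalStateException(
                        "Unreadable row in a counter table; skipping it could lose "
                        + "counter history (see CASSANDRA-2759)", e);
                }
                // Otherwise drop the row and keep going.
            }
        }
        return good;
    }

    // Three sample rows; the middle one is unreadable.
    static List<Row> sampleRows() {
        return Arrays.<Row>asList(
            () -> "a",
            () -> { throw new IOException("corrupt row"); },
            () -> "c");
    }

    public static void main(String[] args) {
        // With the override set, the corrupt row is dropped.
        System.out.println(scrub(sampleRows(), true, true));
        try {
            scrub(sampleRows(), true, false);
        } catch (IllegalStateException e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

Keeping the abort as the default and making the lossy path opt-in matches
the trade-off described above: the user chooses between losing some counter
history and not scrubbing at all.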



[jira] [Commented] (CASSANDRA-5930) Offline scrubs can choke on broken files

2014-01-02 Thread Tyler Hobbs (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860591#comment-13860591 ]

Tyler Hobbs commented on CASSANDRA-5930:


[~jeffpotter] what version of Cassandra were you running when you hit the above 
error?

As far as the original stacktrace for this ticket goes, failing on that error 
(rather than skipping the row) is unfortunately necessary for counter CFs.  
CASSANDRA-2759 explains the reasoning.  I suppose I could make the error 
message mention that and point to the ticket.

The scrub code looks reasonably robust in general, so I think it's better to 
wait for individual bugs to get reported than to try to improve the code 
without any failure examples.



[jira] [Commented] (CASSANDRA-5930) Offline scrubs can choke on broken files

2014-01-02 Thread J Potter (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860912#comment-13860912 ]

J Potter commented on CASSANDRA-5930:
-

Hi Tyler -- based on my notes, it should have been Cassandra 1.2.6.1 (DSE 3.1). 
At least, that's what other tickets we filed at the same time suggest.



[jira] [Commented] (CASSANDRA-5930) Offline scrubs can choke on broken files

2013-09-02 Thread Jeff Potter (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13756170#comment-13756170 ]

Jeff Potter commented on CASSANDRA-5930:


We're seeing this too -- slightly different stack trace, which I'll include 
here in case it's of use.


WARNING: Non-fatal error reading row (stacktrace follows)
Exception in thread "main" java.io.IOError: java.lang.IllegalArgumentException
    at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:244)
    at org.apache.cassandra.tools.StandaloneScrubber.main(StandaloneScrubber.java:125)
Caused by: java.lang.IllegalArgumentException
    at java.nio.Buffer.limit(Buffer.java:247)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:51)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:60)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:78)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:128)
    at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:114)
    at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:109)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:219)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:114)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:160)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:166)
    at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:173)
    ... 1 more
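
The java.nio.Buffer.limit frame at the bottom of this trace is consistent
with a corrupted length prefix: setting a buffer's limit beyond its capacity
throws IllegalArgumentException. The sketch below reproduces only that
mechanism, not the Cassandra code path. CorruptLengthDemo is a hypothetical
class, and its getWithShortLength is only loosely modeled on the frame of the
same name in the trace.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of Cassandra.
class CorruptLengthDemo {
    // Reads a 2-byte unsigned length, then exposes that many bytes by
    // narrowing the limit of a duplicate buffer.
    static ByteBuffer getWithShortLength(ByteBuffer bb) {
        int length = bb.getShort() & 0xFFFF;
        ByteBuffer copy = bb.duplicate();
        // Throws IllegalArgumentException if the new limit exceeds capacity.
        copy.limit(copy.position() + length);
        bb.position(bb.position() + length);
        return copy;
    }

    // Returns true when slicing fails with IllegalArgumentException.
    static boolean failsWithIllegalArgument(byte[] data) {
        try {
            getWithShortLength(ByteBuffer.wrap(data));
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] intact = {0, 2, 'a', 'b'};
        // The prefix claims 5 bytes, but only 2 follow.
        byte[] corrupt = {0, 5, 'a', 'b'};
        System.out.println("intact fails: " + failsWithIllegalArgument(intact));
        System.out.println("corrupt fails: " + failsWithIllegalArgument(corrupt));
    }
}
```

Unlike the EOFException in the original report, this failure is an unchecked
exception, so a handler that only catches I/O errors would not see it.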


 Offline scrubs can choke on broken files

        Key: CASSANDRA-5930
        URL: https://issues.apache.org/jira/browse/CASSANDRA-5930
    Project: Cassandra
 Issue Type: Bug
   Reporter: Jeremiah Jordan
   Assignee: Jason Brown
   Priority: Minor


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira