[ https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079934#comment-13079934 ]

Lior Golan edited comment on CASSANDRA-1717 at 8/5/11 12:31 PM:
----------------------------------------------------------------

Seems like in terms of overhead (which, based on HADOOP-6148, is potentially very 
significant in both storage and CPU) - block-level checksums are much better.
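
For a rough sense of scale (numbers assumed purely for illustration): a 4-byte 
CRC32 per 64-byte column is ~6% storage overhead and one checksum computation 
per column, whereas a 4-byte CRC32 per 64KB block is ~0.006% overhead and one 
computation per ~1000 columns.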

I understand you believe block-level checksums are easy in the compressed case 
but not in the non-compressed case.

So couldn't you just implement a no-op compression option that reuses what 
you're doing / planning to do for compression in terms of block structure and 
block-level checksums?
That would be easy if you have already designed the compression algorithm to be 
pluggable. And if the compression algorithm is not pluggable yet, adding 
pluggability would have an obvious side benefit besides making block-level 
checksums easier to implement.
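
For illustration, the no-op "compressor" could be a plain pass-through copy - 
the block framing and per-block checksum logic would stay identical to the 
compressed path. A minimal sketch, assuming a hypothetical pluggable interface 
(the names here are illustrative, not Cassandra's actual API):

{code:java}
import java.util.zip.CRC32;

// Hypothetical pluggable interface - names are illustrative only.
interface BlockCompressor
{
    int compress(byte[] input, int inputLength, byte[] output);
    void uncompress(byte[] input, int inputLength, byte[] output);
}

// Pass-through "compressor": same block structure and per-block
// CRC32 checksums as a real compressor, but zero compression CPU cost.
class NoopCompressor implements BlockCompressor
{
    public int compress(byte[] input, int inputLength, byte[] output)
    {
        System.arraycopy(input, 0, output, 0, inputLength);
        return inputLength; // "compressed" size equals input size
    }

    public void uncompress(byte[] input, int inputLength, byte[] output)
    {
        System.arraycopy(input, 0, output, 0, inputLength);
    }

    // One checksum per block rather than per column.
    static long blockChecksum(byte[] block, int length)
    {
        CRC32 crc = new CRC32();
        crc.update(block, 0, length);
        return crc.getValue();
    }
}
{code}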

> Cassandra cannot detect corrupt-but-readable column data
> --------------------------------------------------------
>
>                 Key: CASSANDRA-1717
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 1.0
>
>         Attachments: checksums.txt
>
>
> Most corruptions of on-disk data due to bitrot render the column (or row) 
> unreadable, so the data can be replaced by read repair or anti-entropy.  But 
> if the corruption keeps column data readable we do not detect it, and if it 
> corrupts to a higher timestamp value, it can even resist being overwritten by 
> newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira