[ 
https://issues.apache.org/jira/browse/AVRO-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799343#comment-13799343
 ] 

Doug Cutting commented on AVRO-1387:
------------------------------------

If you're willing to write a new block per record, then you might consider just 
using the Snappy codec, which includes a checksum for each block.  Alternatively, 
you could define a meta-codec that wraps other codecs in checksums, e.g., we 
might have codecs like deflate+md5 or null+crc32.  The point is that we 
already have a pluggable per-block extension point in codecs, and one of the 
standard implementations already includes checksums.
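
To make the meta-codec idea concrete, here is a rough sketch in Python rather 
than against Avro's actual codec interface.  The names ChecksumCodec, 
raw_deflate and raw_inflate are hypothetical, and the trailer layout (a 4-byte 
big-endian CRC32 of the uncompressed block, appended after the wrapped codec's 
output) is an assumption chosen to resemble what the Snappy codec does.

```python
import struct
import zlib


def raw_deflate(data: bytes) -> bytes:
    """Compress with raw DEFLATE (no zlib header), as a stand-in inner codec."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def raw_inflate(data: bytes) -> bytes:
    """Inverse of raw_deflate."""
    return zlib.decompress(data, -15)


class ChecksumCodec:
    """Hypothetical meta-codec: wraps any (compress, decompress) pair and
    appends a CRC32 of the uncompressed block to each encoded block."""

    def __init__(self, compress, decompress):
        self._compress = compress
        self._decompress = decompress

    def encode(self, block: bytes) -> bytes:
        # Assumed layout: inner codec output, then 4-byte big-endian CRC32.
        checksum = zlib.crc32(block) & 0xFFFFFFFF
        return self._compress(block) + struct.pack(">I", checksum)

    def decode(self, data: bytes) -> bytes:
        if len(data) < 4:
            raise ValueError("block too short to contain a checksum")
        body, trailer = data[:-4], data[-4:]
        block = self._decompress(body)
        (expected,) = struct.unpack(">I", trailer)
        if zlib.crc32(block) & 0xFFFFFFFF != expected:
            raise ValueError("checksum mismatch: block is corrupt")
        return block


# "null+crc32": identity codec wrapped in a checksum.
null_crc32 = ChecksumCodec(lambda b: b, lambda b: b)
# "deflate+crc32": raw DEFLATE wrapped in a checksum.
deflate_crc32 = ChecksumCodec(raw_deflate, raw_inflate)
```

With one record per block, a writer would call encode() once per record and a 
reader would get a ValueError from decode() on a corrupted record; an md5 
variant would only change the trailer computation and its length.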

> Avro container file format update to write checksums for individual record
> --------------------------------------------------------------------------
>
>                 Key: AVRO-1387
>                 URL: https://issues.apache.org/jira/browse/AVRO-1387
>             Project: Avro
>          Issue Type: Bug
>            Reporter: Hari Shreedharan
>
> We are considering changes in Flume's file channel to use Avro. One of the 
> requirements is that each event (which maps to one Avro record) be 
> checksummed so we know if the data is corrupt. 
> We'd probably have to add a new version for this, since this will change the 
> data format on disk. I can start working on a Java version if there are no 
> objections.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
