[ https://issues.apache.org/jira/browse/AVRO-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799343#comment-13799343 ]
Doug Cutting commented on AVRO-1387:
------------------------------------

If you're willing to write a new block per record, then you might consider just using the Snappy codec, which includes a checksum for each block.

Alternately, you could define a meta-codec, that wraps other codecs in checksums, e.g., we might have codecs like deflate+md5 or null+crc32.

The point being that we already have a pluggable per-block extension point in codecs, and one of the standard implementations already includes checksums.

> Avro container file format update to write checksums for individual record
> --------------------------------------------------------------------------
>
>                 Key: AVRO-1387
>                 URL: https://issues.apache.org/jira/browse/AVRO-1387
>             Project: Avro
>          Issue Type: Bug
>            Reporter: Hari Shreedharan
>
> We are considering changes in Flume's file channel to use Avro, one of the
> requirements is that each event (which maps to one avro record) be
> checksummed so we know if the data is corrupt.
> We'd probably have to add a new version for this, since this will change the
> data format on disk. I can start working on a Java version if there are no
> objections

--
This message was sent by Atlassian JIRA
(v6.1#6144)
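
For illustration only, the meta-codec idea from the comment above (a codec that wraps another codec in a checksum, e.g. null+crc32) might look roughly like the sketch below. This is not Avro's codec API: the class name `Crc32Codec`, its `encode`/`decode` methods, and the block layout (encoded payload followed by a 4-byte big-endian CRC32 of that payload) are all assumptions made for this example, and `zlib.compress` only stands in for a "deflate" inner codec.

```python
import struct
import zlib


class Crc32Codec:
    """Hypothetical meta-codec: wraps an inner codec and adds a CRC32 per block."""

    def __init__(self, compress, decompress):
        # compress/decompress are the inner codec's bytes -> bytes functions.
        self._compress = compress
        self._decompress = decompress

    def encode(self, block: bytes) -> bytes:
        payload = self._compress(block)
        # Assumed layout: encoded payload, then a 4-byte big-endian CRC32 of it.
        return payload + struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)

    def decode(self, data: bytes) -> bytes:
        if len(data) < 4:
            raise ValueError("block too short to contain a checksum")
        payload, trailer = data[:-4], data[-4:]
        (expected,) = struct.unpack(">I", trailer)
        # Verify before handing the payload to the inner codec.
        if zlib.crc32(payload) & 0xFFFFFFFF != expected:
            raise ValueError("checksum mismatch: block is corrupt")
        return self._decompress(payload)


def identity(block: bytes) -> bytes:
    """Inner 'null' codec: passes the block through unchanged."""
    return block


# "null+crc32" and a deflate-like variant, both built from the same wrapper.
null_crc32 = Crc32Codec(identity, identity)
deflate_crc32 = Crc32Codec(zlib.compress, zlib.decompress)
```

With one record written per block, as suggested in the comment, each record would carry its own checksum, and a corrupted block would fail in `decode` with a `ValueError` rather than yield bad data. Whether to checksum the encoded or the decoded bytes is a design choice; this sketch checksums the encoded payload so that corruption is detected before decompression is attempted.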