[ https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907280#action_12907280 ]
Doug Cutting commented on PIG-794:
----------------------------------
Jeff, please use current trunk or the 1.4.0 build that I expect to be released
tomorrow (http://people.apache.org/~cutting/avro-1.4.0-rc4/) instead. The
snapshot you're using had a bug that caused a similar failure. It should only
occur in multi-threaded applications, which I doubt yours is, but it's better
to test against either trunk or a release so we don't chase ghosts.
Further, while debugging a DatumWriter and DatumReader, you might use a
ValidatingEncoder and ValidatingDecoder to ensure that what you write and read
conforms to your schema. You might also test by reading and printing your data
with GenericDatumReader to see that you've written what you meant to write. If
you've written data that does not conform to your declared schema then it
cannot be read correctly. If this is the case, we should attempt to improve
the error message here.
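To make the idea of a validating wrapper concrete, here is a minimal, self-contained sketch. It is a toy stand-in and does not use Avro's API: the class ValidatingWriterSketch, its Type enum, and writeRecord are hypothetical names invented for illustration, and the "schema" is just an ordered list of field types. The assumption is only that a validating encoder checks each write call against the declared schema and fails early on a mismatch, rather than letting a malformed record reach the reader.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: NOT Avro's API. All names here are hypothetical.
class ValidatingWriterSketch {
    enum Type { INT, STRING }

    // Wraps a raw binary writer and checks every call against the declared schema.
    static final class ValidatingWriter {
        private final List<Type> schema;
        private final DataOutputStream out;
        private int next = 0; // index of the field the schema expects next

        ValidatingWriter(List<Type> schema, DataOutputStream out) {
            this.schema = schema;
            this.out = out;
        }

        void writeInt(int v) throws IOException {
            expect(Type.INT);
            out.writeInt(v);
        }

        void writeString(String s) throws IOException {
            expect(Type.STRING);
            out.writeUTF(s);
        }

        // Call once per record to catch records that stop short of the schema.
        void finish() {
            if (next != schema.size()) {
                throw new IllegalStateException(
                    "record ended after " + next + " of " + schema.size() + " fields");
            }
        }

        private void expect(Type actual) {
            if (next >= schema.size()) {
                throw new IllegalStateException("wrote more fields than the schema declares");
            }
            Type declared = schema.get(next);
            if (declared != actual) {
                throw new IllegalStateException(
                    "field " + next + ": schema declares " + declared + " but writer emitted " + actual);
            }
            next++;
        }
    }

    // Writes one record through the validating wrapper and returns the bytes.
    static byte[] writeRecord(List<Type> schema, Object... values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ValidatingWriter w = new ValidatingWriter(schema, new DataOutputStream(bytes));
        try {
            for (Object v : values) {
                if (v instanceof Integer) {
                    w.writeInt((Integer) v);
                } else if (v instanceof String) {
                    w.writeString((String) v);
                } else {
                    throw new IllegalArgumentException("unsupported value: " + v);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        w.finish();
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        List<Type> schema = Arrays.asList(Type.INT, Type.STRING);
        System.out.println(writeRecord(schema, 7, "pig").length + " bytes written");
        try {
            writeRecord(schema, "pig", 7); // fields in the wrong order
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the wrapper, the out-of-order record would be written without complaint and the failure would surface only later, on the reading side. That is the situation the validating classes are meant to rule out while debugging.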
> Use Avro serialization in Pig
> -----------------------------
>
> Key: PIG-794
> URL: https://issues.apache.org/jira/browse/PIG-794
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.2.0
> Reporter: Rakesh Setty
> Assignee: Dmitriy V. Ryaboy
> Attachments: avro-0.1-dev-java_r765402.jar, AvroStorage.patch,
> AvroStorage_2.patch, AvroStorage_3.patch, AvroStorage_4.patch, AvroTest.java,
> jackson-asl-0.9.4.jar, PIG-794.patch
>
>
> We would like to use Avro serialization in Pig to pass data between MR jobs
> instead of the current BinStorage. Attached is an implementation of
> AvroBinStorage, which performs significantly better than BinStorage on our
> benchmarks.