[
https://issues.apache.org/jira/browse/CRUNCH-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053202#comment-14053202
]
Gabriel Reid commented on CRUNCH-433:
-------------------------------------
{quote}+1 for the corrected patch, with one request: that BaseAvroTableType be
package-scoped instead of public if at all possible.{quote}
Sounds like a good plan. It's public only so that AvroTableFileSource can use
it, but I think that's easy enough to work around.
{quote}do we need to add a classifier line to the avro-mapred dependencies in
the POM for this stuff to work properly on MR1 vs. MR2?{quote}
I don't think so, but I'm not sure I fully follow what you mean. The only new
thing used from avro-mapred here is the org.apache.avro.hadoop.io.AvroKeyValue
class (basically only for schema creation), so I don't think anything changes
in terms of needing classifiers. Or am I missing something?
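To make the schema-creation point concrete, here is a rough, dependency-free
sketch of the kind of wrapper record a key/value file uses. It is not the
AvroKeyValue code itself: the class KeyValueSchemaSketch and its wrap method
are made up for illustration, and the record name, namespace and field names
are assumed to match what AvroKeyValue generates.

```java
// Illustrative sketch only: builds the JSON of a key/value wrapper record by
// hand, without depending on Avro. Not the actual AvroKeyValue implementation.
final class KeyValueSchemaSketch {

    // Assumption: these mirror the record name and namespace used by AvroKeyValue.
    static final String RECORD_NAME = "KeyValuePair";
    static final String RECORD_NAMESPACE = "org.apache.avro.mapreduce";

    /** Wraps two schema JSON fragments in a record with "key" and "value" fields. */
    static String wrap(String keySchemaJson, String valueSchemaJson) {
        return "{\"type\":\"record\","
            + "\"name\":\"" + RECORD_NAME + "\","
            + "\"namespace\":\"" + RECORD_NAMESPACE + "\","
            + "\"fields\":["
            + "{\"name\":\"key\",\"type\":" + keySchemaJson + "},"
            + "{\"name\":\"value\",\"type\":" + valueSchemaJson + "}"
            + "]}";
    }

    public static void main(String[] args) {
        // Primitive schemas are used here to keep the example short; in the
        // case discussed in this issue they would be specific or reflect
        // record schemas.
        System.out.println(wrap("\"string\"", "\"long\""));
    }
}
```

Only the two nested schemas vary from file to file, which is why the change
needs nothing from avro-mapred beyond building this wrapper.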
> Add support for reading specific/reflect data from an Avro MR file
> ------------------------------------------------------------------
>
> Key: CRUNCH-433
> URL: https://issues.apache.org/jira/browse/CRUNCH-433
> Project: Crunch
> Issue Type: New Feature
> Reporter: Gabriel Reid
> Assignee: Gabriel Reid
> Attachments: CRUNCH-433.patch
>
>
> An Avro Key/Value file written via raw MapReduce contains records that follow
> the schema generated by the org.apache.avro.hadoop.io.AvroKeyValue class.
> If these files contain specific or reflection-based records, there is
> currently no easy way to read them in as specific or reflection records.
> Using the basic public Crunch APIs, they can only be read as generic records
> (whose nested key and value records are also generic).
> A method should be added to the Avros class which allows specifying specific
> PTypes to be used for reading the underlying data types within a raw MR
> output file.
> Link to related discussion that inspired this ticket on the user list:
> http://s.apache.org/es