[ https://issues.apache.org/jira/browse/CRUNCH-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabriel Reid updated CRUNCH-433:
--------------------------------
Attachment: CRUNCH-433.patch
Patch that introduces a new Avro PTableType for reading/writing files of
AvroKeyValues, compatible with files created and expected by
org.apache.avro.mapreduce.AvroJob.
Also adds methods in the From class for reading Avro key/value files directly
as a PTable.
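
For readers unfamiliar with the file layout, here is a rough, dependency-free sketch of the wrapper record that holds each key/value pair. It is not taken from the patch. The record name "KeyValuePair", the namespace "org.apache.avro.mapreduce", and the field names "key" and "value" are assumptions based on how AvroKeyValue is understood to build its schema, and the class and method names are hypothetical.

```java
public class KeyValueSchemaSketch {

    /**
     * Builds the JSON text of a wrapper record schema with a "key" field and
     * a "value" field. The record name and namespace are assumptions about
     * what org.apache.avro.hadoop.io.AvroKeyValue generates, not values read
     * from the Avro library at runtime.
     *
     * @param keySchemaJson   JSON text of the key schema, e.g. "\"string\""
     * @param valueSchemaJson JSON text of the value schema, e.g. "\"long\""
     */
    public static String keyValueSchemaJson(String keySchemaJson, String valueSchemaJson) {
        return "{\"type\":\"record\","
            + "\"name\":\"KeyValuePair\","
            + "\"namespace\":\"org.apache.avro.mapreduce\","
            + "\"fields\":["
            + "{\"name\":\"key\",\"type\":" + keySchemaJson + "},"
            + "{\"name\":\"value\",\"type\":" + valueSchemaJson + "}"
            + "]}";
    }

    public static void main(String[] args) {
        // Print the wrapper schema for a string key and a long value.
        System.out.println(keyValueSchemaJson("\"string\"", "\"long\""));
    }
}
```

With specific or reflect records, the key and value slots would hold those record schemas instead of primitives, which is why a plain generic read returns generic records nested inside a generic wrapper.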
> Add support for reading specific/reflect data from an Avro MR file
> ------------------------------------------------------------------
>
> Key: CRUNCH-433
> URL: https://issues.apache.org/jira/browse/CRUNCH-433
> Project: Crunch
> Issue Type: New Feature
> Reporter: Gabriel Reid
> Assignee: Gabriel Reid
> Attachments: CRUNCH-433.patch
>
>
> An Avro Key/Value file written via raw MapReduce contains records that follow
> the schema generated by the org.apache.avro.hadoop.io.AvroKeyValue class.
> If these files contain specific or reflection-based records, there is
> currently no easy way to read them in as specific or reflection records.
> Using the basic public Crunch APIs, they can only be read as generic records
> (that also contain generic records).
> A method should be added to the Avros class which allows specifying specific
> PTypes to be used for reading the underlying data types within a raw MR
> output file.
> Link to related discussion that inspired this ticket on the user list:
> http://s.apache.org/es