[ https://issues.apache.org/jira/browse/AVRO-662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated AVRO-662:
------------------------------
Attachment: AVRO-662.patch
Here's a patch that adds this feature. A SequenceFileInputFormat is added that
presents sequence file data in a form compatible with Avro's MapReduce API. In
particular, primitive Writable types (LongWritable, Text, etc.) are converted
to corresponding Avro types (Long, CharSequence, etc.), while reflection is
used to infer a schema for complex Writables. The Writable implementation
classes must be on the classpath at runtime, of course.
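As a rough illustration of the primitive conversion described above, here is a
minimal sketch. It is not taken from the patch: LongWritableStandIn and
TextStandIn are hypothetical stand-ins for Hadoop's LongWritable and Text so
that the example compiles without Hadoop, and WritableConversionSketch is an
invented name. Complex types are passed through unchanged here, whereas the
patch infers a schema for them by reflection.

```java
import java.util.Objects;

// Hypothetical stand-in for Hadoop's LongWritable (NOT the real class).
final class LongWritableStandIn {
    private final long value;
    LongWritableStandIn(long value) { this.value = value; }
    long get() { return value; }
}

// Hypothetical stand-in for Hadoop's Text (NOT the real class).
final class TextStandIn {
    private final String value;
    TextStandIn(String value) { this.value = Objects.requireNonNull(value); }
    @Override public String toString() { return value; }
}

// Hypothetical converter; name and shape are assumptions, not the patch's API.
final class WritableConversionSketch {
    private WritableConversionSketch() {}

    // Unwraps a primitive wrapper into the plain Java value used for the
    // corresponding Avro type; anything else is returned unchanged.
    static Object toAvroDatum(Object writable) {
        if (writable instanceof LongWritableStandIn) {
            // Avro "long" is represented as java.lang.Long.
            return Long.valueOf(((LongWritableStandIn) writable).get());
        }
        if (writable instanceof TextStandIn) {
            // Avro "string" is represented as a CharSequence.
            return writable.toString();
        }
        // Complex Writable: left as-is in this sketch.
        return writable;
    }
}
```

With these stand-ins, WritableConversionSketch.toAvroDatum(new
LongWritableStandIn(42L)) yields a Long, and a TextStandIn yields a
CharSequence.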
I also abstracted a FileReader interface and added a SequenceFileReader
implementation. This permits easier integration of SequenceFile and other
formats into Avro tools. For example, it would now be a simple matter to
extend Avro's 'tojson' command to also dump SequenceFile data as JSON.
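To show why a common reader abstraction makes such tools simpler, here is a
small sketch under stated assumptions. RecordSource is a hypothetical,
simplified interface (Avro's actual FileReader has more methods), and
InMemorySource stands in for a format-specific reader such as one backed by a
SequenceFile. The dump method writes one line per record using toString,
standing in for real JSON encoding.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical, simplified reader abstraction (not Avro's FileReader).
interface RecordSource<D> extends Iterator<D>, Closeable {}

// Stand-in for a format-specific reader; serves records from a list.
final class InMemorySource<D> implements RecordSource<D> {
    private final Iterator<D> records;
    private boolean closed;

    InMemorySource(List<D> records) { this.records = records.iterator(); }

    @Override public boolean hasNext() { return !closed && records.hasNext(); }

    @Override public D next() {
        if (!hasNext()) throw new NoSuchElementException();
        return records.next();
    }

    @Override public void close() { closed = true; }
}

// Hypothetical tool code: it depends only on RecordSource, so any file
// format with a RecordSource implementation can be dumped the same way.
final class DumpSketch {
    private DumpSketch() {}

    // Reads every record, appends it as one line, and closes the source.
    static String dump(RecordSource<?> source) {
        StringBuilder out = new StringBuilder();
        try (RecordSource<?> s = source) {
            while (s.hasNext()) {
                out.append(s.next()).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

The tool code never names a concrete file format, which is the point of the
abstraction: supporting another format means adding another reader
implementation, not changing the tool.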
> Java: Add InputFormat for SequenceFiles using Reflect API
> ---------------------------------------------------------
>
> Key: AVRO-662
> URL: https://issues.apache.org/jira/browse/AVRO-662
> Project: Avro
> Issue Type: New Feature
> Components: java
> Reporter: Doug Cutting
> Assignee: Doug Cutting
> Fix For: 1.4.1
>
> Attachments: AVRO-662.patch
>
>
> It would be useful to be able to read SequenceFile-based data into an
> Avro-based Java MapReduce program. Once the reflect, specific, and generic
> representations are fully compatible (AVRO-638), a RecordReader for
> SequenceFiles could be added that uses Avro's reflect representation.
> AvroOutputFormat could also be changed to accept such reflected data.