[ https://issues.apache.org/jira/browse/HADOOP-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-2603:
----------------------------------
Attachment: 2603-1.patch
bq. Unless one knows the Writable serialization, one cannot use this, right ?
Pretty much. SequenceFileAsBinaryRecordReader has getKeyClassName and
getValueClassName methods that return the names of the key and value classes.
If readers don't have, or don't care about, the classes associated with the
bytes they're reading from a SequenceFile, this should let them read records
without interpreting them through Writables or loading the key/value classes.
Sampling records is a good example.
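To make the sampling case concrete, here is a minimal sketch of picking out every Nth key as opaque bytes without ever decoding it. It is not code from the patch: the RawRecordSampler class, its frame and sampleKeys methods, and the [int keyLength][key][int valueLength][value] framing are all hypothetical and are NOT the SequenceFile on-disk format.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the framing is made up and is not
// the SequenceFile format.
class RawRecordSampler {

    /** Frames one record as [int keyLength][key][int valueLength][value]. */
    static byte[] frame(byte[] key, byte[] value) {
        return ByteBuffer.allocate(8 + key.length + value.length)
                .putInt(key.length).put(key)
                .putInt(value.length).put(value)
                .array();
    }

    /**
     * Returns the raw key bytes of every step-th record (starting with the
     * first). Keys are copied as bytes and values are skipped, so no
     * key or value class is needed.
     */
    static List<byte[]> sampleKeys(ByteBuffer data, int step) {
        List<byte[]> sample = new ArrayList<>();
        for (int i = 0; data.hasRemaining(); i++) {
            byte[] key = new byte[data.getInt()];
            data.get(key);
            int valueLength = data.getInt();
            data.position(data.position() + valueLength); // skip the value
            if (i % step == 0) {
                sample.add(key);
            }
        }
        return sample;
    }
}
```

The sampler only needs record boundaries, which is why a reader that hands back raw bytes is enough for this use.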
This updated patch makes that possible by changing SequenceFile.Reader to defer
key/value class loading until it is required, and adds the documentation
missing from the previous patch.
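The idea of deferring class loading can be sketched in isolation as follows. This is not the change made to SequenceFile.Reader: LazyClassRef and its methods are hypothetical names, and plain Class.forName stands in for whatever class resolution Hadoop actually uses.

```java
// Hypothetical illustration of deferred class loading; not the patch itself.
class LazyClassRef {
    private final String className;
    private Class<?> resolved; // stays null until someone asks for the class

    LazyClassRef(String className) {
        this.className = className;
    }

    /** The name is always available, even if the class is not on the classpath. */
    String getClassName() {
        return className;
    }

    /** Loads the class on first use and caches it for later calls. */
    synchronized Class<?> getResolvedClass() {
        if (resolved == null) {
            try {
                resolved = Class.forName(className);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("Cannot load " + className, e);
            }
        }
        return resolved;
    }

    synchronized boolean isLoaded() {
        return resolved != null;
    }
}
```

A caller that only wants the class name, or only the raw bytes, never triggers the load, so a missing key or value class is only a problem for callers that actually need to deserialize.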
> SequenceFileAsBinaryInputFormat
> -------------------------------
>
> Key: HADOOP-2603
> URL: https://issues.apache.org/jira/browse/HADOOP-2603
> Project: Hadoop
> Issue Type: New Feature
> Components: mapred
> Reporter: Chris Douglas
> Assignee: Chris Douglas
> Fix For: 0.16.0
>
> Attachments: 2603-0.patch, 2603-1.patch
>
>
> Add an InputFormat that reads keys and values from a SequenceFile as raw bytes