[jira] Updated: (HADOOP-2603) SequenceFileAsBinaryInputFormat

Chris Douglas (JIRA) Mon, 14 Jan 2008 16:24:58 -0800

     [ 
https://issues.apache.org/jira/browse/HADOOP-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Chris Douglas updated HADOOP-2603:
----------------------------------

    Attachment: 2603-1.patch

bq. Unless one knows the Writable serialization, one cannot use this, right ?

Pretty much. There are getKeyClassName and getValueClassName methods on 
SequenceFileAsBinaryRecordReader that return the names of the key and value 
classes. If a reader doesn't have, or doesn't care about, the classes 
associated with the bytes it is reading from a SequenceFile, then this should 
permit it to read records without interpreting them through Writables or 
loading the key/value classes. Sampling records is a good example.

This updated patch implements this (with changes to SequenceFile.Reader that 
defer key/value class loading until required) and adds the documentation 
missing from the previous patch.
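
To illustrate the idea of deferring class loading, here is a minimal, 
self-contained sketch. It is not the code in the attached patch: the class 
RawRecordHeader, its isKeyClassLoaded method, and the class name 
com.example.MissingValue are hypothetical and exist only for this example. 
Only the getKeyClassName/getValueClassName naming follows the methods 
mentioned above.

```java
// Hypothetical sketch of deferred key/value class loading.
// Not taken from the patch; names other than getKeyClassName and
// getValueClassName are invented for illustration.
final class RawRecordHeader {
    private final String keyClassName;
    private final String valueClassName;
    private Class<?> keyClass;    // stays null until first requested
    private Class<?> valueClass;  // stays null until first requested

    RawRecordHeader(String keyClassName, String valueClassName) {
        this.keyClassName = keyClassName;
        this.valueClassName = valueClassName;
    }

    // The names are always available and never trigger class loading.
    String getKeyClassName() { return keyClassName; }
    String getValueClassName() { return valueClassName; }

    boolean isKeyClassLoaded() { return keyClass != null; }

    // The class is resolved only when a caller actually needs it.
    synchronized Class<?> getKeyClass() {
        if (keyClass == null) {
            keyClass = load(keyClassName);
        }
        return keyClass;
    }

    synchronized Class<?> getValueClass() {
        if (valueClass == null) {
            valueClass = load(valueClassName);
        }
        return valueClass;
    }

    private static Class<?> load(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            // Wrapped so callers that never ask for the class never see it.
            throw new IllegalStateException("Cannot load " + name, e);
        }
    }
}

final class RawRecordHeaderDemo {
    public static void main(String[] args) {
        RawRecordHeader header =
            new RawRecordHeader("java.lang.String", "com.example.MissingValue");
        // Reading the names works even though the value class is absent.
        System.out.println(header.getKeyClassName());
        System.out.println(header.getValueClassName());
        // Only this call resolves a class.
        System.out.println(header.getKeyClass());
    }
}
```

A caller that only samples raw bytes would use the name accessors and never 
call getKeyClass or getValueClass, so a missing class on the classpath would 
not matter to it.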

> SequenceFileAsBinaryInputFormat
> -------------------------------
>
>                 Key: HADOOP-2603
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2603
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>             Fix For: 0.16.0
>
>         Attachments: 2603-0.patch, 2603-1.patch
>
>
> Add an InputFormat to read the raw bytes as keys, values from a SequenceFile

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
