[ https://issues.apache.org/jira/browse/HADOOP-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-2603:
----------------------------------

    Attachment: 2603-1.patch

bq. Unless one knows the Writable serialization, one cannot use this, right ?

Pretty much. There are getKeyClassName and getValueClassName methods on SequenceFileAsBinaryRecordReader that return the names of the key and value classes. If the reader doesn't have, or doesn't care about, the classes associated with the bytes they're reading from a SequenceFile, then this should permit them to read records without interpreting them through Writables or loading the key/value classes. Sampling records is a good example.

This updated patch effects this (with changes to SequenceFile.Reader that defer key/value classloading until required) and adds documentation missing from the former patch.

> SequenceFileAsBinaryInputFormat
> -------------------------------
>
>                 Key: HADOOP-2603
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2603
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>             Fix For: 0.16.0
>
>         Attachments: 2603-0.patch, 2603-1.patch
>
>
> Add an InputFormat to read the raw bytes as keys, values from a SequenceFile

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
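
For readers unfamiliar with the idea, the deferred classloading mentioned in the comment can be sketched roughly as follows. This is not the code in 2603-1.patch: LazyRecordHeader and all of its members are hypothetical names chosen for illustration, and the sketch only assumes that class names are kept as strings and resolved with Class.forName on first request.

```java
// Illustrative sketch only: LazyRecordHeader is a hypothetical class,
// not part of Hadoop and not the code in 2603-1.patch.
final class LazyRecordHeader {
    private final String keyClassName;
    private final String valueClassName;
    private Class<?> keyClass;    // stays null until getKeyClass() is called
    private Class<?> valueClass;  // stays null until getValueClass() is called

    LazyRecordHeader(String keyClassName, String valueClassName) {
        this.keyClassName = keyClassName;
        this.valueClassName = valueClassName;
    }

    // Returning the names never touches the classloader.
    String getKeyClassName() { return keyClassName; }
    String getValueClassName() { return valueClassName; }

    // The classes are loaded only when a caller actually asks for them.
    synchronized Class<?> getKeyClass() {
        if (keyClass == null) {
            keyClass = resolve(keyClassName);
        }
        return keyClass;
    }

    synchronized Class<?> getValueClass() {
        if (valueClass == null) {
            valueClass = resolve(valueClassName);
        }
        return valueClass;
    }

    private static Class<?> resolve(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class not on classpath: " + name, e);
        }
    }

    public static void main(String[] args) {
        // The value class below does not exist, yet the names remain readable.
        LazyRecordHeader header =
            new LazyRecordHeader("java.lang.String", "example.MissingValue");
        System.out.println(header.getKeyClassName());
        System.out.println(header.getValueClassName());
        System.out.println(header.getKeyClass());
    }
}
```

In this sketch a caller that only samples raw bytes would use the name getters and never trigger a class lookup, so a missing key or value class becomes an error only when the class itself is requested.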