[ 
https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-6685:
----------------------------------

    Attachment: serial.patch

Ok, here is a preliminary patch. 

It includes support for Avro, Thrift, ProtocolBuffers, Writables, Java 
serialization, and an adaptor for the old-style serializations. One of the 
features of the Avro serialization is that the kind ("reflection", "specific", 
"generic") is a parameter that can be changed between writing and reading the 
file.

All of the types can be put into SequenceFiles, MapFiles, BloomFilterMapFiles, 
SetFiles, and ArrayFiles.

In a separate issue, I'll upload the OFile wrapper that goes on top of TFile to 
allow all of the types into TFiles as well.

The patch creates a new package o.a.h.io.serial that defines the new 
interfaces. The new serializations save their metadata in a framework-specific 
format. To make the format extensible, I've used protocol buffers to encode 
this information. This will allow us to make arbitrary compatible extensions 
later.
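
As a rough illustration of metadata that stays private to each serialization, 
here is a minimal sketch. It is not code from serial.patch: the interface 
SketchSerialization, the class KindSerialization, and the header layout are 
all hypothetical, and a plain length-prefixed byte array stands in for the 
protocol buffer encoding so that the example needs no external libraries.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; none of these names are taken from serial.patch.
final class OpaqueMetadataSketch {

  /** A serialization that owns the encoding of its own metadata. */
  interface SketchSerialization {
    String getName();
    byte[] serializeMetadata();
    void deserializeMetadata(byte[] metadata);
  }

  /** Example serialization whose only metadata is a "kind" string. */
  static final class KindSerialization implements SketchSerialization {
    private String kind = "specific";

    public String getName() { return "kind-sketch"; }
    public String getKind() { return kind; }
    public void setKind(String kind) { this.kind = kind; }

    public byte[] serializeMetadata() {
      return kind.getBytes(StandardCharsets.UTF_8);
    }

    public void deserializeMetadata(byte[] metadata) {
      kind = new String(metadata, StandardCharsets.UTF_8);
    }
  }

  /** The container stores the name and the blob; it never interprets the blob. */
  static byte[] writeHeader(SketchSerialization s) {
    byte[] name = s.getName().getBytes(StandardCharsets.UTF_8);
    byte[] meta = s.serializeMetadata();
    ByteBuffer buf = ByteBuffer.allocate(4 + name.length + 4 + meta.length);
    buf.putInt(name.length).put(name).putInt(meta.length).put(meta);
    return buf.array();
  }

  /** Reads the header back and hands the opaque blob to the serialization. */
  static void readHeader(byte[] header, SketchSerialization s) {
    ByteBuffer buf = ByteBuffer.wrap(header);
    byte[] name = new byte[buf.getInt()];
    buf.get(name);
    String stored = new String(name, StandardCharsets.UTF_8);
    if (!stored.equals(s.getName())) {
      throw new IllegalArgumentException("Unexpected serialization: " + stored);
    }
    byte[] meta = new byte[buf.getInt()];
    buf.get(meta);
    s.deserializeMetadata(meta);
  }
}
```

In this sketch a writer would call setKind("generic") and writeHeader, and a 
reader would pass the resulting bytes to readHeader on a fresh 
KindSerialization to recover the same kind. The container code only sees a 
name and a byte array, so a serialization can change what it stores in the 
blob without any change to the container.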

> Change the generic serialization framework API to use serialization-specific 
> bytes instead of Map<String,String> for configuration
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6685
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: serial.patch
>
>
> Currently, the generic serialization framework uses Map<String,String> for 
> the serialization-specific configuration. Since this data is really internal 
> to the specific serialization, I think we should change it to be an opaque 
> binary blob. This will simplify the interface for defining specific 
> serializations for different contexts (MAPREDUCE-1462). It will also move us 
> toward having serialized objects for Mappers, Reducers, etc. (MAPREDUCE-1183).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
