[ https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966667#action_12966667 ]
Doug Cutting commented on HADOOP-6685:
--------------------------------------

> We don't have one right now. We have XML and JSON. Neither is user-friendly.

We don't currently use JSON for configuration data.

Today we use Map<String,String> as the configuration data model. This is usually serialized as XML and sometimes in other forms (e.g., inside a SequenceFile). The simplicity of this model permits differing serializations without significant loss of transparency or interoperability. This model interoperates well with Java properties, including system properties, with environment variables, etc. Adding a prefix to keys has been demonstrated to be an effective, if inelegant, way to implement nesting in this model. This model does not easily map to objects, nor does it provide any type support.

If we wish to use a more complex data model, one that is nestable, more strongly typed, and easily mapped to objects, then a standard serialization, like JSON or YAML, is a good way to still ensure transparency and interoperability.

YAML could work well as a data model. Nesting YAML requires adjusting indentation, while JSON permits simple string appends to nest. But if a Java API like YamlBeans is used, then indentation would be handled automatically.

If we can read/write YAML, what reason is there to support arbitrary binary configuration data?
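To make the key-prefix nesting mentioned above concrete, here is a minimal sketch in plain Java. It is not Hadoop code: the class `PrefixedConfig`, its methods `nest` and `extract`, the `.` separator, and the sample keys and values are all hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of Hadoop): nesting by key prefix in a flat Map<String,String>. */
final class PrefixedConfig {
    private PrefixedConfig() {}

    /** Copies every entry of child into parent under the key "prefix.key". */
    static void nest(Map<String, String> parent, String prefix, Map<String, String> child) {
        for (Map.Entry<String, String> e : child.entrySet()) {
            parent.put(prefix + "." + e.getKey(), e.getValue());
        }
    }

    /** Recovers the nested map: selects keys starting with "prefix." and strips that prefix. */
    static Map<String, String> extract(Map<String, String> parent, String prefix) {
        String p = prefix + ".";
        Map<String, String> child = new TreeMap<>();
        for (Map.Entry<String, String> e : parent.entrySet()) {
            if (e.getKey().startsWith(p)) {
                child.put(e.getKey().substring(p.length()), e.getValue());
            }
        }
        return child;
    }

    public static void main(String[] args) {
        // Settings private to one serialization (made-up keys and values).
        Map<String, String> serializer = new TreeMap<>();
        serializer.put("class", "example.MySerialization");
        serializer.put("schema", "string");

        // The enclosing flat configuration; the nested settings live under a prefix.
        Map<String, String> job = new TreeMap<>();
        job.put("job.name", "demo");
        nest(job, "map.output.key", serializer);

        System.out.println(job);
        System.out.println(extract(job, "map.output.key"));
    }
}
```

The result stays a flat map of strings, so it can still be written as XML or as Java properties. The values remain untyped strings, which is the limitation noted above.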
> Change the generic serialization framework API to use serialization-specific
> bytes instead of Map<String,String> for configuration
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6685
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.22.0
>
>         Attachments: libthrift.jar, serial.patch, serial4.patch,
>                      serial6.patch, serial7.patch, SerializationAtSummit.pdf
>
>
> Currently, the generic serialization framework uses Map<String,String> for
> the serialization specific configuration. Since this data is really internal
> to the specific serialization, I think we should change it to be an opaque
> binary blob. This will simplify the interface for defining specific
> serializations for different contexts (MAPREDUCE-1462). It will also move us
> toward having serialized objects for Mappers, Reducers, etc (MAPREDUCE-1183).