[ https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969031#action_12969031 ]
Allen Wittenauer commented on HADOOP-6685:
------------------------------------------

> works against integrating external configuration systems with existing
> components

I'm thinking of when we are past the limitations of the existing components. What if we don't pass files around for configuration information at all? Then does making sure that everything can be represented as a UTF-16 string make sense? I don't think it does.

> Do we have much binary configuration data?

Given that it is currently impossible, the answer is obviously no. But this seems like a major flaw of the existing system. Who are we to dictate what the user can/can't put in what is essentially a private part of the configuration name space? Hadoop as a framework shouldn't care what the representation of that value is if it doesn't have to read it.

If I want to build a mass documentation signing system and provide the binary representation of the CA cert as a configuration option to my serializer, why shouldn't I be able to do that? If I want to work in UTF-32 and pass information as a config option to my serializer, why shouldn't I be able to do that?

Now one could argue that I could base64 encode my data or do the wacky !!binary thing that YAML does (JSON doesn't support binary, so to me, that instantly eliminates it. Even crusty X.500 supports binary! ... and XML... well, you all know how I feel about it. *smile*). But why should I take a performance hit to support my use case?

I don't see the value in supporting the existing system when it has what I would say is a major flaw.
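
To make the base64 workaround concrete, here is a minimal sketch of what forcing binary data through a Map<String,String> looks like. It is an illustration only, not code from any of the attached patches: the class name, the helper methods, and the key "my.serializer.ca.cert" are all hypothetical, and the byte array is a stand-in for a real certificate. It uses only java.util.Base64 from the JDK (Java 8 or later).

```java
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names here are illustrative, not part of Hadoop.
final class BinaryConfigSketch {

    // Store raw bytes in a string-only config map by base64 encoding them.
    static Map<String, String> toStringConfig(String key, byte[] raw) {
        Map<String, String> conf = new HashMap<>();
        conf.put(key, Base64.getEncoder().encodeToString(raw));
        return conf;
    }

    // Recover the raw bytes; every reader pays for a decode pass.
    static byte[] fromStringConfig(Map<String, String> conf, String key) {
        return Base64.getDecoder().decode(conf.get(key));
    }

    public static void main(String[] args) {
        // Stand-in for the binary form of a CA cert (assumed size).
        byte[] cert = new byte[300];
        for (int i = 0; i < cert.length; i++) {
            cert[i] = (byte) i;
        }

        String key = "my.serializer.ca.cert"; // hypothetical key name
        Map<String, String> conf = toStringConfig(key, cert);

        // Base64 emits 4 characters for every 3 input bytes.
        System.out.println("raw bytes:    " + cert.length);
        System.out.println("base64 chars: " + conf.get(key).length());
        System.out.println("round trip:   "
                + Arrays.equals(cert, fromStringConfig(conf, key)));
    }
}
```

The encoded value is about a third larger than the raw bytes, and both the writer and every reader do an extra encode or decode pass. With an opaque byte[] handed straight to the serializer, neither cost would apply.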

> Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6685
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.22.0
>
>         Attachments: serial.patch, serial4.patch, serial6.patch, serial7.patch, serial9.patch, SerializationAtSummit.pdf
>
>
> Currently, the generic serialization framework uses Map<String,String> for
> the serialization specific configuration. Since this data is really internal
> to the specific serialization, I think we should change it to be an opaque
> binary blob. This will simplify the interface for defining specific
> serializations for different contexts (MAPREDUCE-1462). It will also move us
> toward having serialized objects for Mappers, Reducers, etc (MAPREDUCE-1183).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.