[ 
https://issues.apache.org/jira/browse/HDFS-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068614#comment-13068614
 ] 

Todd Lipcon commented on HDFS-2060:
-----------------------------------

We had a bit of discussion about this at the contributors meeting a few weeks 
ago (the week of the summit). My takeaways from that meeting were:

- Several people expressed an opinion that it would be nicer to not have 
protobuf-specific code in any HDFS classes. Sidd described the approach used in 
MR2. If I understood him correctly, it uses a class structure like:

{code}
interface FooWireType {
  long getBlah();
  void setBlah(long x);
  ... getters and setters ...
  ... serialization/deserialization stuff?...
}

class FooWireTypeProtoImpl implements FooWireType {
  // wraps FooWireProto, which is the generated class
}

interface WireTypeFactory {
  FooWireType createFooWireType();
  BarWireType createBarWireType();
}

class WireTypeProtoFactory implements WireTypeFactory {
  // returns *ProtoImpl implementations
}
{code}

The upside of this approach is that it would be possible to switch
serialization mechanisms (e.g. to Avro or Thrift) without changing any of the
code in the DFS layer -- we would just need to implement a different
WireTypeFactory. The downside is that it requires a bunch of boilerplate
interfaces and classes to be written and maintained by hand. It would be
possible to generate them via code-gen, but no one has a working code
generator at this point.

- I argued that, while the above is nicer, it would be more expedient in the 
short term to just implement this based on protobufs. I already summarized my 
reasoning [in this 
comment|https://issues.apache.org/jira/browse/HDFS-2058?focusedCommentId=13047289&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13047289].
 The one-sentence version is that we need to move forward ASAP on this, and 
having something that works now is better than taking months to do something 
slightly more general.
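
To make the trade-off in the first point concrete, here is a rough,
self-contained sketch of what the wrapper/factory structure might look like
for a single type. It is only an illustration, not the MR2 code: the methods
toBytes() and readFrom(), the FakeFooWireProto class (a hand-written stand-in
for a protobuf-generated class), and the WireTypeDemo driver are all
hypothetical names, the factory method is called createFooWireType() for
symmetry with createBarWireType(), and the 8-byte encoding is a placeholder
rather than the protobuf wire format.

```java
import java.nio.ByteBuffer;

// Serialization-neutral view of a wire type; DFS-layer code sees only this.
interface FooWireType {
  long getBlah();
  void setBlah(long x);
  byte[] toBytes();            // hypothetical serialization hook
  void readFrom(byte[] data);  // hypothetical deserialization hook
}

// Hand-written stand-in for the generated FooWireProto class, so the sketch
// has no external dependencies. Real generated code looks different.
final class FakeFooWireProto {
  long blah;
}

// Wraps the (stand-in) generated class behind the neutral interface.
class FooWireTypeProtoImpl implements FooWireType {
  private final FakeFooWireProto proto = new FakeFooWireProto();

  public long getBlah() { return proto.blah; }
  public void setBlah(long x) { proto.blah = x; }

  public byte[] toBytes() {
    // Placeholder encoding: one big-endian long, not the protobuf format.
    return ByteBuffer.allocate(Long.BYTES).putLong(proto.blah).array();
  }

  public void readFrom(byte[] data) {
    proto.blah = ByteBuffer.wrap(data).getLong();
  }
}

interface WireTypeFactory {
  FooWireType createFooWireType();
}

// Returns the *ProtoImpl implementations.
class WireTypeProtoFactory implements WireTypeFactory {
  public FooWireType createFooWireType() { return new FooWireTypeProtoImpl(); }
}

class WireTypeDemo {
  // Stands in for DFS-layer code: it depends only on the two interfaces, so
  // passing a different WireTypeFactory would switch the serialization.
  static long roundTrip(WireTypeFactory factory, long value) {
    FooWireType request = factory.createFooWireType();
    request.setBlah(value);
    byte[] onTheWire = request.toBytes();

    FooWireType received = factory.createFooWireType();
    received.readFrom(onTheWire);
    return received.getBlah();
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(new WireTypeProtoFactory(), 42L));
  }
}
```

Even for one type with one field, this takes an interface, a wrapper class,
and a factory method on top of the generated class, which is the boilerplate
cost mentioned above.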


So, I would like to propose moving forward with the approach I outlined in this 
JIRA and the demonstration patch. I can commit time to doing this. If others 
find the approach unsatisfactory and can commit time to doing the more general 
mechanism on trunk in the short term, that would be great, but I don't want to 
put off client compatibility much longer. I also don't think we should move 
forward with the general mechanism until we have a reasonable code-gen 
infrastructure ready -- it's just too much boilerplate to write and maintain.

> DFS client RPCs using protobufs
> -------------------------------
>
>                 Key: HDFS-2060
>                 URL: https://issues.apache.org/jira/browse/HDFS-2060
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-2060-getblocklocations.txt
>
>
> The most important place for wire-compatibility in DFS is between clients and 
> the cluster, since lockstep upgrade is very difficult and a single client may 
> want to talk to multiple server versions. So, I'd like to focus this JIRA on 
> making the RPCs between the DFS client and the NN/DNs wire-compatible using 
> protocol buffer based serialization.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
