[ https://issues.apache.org/jira/browse/HADOOP-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559262#comment-13559262 ]
Binglin Chang commented on HADOOP-8990:
---------------------------------------

bq. Note that since rpc has pluggable rpc engine, the rpc request itself is serialized based on the pluggable rpc engine. For protobuf-rpc-engine the length is serialized as varint.

I don't understand why protobuf-rpc-engine has to be varint prefixed. Originally it was prefixed with a 4-byte int, HADOOP-8084 changed it to a varint prefix, and then at some later point RpcResponseWritable changed to using a varint while RpcRequestWritable kept using a 4-byte int. I think a 4-byte int is the better choice, because it is easy to handle in non-blocking code. For example, see Todd's ipc implementation: https://github.com/toddlipcon/libhrpc/blob/master/rpc_client.cpp

{code}
uint32_t length;
CodedInputStream cis(&rbuf_[rbuf_consumed_], rbuf_available());
bool success = cis.ReadVarint32(&length);
if (!success) {
  if (rbuf_available() >= kMaxVarint32Size) {
    // TODO: error handling
    LOG(FATAL) << "bad varint";
  }
  // Not enough bytes in buffer to read the varint; ask for at least one more byte
  EnsureAvailableLength(rbuf_available() + 1,
      boost::bind(&RpcClient::ReadResponseHeaderLengthCallback,
                  this, asio::placeholders::error));
  return;
}
{code}

A lot of retry logic is needed to read a varint in non-blocking code; it would be much easier, and the request and response formats would be unified, if the prefix were a fixed 4-byte int.

> Some minor issues in protobuf based ipc
> --------------------------------------
>
>                 Key: HADOOP-8990
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8990
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Binglin Chang
>            Priority: Minor
>
> 1. proto file naming
> RpcPayloadHeader.proto includes not only RpcPayLoadHeaderProto but also RpcResponseHeaderProto, which is unrelated to the file name. hadoop_rpc.proto includes only HadoopRpcRequestProto, and the file name "hadoop_rpc" is odd compared to the other .proto file names. How about merging those two files into HadoopRpc.proto?
> 2.
> proto class naming
> In the rpc request, callId is carried in RpcPayloadHeaderProto, but in the rpc response callId is carried in RpcResponseHeaderProto, and there is also HadoopRpcRequestProto; this is just too confusing.
> 3. The rpc system is not fully protobuf based; there are still some Writables:
> RpcRequestWritable and RpcResponseWritable, and the rpc response exception name and stack trace are plain strings.
> Also, RpcRequestWritable uses a protobuf-style varint32 prefix, but RpcResponseWritable uses an int32 prefix. Why this inconsistency?
> Currently an rpc request is split into length, PayloadHeader and Payload, and a response into RpcResponseHeader, response and error message.
> I think wrapping the request and the response into a single RequestProto and ResponseProto would be better, because that gives a formal, complete wire format definition; otherwise developers need to dig through the source code and hard-code the communication format.
> These issues make it very confusing and hard for developers to use these rpc interfaces.
> Some of these issues can be solved without breaking compatibility, but some can not; at the least we need to know what will change and what will stay stable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
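For the issue's suggestion of wrapping everything into a single RequestProto, a hypothetical merged definition might look like the following. All message and field names here are illustrative only, not the actual Hadoop wire format:

```protobuf
// Hypothetical wrapper: one top-level message per rpc request, so the entire
// wire format is stated in the .proto file instead of being implied by code.
message RpcRequestProto {
  optional uint32 callId = 1;       // illustrative; replaces the separate header's callId
  optional string methodName = 2;   // which rpc method to invoke
  optional string protocolName = 3; // illustrative protocol identifier
  optional bytes request = 4;       // serialized method-specific request body
}
```

A single length-prefixed message like this would give the "formal complete wire format definition" the issue asks for, since a developer could implement a client from the .proto file alone.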
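To illustrate the comment's point about non-blocking reads: with a fixed 4-byte prefix the reader always knows exactly how many bytes it is still waiting for, so none of the varint retry logic in the quoted snippet is needed. A minimal sketch; the buffer type and the helper name `TryReadFrameLength` are hypothetical, not from libhrpc or Hadoop:

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: parse a fixed 4-byte big-endian length prefix from the
// front of a receive buffer. If fewer than 4 bytes are buffered, the caller
// knows it needs exactly (4 - buf.size()) more bytes and can schedule a single
// read for that amount. No "is the varint complete yet?" probing loop.
std::optional<uint32_t> TryReadFrameLength(const std::vector<uint8_t>& buf) {
    if (buf.size() < 4) {
        return std::nullopt;  // caller schedules a read for the missing bytes
    }
    return (uint32_t(buf[0]) << 24) | (uint32_t(buf[1]) << 16) |
           (uint32_t(buf[2]) << 8)  |  uint32_t(buf[3]);
}
```

With a varint prefix, by contrast, the reader cannot know in advance whether 1, 2, 3, 4, or 5 bytes make up the length, which is exactly why the libhrpc code above has to re-enter the parse after every additional byte arrives.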