I posted this question on Stack Overflow yesterday but haven't had much 
response 
(http://stackoverflow.com/questions/27955519/can-field-tags-be-dropped-from-protobuf-thrift-messages), 
so I figured it would be better to ask in this group.

--

I understand that protobuf needs unique numerical field tags to provide 
version compatibility. It does this by serializing messages (roughly) in 
this fashion:

<tag1> <value1> ... <tagN> <valueN>

When deserializing, the parser reads the tag value, looks it up in the 
message schema, and knows which field to fill the value into. This way, as 
long as new fields are added with tag values that have not been used 
before, old and new messages stay compatible.
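
To make the <tag> <value> layout concrete, here is a minimal sketch in Python that handles varint (integer) fields only. It follows the documented protobuf wire format, where each key is (field_number << 3) | wire_type, but the function names are my own and this is not the real library:

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int):
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Emit <tag> <value> for an integer field (wire type 0 = varint)."""
    key = (field_number << 3) | 0
    return encode_varint(key) + encode_varint(value)


def decode_varint_fields(buf: bytes) -> dict:
    """Read <tag> <value> pairs back into {field_number: value}."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 0:
            raise ValueError("this sketch only handles varint fields")
        value, pos = decode_varint(buf, pos)
        fields[field_number] = value
    return fields


# Field 1 with value 150 encodes as the three bytes 08 96 01.
encoded = encode_varint_field(1, 150)
```

The first byte (08) is the tag for field 1. That per-field byte is the overhead discussed in this post.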

But I don't think this is a very good design:

   1. The tag value has to be encoded within the message, which adds 
      overhead.

      For example, when a client invokes an RPC method on a remote server 
      many times, the tag values in every request/response are the same. 
      It would be nicer to send <tag1> <value1> ... <tagN> <valueN> only 
      once, and after that send only <value1> ... <valueN>.

   2. When changing the type of a field, we also need to change the tag 
      value. Forgetting to do this will lead to bugs.

   3. Developers have to ensure tag values are unique. Usually people 
      track the last used tag id and increment it when adding new fields. 
      But when two people add fields in separate branches and then merge, 
      the conflict is hard to resolve.
   
I think a better design could be:

Create a compact schema for each message type, like this:

<field_name_1> <field_type_1> ... <field_name_N> <field_type_N> (sorted 
according to field_name)
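
As a rough illustration, the compact schema could be built like this in Python. The function name and the example field names and types are hypothetical. The only point is that sorting by field name makes the result independent of declaration order:

```python
def compact_schema(fields: dict) -> str:
    """Build '<name_1> <type_1> ... <name_N> <type_N>', sorted by field name."""
    return " ".join(f"{name} {ftype}" for name, ftype in sorted(fields.items()))


# Two declarations of the same (hypothetical) message in different orders.
a = compact_schema({"user_id": "int64", "email": "string"})
b = compact_schema({"email": "string", "user_id": "int64"})
```

Both a and b come out as the same string, so two branches that add the same fields in a different order still agree on the schema.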

To address issue 1, exchange the message schema before doing anything 
else. In the RPC example, the client sends its message schema before the 
first RPC; in every following RPC it sends only <value_1> ... <value_N>. 
The server already has the message schema when a request arrives, so it 
knows how to deserialize it.
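
A minimal sketch of this exchange in Python, assuming one schema per connection and using JSON only as a stand-in for a real binary encoding. The Sender and Receiver classes and the "S"/"V" frame markers are hypothetical names for illustration:

```python
import json


class Sender:
    """Sends the schema once, then values only, in sorted field-name order."""

    def __init__(self, schema: dict):
        self.schema = sorted(schema.items())
        self.schema_sent = False

    def encode(self, message: dict) -> list:
        frames = []
        if not self.schema_sent:
            # First call: prepend a schema frame.
            frames.append(b"S" + json.dumps(self.schema).encode())
            self.schema_sent = True
        values = [message[name] for name, _ in self.schema]
        frames.append(b"V" + json.dumps(values).encode())
        return frames


class Receiver:
    """Remembers the schema frame and uses it to label later value frames."""

    def __init__(self):
        self.field_names = None

    def decode(self, frame: bytes):
        kind, body = frame[:1], json.loads(frame[1:])
        if kind == b"S":
            self.field_names = [name for name, _ in body]
            return None
        if self.field_names is None:
            raise ValueError("value frame received before any schema frame")
        return dict(zip(self.field_names, body))


sender = Sender({"user_id": "int64", "email": "string"})
receiver = Receiver()
first = sender.encode({"user_id": 1, "email": "a@example.com"})
second = sender.encode({"user_id": 2, "email": "b@example.com"})
decoded = [receiver.decode(f) for f in first + second]
```

Here the first call produces two frames (schema, then values) and the second call produces one. Note that in this sketch a values-only frame cannot be decoded without the earlier schema frame, so the stream becomes stateful.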

To address issue 2, when a field's type is changed, the compact message 
schema changes too. Programs can then detect that the old and new schemas 
do not match and report an error.
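
One way the mismatch check might look, as a sketch in Python. The schemas are plain {field_name: field_type} dicts and the function names are made up for illustration:

```python
def find_type_conflicts(old_schema: dict, new_schema: dict) -> list:
    """Return the names of fields present in both schemas with different types."""
    shared = old_schema.keys() & new_schema.keys()
    return sorted(name for name in shared if old_schema[name] != new_schema[name])


def check_schemas(old_schema: dict, new_schema: dict) -> None:
    """Raise if any shared field changed type; added or removed fields pass."""
    conflicts = find_type_conflicts(old_schema, new_schema)
    if conflicts:
        raise TypeError("field type changed: " + ", ".join(conflicts))


old = {"user_id": "int32", "email": "string"}
new = {"user_id": "int64", "email": "string", "age": "int32"}
conflicts = find_type_conflicts(old, new)
```

In this example conflicts contains only "user_id". The new "age" field is not reported, because it exists in just one of the two schemas.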

To address issue 3, developers no longer need to take care of assigning 
unique tag values. They still need to take care of assigning unique field 
names, but this should be easier, and less likely to lead to merge 
conflicts.

Could this be a usable design? And what problems would it have?
