On 04/02/2013 03:21 AM, Jacob Carlborg wrote:
On 2013-04-01 19:13, Jesse Phillips wrote:

Let me see if I can describe this.

PB encodes to binary by type. However, it also has a schema in a
.proto file. My first concern is that this file provides the ID to use
for each field; the IDs are arbitrary, but they must match what is specified.

The second thing I'm concerned with is the option to pack repeated fields.
I'm not sure of the specifics of this encoding, but I imagine some compression.

This is why I think I'd have to implement my own Serializer to be able
to support PB, but I also believe we could have a binary format based on
PB (it might be possible to create a schema from Orange-generated data,
but it would be hard to generate data for a specific schema).

As I understand it there's a "schema definition", that is, the .proto
file. You compile this schema to produce D/C++/Java/whatever code that
contains structs/classes with methods/fields that match this schema.

If you need to change the schema, besides adding optional fields, you
need to recompile the schema to produce new code, right?

If you have a D class/struct that matches this schema (regardless of
whether it's auto-generated from the schema or manually created) with
actual instance variables for the fields, I think it would be possible to
(de)serialize into the binary PB format using Orange.

Then there's the issue of the options supported by PB like optional
fields and pack repeated fields (which I don't know what it means).

It seems PB is dependent on the order of the fields so that won't be a
problem. Just disregard the "key" that is passed to the archive and
deserialize the next type that is expected. Maybe you could use the
schema to do some extra validations.

Although, I don't know how PB handles multiple references to the same
value.

Looking at this:

https://developers.google.com/protocol-buffers/docs/overview

Below "Why not just use XML?", they mention both a text format (not to
be confused with the schema, .proto) and a binary format. Although the
text format seems to be mostly for debugging.


Unfortunately, only partially correct. Optional isn't an "option"; it's a way of saying that a field may be specified 0 or 1 times. If two messages with the same ID are read and the ID is considered optional in the schema, then they are merged.

Packed IS an "option", which can only be done to primitives. It changes serialization from:
> return raw.map!(a=>(MsgType!BufferType | (id << 3)).toVarint() ~ a.writeProto!BufferType())().reduce!((a,b)=>a~b)();
to
> auto msg = raw.map!(writeProto!BufferType)().reduce!((a,b)=>a~b)();
> return (2 | (id << 3)).toVarint() ~ msg.length.toVarint() ~ msg;

(Actual snippets from my partially-complete protocol buffer library)
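
To make the difference concrete, here is a minimal, self-contained sketch of the two encodings for a repeated varint field. It is not taken from my library: toVarint, encodeUnpacked and encodePacked are names made up for illustration, and it assumes unsigned integer elements only (wire type 0 for a single varint, wire type 2 for a length-delimited block).

```d
import std.stdio : writefln;

// Encode an unsigned value as a base-128 varint, least significant group first.
ubyte[] toVarint(ulong value)
{
    ubyte[] result;
    while (value >= 0x80)
    {
        result ~= cast(ubyte)((value & 0x7F) | 0x80); // set the continuation bit
        value >>= 7;
    }
    result ~= cast(ubyte) value;
    return result;
}

// Unpacked: every element is preceded by its own key, (id << 3) | wire type 0.
ubyte[] encodeUnpacked(uint id, const uint[] values)
{
    ubyte[] result;
    foreach (v; values)
        result ~= toVarint((id << 3) | 0) ~ toVarint(v);
    return result;
}

// Packed: a single key with wire type 2, then the payload length in bytes,
// then the elements back to back with no per-element key.
ubyte[] encodePacked(uint id, const uint[] values)
{
    ubyte[] payload;
    foreach (v; values)
        payload ~= toVarint(v);
    return toVarint((id << 3) | 2) ~ toVarint(payload.length) ~ payload;
}

void main()
{
    uint[] values = [3, 270, 86942];
    writefln("%(%02X %)", encodeUnpacked(4, values)); // 20 03 20 8E 02 20 9E A7 05
    writefln("%(%02X %)", encodePacked(4, values));   // 22 06 03 8E 02 9E A7 05
}
```

So packing is not compression as such; it only drops the repeated per-element key and replaces it with one key plus a length prefix.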

If you had a struct that matches that schema (PB messages have value semantics) then yes, in theory you could do something to serialize the struct based on the schema, but you'd have to maintain both separately.

PB is NOT dependent on the order of the fields during serialization; they can be sent/received in any order. You could use the schema like you mentioned above to tie member names to ids, though.
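
A rough sketch of why the order doesn't matter: each field on the wire starts with a key holding the field id and wire type, so a decoder can dispatch on the id rather than on position. The Point struct, its ids (1 and 2), and the readVarint/decodePoint helpers are hypothetical and only handle varint fields; a real decoder would also have to skip unknown fields according to their wire type.

```d
import std.stdio : writeln;

// Hypothetical message; in practice the ids would come from the .proto schema.
struct Point
{
    ulong x; // field id 1
    ulong y; // field id 2
}

// Read one varint from the front of data and advance the slice past it.
ulong readVarint(ref const(ubyte)[] data)
{
    ulong value;
    uint shift;
    while (true)
    {
        immutable b = data[0];
        data = data[1 .. $];
        value |= cast(ulong)(b & 0x7F) << shift;
        if ((b & 0x80) == 0)
            return value;
        shift += 7;
    }
}

// Decode by dispatching on the field id in each key, so wire order is irrelevant.
Point decodePoint(const(ubyte)[] data)
{
    Point p;
    while (data.length)
    {
        immutable key = readVarint(data);
        immutable id = key >> 3;
        immutable wireType = key & 7;
        assert(wireType == 0, "this sketch handles varint fields only");
        immutable value = readVarint(data);
        switch (id)
        {
            case 1: p.x = value; break;
            case 2: p.y = value; break;
            default: break; // ids not in the struct are ignored
        }
    }
    return p;
}

void main()
{
    // x = 150 and y = 7, written in both orders.
    immutable(ubyte)[] xFirst = [0x08, 0x96, 0x01, 0x10, 0x07];
    immutable(ubyte)[] yFirst = [0x10, 0x07, 0x08, 0x96, 0x01];
    assert(decodePoint(xFirst) == decodePoint(yFirst));
    writeln(decodePoint(xFirst));
}
```

The switch plays the role of the schema here: it is the only place where ids are tied to member names.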

PB uses value semantics, so multiple references to the same thing aren't really an issue that is covered.

I hadn't actually noticed that TextFormat stuff before...interesting. I might take a look at that later when I have time.

-Matt Soucy
