What is the rationale behind the current variable-width integer encoding?

As I understand it, an integer is terminated by a byte whose most-significant 
bit is equal to zero.  Thus, bytes must be read one at a time, and this 
condition must be checked after reading each one to determine whether to read 
another.  Why was this encoding chosen over a variable-width encoding that 
would require at most two reads -- that is, an encoding that specifies the 
number of subsequent bytes to read in the first byte?
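
To make sure I am describing the current scheme correctly, here is a minimal 
Python sketch of it as I understand it. The function names are my own and are 
not the library's API; I am assuming unsigned values, with the low-order 7-bit 
group written first.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 value bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("this sketch only handles unsigned values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # MSB set: more bytes follow
        else:
            out.append(low7)         # MSB clear: this is the last byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next_position). Note the check after every byte."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):        # terminating byte reached
            return result, pos
        shift += 7
```

The loop in decode_varint is the part I am asking about: the length is not 
known until the terminating byte has been seen.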

To be clear, I don't mean for the first byte's value to be the length of the 
rest of the integer.  Rather, the number of leading ones in the first byte 
could be the number of following bytes.  For encodings of up to 8 bytes this 
would still store 7 bits of the value per byte, with the bonus that a full 
64-bit value would be encoded in 9 bytes instead of 10.
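
The size arithmetic can be checked with a small sketch. Both function names 
are hypothetical, and I am assuming unsigned values of at most 64 bits.

```python
def varint_len(value: int) -> int:
    """Bytes used by the current encoding: one byte per 7 value bits."""
    return max(1, (value.bit_length() + 6) // 7)


def prefixed_len(value: int) -> int:
    """Bytes used by the proposed encoding.

    Same as above up to 8 bytes (56 value bits); anything larger takes
    the 9-byte form, whose first byte is all ones and carries no value bits.
    """
    return min(9, varint_len(value))


for v in (0, 127, 128, 2**56 - 1, 2**56, 2**63 - 1, 2**63, 2**64 - 1):
    print(v.bit_length(), varint_len(v), prefixed_len(v))
```

If I have this right, the two schemes differ in size only for values of 2**63 
and above, where the current encoding needs a tenth byte.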

Examples:

0 leading ones followed by a terminating zero and then 7 bits:

0b0.......

1 leading one followed by a terminating zero, then 6 bits, and then 1 byte:

0b10...... ........

7 leading ones followed by a terminating zero and then 7 bytes:

0b11111110 ........ ........ ........ ........ ........ ........ ........

8 leading ones followed by 8 bytes:

0b11111111 ........ ........ ........ ........ ........ ........ ........ ........
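
Here is a rough Python sketch of the layouts above, only to show that the 
decoder learns the full length from the first byte. The function names are 
made up, and the byte order is my own assumption (most-significant value bits 
first); nothing above depends on that choice.

```python
from typing import Tuple


def encode_prefixed(value: int) -> bytes:
    """Leading ones in the first byte = number of bytes that follow."""
    if not 0 <= value < 1 << 64:
        raise ValueError("this sketch only handles unsigned 64-bit values")
    for extra in range(8):
        if value < 1 << (7 * (extra + 1)):
            # `extra` ones, then a zero, then the top value bits.
            prefix = (0xFF << (8 - extra)) & 0xFF
            word = (prefix << (8 * extra)) | value
            return word.to_bytes(extra + 1, "big")
    # More than 56 value bits: all-ones first byte, then 8 full bytes.
    return b"\xff" + value.to_bytes(8, "big")


def decode_prefixed(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next_position) using one length check, no loop over data."""
    first = buf[pos]
    extra = 0
    while extra < 8 and first & (0x80 >> extra):  # count leading ones
        extra += 1
    # Value bits left in the first byte (none when extra is 7 or 8).
    top = first & (0x7F >> extra) if extra < 8 else 0
    tail = buf[pos + 1 : pos + 1 + extra]
    if len(tail) != extra:
        raise ValueError("truncated input")
    value = (top << (8 * extra)) | int.from_bytes(tail, "big")
    return value, pos + 1 + extra
```

The only loop in the decoder runs over the bits of the first byte; the rest of 
the value can then be fetched in a single read of known length.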

So, such an encoding is clearly possible.  Why does Protocol Buffers use 
something different?  Is this to provide some level of protection against 
dropped bytes?  Or has all of the data already been read into a buffer by the 
time it is decoded, so that reducing the number of reads would not provide 
much of a speed boost?
