date:20180103

Re: [protobuf] Variable-Width Integer Encoding

2018-01-03 Thread Ilia Mirkin

I doubt you're going to get a nice clean answer. Chances are it's
"whatever Sanjay was thinking at the time" which led to the current
encoding, maintained throughout the proto versions for backwards
compatibility with existing data. While APIs have changed over time,
the wire encoding has remained extremely stable.

While we now live in the future, and storage/memory/bandwidth are free
and infinite, and all CPUs are 64-bit, that was not always the case.
Your encoding would not be a clear win for smaller values, and would
be an obvious waste of 4 bits when encoding int32 values.
Additionally, your encoding would not be cleanly extendable to 128-bit
integers in a backwards-compatible way.

The neat thing about the current encoding is that the width of the
integer doesn't really matter -- it could be a 4096-bit integer for
all you know, you just keep reading bytes until you hit one without a
high bit set. Which means it's easy to adjust protos as values change
in allowed ranges, esp between 32 and 64 bits.
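
As a rough illustration of the "keep reading bytes until you hit one
without a high bit set" loop, here is a minimal sketch of a decoder. The
function name DecodeVarint and its signature are made up for this example
and are not the protobuf library's API. It assumes the usual layout of 7
payload bits per byte, least-significant group first.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of libprotobuf): decodes one base-128
// varint from `buf`, starting at `*pos`.
// On success, stores the value in `*value`, advances `*pos` past the
// varint, and returns true. Returns false if the buffer ends before a
// terminating byte is seen, or if the varint runs past 10 bytes (the most
// a 64-bit value can need).
bool DecodeVarint(const std::vector<std::uint8_t>& buf, std::size_t* pos,
                  std::uint64_t* value) {
  std::uint64_t result = 0;
  std::size_t i = *pos;
  for (int shift = 0; shift < 70 && i < buf.size(); shift += 7) {
    const std::uint8_t byte = buf[i++];
    // The low 7 bits carry payload; groups arrive least-significant first.
    result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    // A clear high bit marks the last byte of this integer.
    if ((byte & 0x80) == 0) {
      *pos = i;
      *value = result;
      return true;
    }
  }
  return false;
}
```

The loop never asks whether the field was declared as 32-bit or 64-bit,
which is why a field can be widened later without changing the bytes that
are already on disk.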

But that's just a post-facto justification. It's unlikely that the
original rationale from 15+ years ago exists anywhere.

  -ilia

On Tue, Jan 2, 2018 at 3:54 PM,   wrote:
> What is the rationale behind the current variable-width integer encoding?
>
> As I understand it, an integer is terminated by a byte whose
> most-significant bit is equal to zero.  Thus, bytes must be read one at a 
> time, and this condition must be checked after reading each one to determine 
> whether to read another.  Why was this encoding chosen over a variable-width 
> encoding that would require at most two reads -- that is, an encoding that 
> specifies the number of subsequent bytes to read in the first byte?
>
> No, I don't mean for the first byte's value to be the length of the rest of 
> the integer.  Rather, the number of leading ones in the first byte could be 
> the number of following bytes.  This would still allow 7 bits of a value to 
> be stored per byte, with the added bonus of a full 64-bit value being encoded 
> in 9 bytes instead of 10.
>
> Examples:
>
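> 0 leading ones followed by a terminating zero and then 7 bits:
>
> 0b0xxxxxxx
>
> 1 leading one followed by a terminating zero, then 6 bits, and then 1 byte:
>
> 0b10xxxxxx xxxxxxxx
>
> 7 leading ones followed by a terminating zero and then 7 bytes:
>
> 0b11111110 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
>
> 8 leading ones followed by 8 bytes:
>
> 0b11111111 xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx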
>
> So, such an encoding is clearly possible.  Why does Protocol Buffers use 
> something different?  Is this to provide some level of protection against 
> dropped bytes?  Has all of the data already been read into a buffer by the 
> time that it is to be decoded, and so reducing the number of reads does not 
> provide much of a speed boost?
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Protocol Buffers" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to protobuf+unsubscr...@googlegroups.com.
> To post to this group, send email to protobuf@googlegroups.com.
> Visit this group at https://groups.google.com/group/protobuf.
> For more options, visit https://groups.google.com/d/optout.

[protobuf] I want to parse binary string into message c++

2018-01-03 Thread 정등혁

#include <fstream>
#include <iostream>
#include <string>
#include "EncodeMessage.pb.h"
using namespace std;

string HexString2BinaryString(string sHex)
{
    string sReturn = "";
    int sLen = sHex.length();
    for (int i = 0; i < sLen; ++i)
    {
        // Map each lowercase hex digit to its 4-bit pattern.
        switch (sHex[i])
        {
        case '0': sReturn.append("0000"); break;
        case '1': sReturn.append("0001"); break;
        case '2': sReturn.append("0010"); break;
        case '3': sReturn.append("0011"); break;
        case '4': sReturn.append("0100"); break;
        case '5': sReturn.append("0101"); break;
        case '6': sReturn.append("0110"); break;
        case '7': sReturn.append("0111"); break;
        case '8': sReturn.append("1000"); break;
        case '9': sReturn.append("1001"); break;
        case 'a': sReturn.append("1010"); break;
        case 'b': sReturn.append("1011"); break;
        case 'c': sReturn.append("1100"); break;
        case 'd': sReturn.append("1101"); break;
        case 'e': sReturn.append("1110"); break;
        case 'f': sReturn.append("1111"); break;
        }
    }

    // Strip leading '0' characters from the result.
    int rLen = sReturn.length();
    int i;
    for (i = 0; i < rLen; i++)
    {
        if (sReturn[i] != '0')
        {
            break;
        }
    }
    sReturn.replace(0, i, "");

    return sReturn;
}

// Main function:  Reads the entire address book from a file and prints all
//   the information inside.
int main(int argc, char* argv[]) {
    // Verify that the version of the library that we linked against is
    // compatible with the version of the headers we compiled against.
    GOOGLE_PROTOBUF_VERIFY_VERSION;

    // Read a hex string from stdin -> convert it to a binary string
    string hexStr;
    cin >> hexStr;

    string binaryStr = HexString2BinaryString(hexStr);
    //cout << binaryStr << endl;

    machineInfo::msgOption option;

    {
        // ** I want to use the 'binaryStr' variable to parse into the
        // 'option' variable. Can you help me out?

        //fstream input("output.bin", ios::in | ios::binary);
        //if (!option.ParseFromIstream(&input)) {
        //    cerr << "Failed to parse address book." << endl;
        //    return -1;
        //}
    }
    // Optional:  Delete all global objects allocated by libprotobuf.
    google::protobuf::ShutdownProtobufLibrary();

    return 0;
}

I want to parse the binary string into the message 'option'.
Can I build the message without going through the "output.bin" file, using
the 'binary string' I made in C++ instead?
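
One possible direction, sketched under an assumption: that the hex input
holds the serialized message bytes, two hex digits per byte. In that case
the parser needs the raw bytes, not a string of '0' and '1' characters.
The helper names HexDigitValue and HexToBytes below are made up for this
example and are not part of protobuf.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the value of one hex digit, or -1 if `c`
// is not a hex digit. Accepts both lowercase and uppercase letters.
int HexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical helper: converts hex text such as "089601" into raw bytes
// held in a std::string. Returns false, leaving `*out` untouched, if the
// input has odd length or contains a character that is not a hex digit.
bool HexToBytes(const std::string& hex, std::string* out) {
  if (hex.size() % 2 != 0) return false;
  std::string bytes;
  bytes.reserve(hex.size() / 2);
  for (std::size_t i = 0; i < hex.size(); i += 2) {
    const int hi = HexDigitValue(hex[i]);
    const int lo = HexDigitValue(hex[i + 1]);
    if (hi < 0 || lo < 0) return false;
    // Combine two 4-bit digits into one byte.
    bytes.push_back(static_cast<char>((hi << 4) | lo));
  }
  *out = bytes;
  return true;
}
```

With the generated machineInfo::msgOption class from the post, the
resulting string could then be handed to option.ParseFromString(bytes),
which parses serialized bytes held in a std::string and returns false on
failure, so no intermediate file is needed.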

