Re: [protobuf] Storing protocol buffers: binary vs. text

Christopher Head Sun, 11 Dec 2011 11:48:09 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

If you're worried, just spec your format such that when you write a
binary-format file, you shove a magic unprintable signature on the
front. Now when you want to load data, try to parse it as binary first
(looking for the signature) and, if it doesn't appear, then try to
parse as text. Guaranteed no collisions.
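
For illustration, here is a minimal sketch of that signature check. The
4-byte MAGIC value and the names wrap_binary and sniff are made up for
this example and are not part of protobuf; the payload is assumed to be
an already-serialized message.

```python
# Hypothetical signature (an assumption for this sketch, not a protobuf
# convention). 0x89 has the high bit set and 0x00 is NUL, so a text file
# written for human reading should not begin with these bytes.
MAGIC = b"\x89PB\x00"


def wrap_binary(payload: bytes) -> bytes:
    """Prefix an already-serialized binary message with the signature."""
    return MAGIC + payload


def sniff(data: bytes):
    """Classify file contents by looking for the signature.

    Returns ("binary", payload_without_signature) when the signature is
    present, otherwise ("text", data) unchanged.
    """
    if data.startswith(MAGIC):
        return "binary", data[len(MAGIC):]
    return "text", data
```

The caller would hand the "binary" payload to the binary parser and the
"text" case to the TextFormat parser. An empty file falls through to
"text", which matches the "I don't care" case in the original question.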


Chris

On Sat, 10 Dec 2011 11:05:39 -0800 (PST)
Tom Swirly <tom.ritchf...@gmail.com> wrote:

> Hello, proto-people.  I suspect some of you know me already...
> 
> I'm just finishing a moderately large project that makes extensive
> use of protocol buffers as a storage format for data files for a
> consumer desktop application.  (The protos are working extremely
> well, of course, and I have a really slick object-oriented
> persistence mechanism with them that's really useful, but that's for
> another day).
> 
> I have a flag that lets me store the protocol buffers either as text
> (using the Print* and Parse* methods from
> google::protobuf::TextFormat) or serialized.  It's of course much
> easier to keep them as text when I'm developing, and since the files
> are pretty tiny (though there are a lot of them) I'm thinking of
> keeping them as text files even for the first public release.
> 
> But this got me thinking.  If I see a file I haven't seen before that
> might be either a binary proto or a text proto, why can't I try to
> parse it as text, and then if that fails, as binary?
> 
> Yes, yes, this has some spiritual dubiousness.  Nothing in the proto
> definition rules out the binary form of one protocol buffer also
> being the valid text form of another.
> 
> And there's certainly the case of the "empty file" - which could
> either be the text string representing the default protocol buffer,
> or the binary string representing that same protocol buffer.  But in
> that case, I don't care.
> 
> But practically speaking, I don't see how this would not work.  If I
> try to read a binary format as text, then between the wire types and
> my protocol buffer field IDs (which are all less than 32), the text
> parsing has to run into an unprintable byte very soon and terminate...
> 
> Am I right?  It's not a big deal if not...
> 
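
On the quoted point about wire types and field IDs below 32, a rough
sketch of how field tags are encoded on the wire may help. In the binary
format each field starts with a varint holding
(field_number << 3) | wire_type. The helper names below are made up for
this example, and only the four non-group wire types are considered.

```python
WIRE_TYPES = (0, 1, 2, 5)  # varint, 64-bit, length-delimited, 32-bit


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def tag_bytes(field_number: int, wire_type: int) -> bytes:
    """Return the bytes that introduce a field in the binary format."""
    return encode_varint((field_number << 3) | wire_type)


def is_printable_ascii(byte: int) -> bool:
    return 0x20 <= byte <= 0x7E


def printable_tags(max_field: int = 31):
    """List (field_number, wire_type) pairs whose tag is printable ASCII."""
    return [
        (field, wire)
        for field in range(1, max_field + 1)
        for wire in WIRE_TYPES
        if all(is_printable_ascii(b) for b in tag_bytes(field, wire))
    ]
```

Under these assumptions, fields 1 to 3 and 16 to 31 always start with a
byte outside the printable ASCII range, but fields 4 to 15 produce a
single printable tag byte (field 12 with wire type 1 is the letter "a").
So the first byte of a binary file is not necessarily unprintable;
whether the text parser then fails depends on the bytes that follow.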

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)

iEYEAREDAAYFAk7lCPAACgkQXUF6hOTGP7feyQCggg0jHPq495NPQXCVR8Rhw7C3
vmgAn0TiXDiYw1eW+6i14NB0j6RYtEcT
=h7/A
-----END PGP SIGNATURE-----

