On Oct 21, 2003, at 10:41 AM, Elizabeth Mattijsen wrote:

At 12:53 -0400 10/21/03, Dan Sugalski wrote:
> Yeah, if you're just needing to tag the stream with a label to indicate
> the type plus a version number, then xml's on the one hand overkill and
> on the other hand not necessarily a big help to xml proponents.
So, in a nutshell, throwing an XML format type tag at the beginning buys
us nothing regardless of whether it's an XML stream or not?

Yep. But mainly, I think, because you'll need to encode binary data to make it valid XML. That's an overhead you don't want to suffer for those serializations that don't need it.


If you ask me, you could easily make do with a simple header line like:

  parrot xml 1.0
  \0

basically magic word ('parrot')
 followed by a space
 followed by the type
 followed by a space
 followed by version
 followed by a CRLF (not sure about this one, but could be nice)
 followed by a null byte
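
A minimal sketch of the header layout listed above, written in Python purely for illustration. The function names `build_header` and `parse_header` are hypothetical and not part of Parrot; the sketch assumes the type and version fields are plain ASCII without embedded spaces.

```python
MAGIC = b"parrot"            # magic word from the proposal
TERMINATOR = b"\r\n\0"       # CRLF followed by a null byte

def build_header(fmt: str, version: str) -> bytes:
    """Return 'parrot <type> <version>' followed by CRLF and a null byte."""
    return (MAGIC + b" " + fmt.encode("ascii") + b" "
            + version.encode("ascii") + TERMINATOR)

def parse_header(data: bytes):
    """Split a stream into (type, version, remaining payload bytes)."""
    end = data.find(TERMINATOR)
    if end == -1:
        raise ValueError("header terminator not found")
    fields = data[:end].split(b" ")
    if len(fields) != 3 or fields[0] != MAGIC:
        raise ValueError("not a parrot stream header")
    _, fmt, version = fields
    # Everything after the terminator belongs to the serialization itself.
    return fmt.decode("ascii"), version.decode("ascii"), data[end + len(TERMINATOR):]
```

A reader would call `parse_header` once on the start of the stream and hand the remaining bytes to whichever deserializer matches the type field.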

Yep, that's the sort of thing that I was thinking, though I'd actually leave the CRLF (or just an LF or CR, whatever), and take out the null byte. My reason for that is that this way, if your serialization format always spits out vanilla ASCII w/o control characters, suitable for consumption by some foreign C program, then the header won't change this. (That's one of the nice features of the tar format--a tar archive of ASCII text files is itself an ASCII text file, if I recall correctly.)


It could also be handy to allow additional "comment" text after the version (ignored by the deserialization, restricted to be ASCII w/o any CR or LF), because that would let you put in some human-readable comment to help out people trying to figure out what this file is. Some other formats do this, which is nice. Just another thought.
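
To make the variant concrete, here is a rough Python sketch of a parser for a header line with no null byte, ended by CRLF, LF, or CR, and with optional comment text after the version. The name `parse_text_header` is hypothetical, and the sketch assumes the type and version contain no spaces.

```python
def parse_text_header(data: bytes):
    """Return (type, version, comment, remaining payload bytes)."""
    # The header ends at the first CR or LF, whichever comes first.
    hits = [i for i in (data.find(b"\r"), data.find(b"\n")) if i != -1]
    if not hits:
        raise ValueError("no line terminator after header")
    end = min(hits)
    # Treat CRLF as a single terminator; a lone CR or LF is one byte.
    skip = 2 if data[end:end + 2] == b"\r\n" else 1

    # At most four fields: magic, type, version, free-form comment.
    parts = data[:end].split(b" ", 3)
    if len(parts) < 3 or parts[0] != b"parrot":
        raise ValueError("not a parrot stream header")
    comment = parts[3] if len(parts) == 4 else b""
    # The comment is returned for humans; a deserializer would ignore it.
    return (parts[1].decode("ascii"), parts[2].decode("ascii"),
            comment.decode("ascii"), data[end + skip:])
```

For example, `parse_text_header(b"parrot xml 1.0 nightly dump\n<data/>")` yields the type `xml`, the version `1.0`, the comment `nightly dump`, and the payload `<data/>`. Because the header is a single printable line, an all-ASCII payload stays an all-ASCII file.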

Jeff


