On 3/22/2010 14:34, Jason Dagit wrote:
On Mon, Mar 22, 2010 at 11:02 AM, Max Battcher <[email protected]> wrote:
Long term I'd like a pony, but more importantly for darcs patches to
be in some easy to parse markup format like JSON, perhaps.
What is the concern you're addressing? What if we made the darcs patch
parser callable from C? Would that be good enough? Why do you want the
patches to be easy to parse in a markup format?
Like I said, it is a wish for a pony. I don't have any specific need in
mind, but "wouldn't it be nice if...". Given a long term format change,
I would always prefer something standardized and well known over
something proprietary and possibly prone to break in unexpected ways.
Certainly if a standard markup format were in use already by darcs we
wouldn't have as many problems adding metadata or changing patch formats
to meet the needs of today.
So, be it JSON or YAML or Binary XML or Google Protocol Buffers or
something else I haven't considered, it doesn't really matter: the
intent is that it should be something usefully extensible with known
efficient parsers and known operating requirements. Which is to say that
I appreciate your arguments for efficiency, Jason, but precisely because
of those arguments I've come to strongly appreciate well known parsers
over hand-built ones, because I know the "operating efficiencies"...
(As in, I know the relative strengths and weaknesses of the various XML
parsers at my disposal in Python or C#. I know which ones call C backing
libraries, and I know which ones I'd pick for ease of use and which ones
for power and which ones for optimal speed/memory. I can choose one to
use based on the requirements of the current project. Same for YAML or
JSON... But each and every "special" or "proprietary" parser brings its
own learning curve.)
When Ignore-this was first implemented the medium term solution of
using a full RFC822 email-like header was broached. Of course,
RFC822 is full of loopholes and surprisingly hard to parse in
reality, but the obvious point that Ignore-this: xxx does indeed
look like an email header still stands. (I'd like to remain on the
record that I'd still prefer a better name like "Patch conflict
avoidance hash" than Ignore-this, by the way.)
Yeah. I think that's fair. Are there no parsers for RFC822 on
Hackage? I see this:
http://hackage.haskell.org/packages/archive/mime/0.3.2/doc/html/Codec-MIME-Parse.html
Does that provide the type of parser you're looking for?
RFC822 is an ugly standard to parse: headers end at the first empty
line, except in the case when a malformed gateway adds extra spaces
everywhere, in which case it might be any invisible line that "seems
correct"... RFC822 is still a better standard than the current lack of
a standard for Ignore-this headers, but not by much.
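
To make the "headers end at the first empty line" rule concrete, here is a minimal sketch using Python's standard email.parser.HeaderParser. The patch text and the hash value are made up for illustration, and this is only an assumption about what an RFC822-style header block on a patch might look like, not how darcs reads Ignore-this today.

```python
from email.parser import HeaderParser

# Hypothetical patch text: an RFC822-style header block, one empty line,
# then the rest of the patch. The hash value is invented.
PATCH_TEXT = (
    "Ignore-this: 1a2b3c4d5e6f\n"
    "\n"
    "hunk ./README 1\n"
    "+hello\n"
)


def read_headers(text):
    """Return the header block as a dict; header parsing ends at the first empty line."""
    message = HeaderParser().parsestr(text)
    return dict(message.items())


headers = read_headers(PATCH_TEXT)
```

This covers only the well-formed case; the malformed-gateway cases are exactly where the pain starts.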
I've been thinking on this some, and I think I have a reasonable
suggestion that is easier to parse than RFC822, but carries a
similar effect: YAML formatted darcs comments.
Those YAML snippets seem pretty reasonable as long as they don't require
the parser to hit an ending tag while parsing the patches themselves
(seems reasonable for a short-ish section of headers though).
YAML was designed for streaming, definitely. In particular, even the
most inefficient parser should respect the explicit end of document
marker (...) and not need to parse past it before returning results. All
of the YAML parsers I've seen are generally much more efficient than
that, of course, and I think the YAML specs make it relatively clear how
self-contained and easy to parse all of the markup is.
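
As a rough sketch of that streaming property, the reader below consumes a header document line by line and stops at the end of document marker, leaving the rest of the stream untouched. It is not a YAML parser: it assumes flat "key: value" lines only, and the header name and patch body are hypothetical.

```python
import io


def read_header_document(stream):
    """Read flat 'key: value' lines up to the end-of-document marker ('...').

    Assumption: only a tiny subset of YAML is handled (no nesting, no quoting).
    """
    fields = {}
    for raw in iter(stream.readline, ""):
        line = raw.rstrip("\n")
        if line == "...":
            break  # end of document: stop without reading any further
        if line == "---" or not line.strip():
            continue  # skip the start-of-document marker and blank lines
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


stream = io.StringIO(
    "---\n"
    "Ignore-this: 1a2b3c4d\n"
    "...\n"
    "hunk ./README 1\n"
    "+hello\n"
)
fields = read_header_document(stream)
rest = stream.read()  # the patch body is still unread at this point
```

A real implementation would hand the header section to an existing YAML library instead of splitting lines by hand.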
For the
patches I really think we want a format that is more amenable to
streaming or seeking. You could imagine it having a "table of contents"
section with offsets that can be seek'd to. I guess strictly speaking
that is doable in an XML schema, but perhaps uncommon.
Seeking probably would be a good property to include on the list of
features to prefer when searching for a new long term patch format.
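
For what it's worth, here is a minimal sketch of the "table of contents with offsets" idea. The container layout (a big-endian entry count followed by offset/length pairs) is invented purely for illustration and is not a proposal for the actual format; the patch bodies are placeholders.

```python
import io
import struct


def write_container(patches):
    """Pack byte strings behind a table of contents of (offset, length) pairs."""
    header_size = 4 + 8 * len(patches)  # entry count + one pair per entry
    toc = struct.pack(">I", len(patches))
    offset = header_size
    for body in patches:
        toc += struct.pack(">II", offset, len(body))
        offset += len(body)
    return toc + b"".join(patches)


def read_patch(stream, index):
    """Seek straight to one entry without reading the others."""
    stream.seek(0)
    (count,) = struct.unpack(">I", stream.read(4))
    if not 0 <= index < count:
        raise IndexError(index)
    stream.seek(4 + 8 * index)  # jump to this entry's table-of-contents slot
    offset, length = struct.unpack(">II", stream.read(8))
    stream.seek(offset)
    return stream.read(length)


blob = write_container([b"hunk ./a 1\n+one\n", b"hunk ./b 1\n+two\n"])
second = read_patch(io.BytesIO(blob), 1)
```

The reader touches only the table of contents and the one entry it wants, which is the property being asked for.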
--
--Max Battcher--
http://worldmaker.net