Eric Kow wrote:
Long term [Darcs 3]
-------------------
A new patch format in general may be interesting for the long term.
http://bugs.darcs.net/patch1096 appears to be a step in that direction.

Long term I'd like a pony, but more importantly for darcs patches to be in some easy to parse markup format like JSON, perhaps.

Medium term [Polished Darcs 2]
------------------------------
I claim that coming up with this new patch format is unrealistic
for the medium term (defined as post-performance-obsession and
pre-Darcs-3).  If we were to use anything better, it'd have to be
backwards-compatible (ie. using the patch long comment?)

Therefore, it would be interesting to determine

1. If a new backwards compatible format will be useful in the medium
   term [which could last for many years mind you, if you also add in
   the short-term], or if we can get away with using Ignore-this for
   that time

2. If the new format could just start with "Ignore-this:"

3. What the new format would actually look like

We don't have to open this discussion now, but it's now being tracked as
a potential project in <http://bugs.darcs.net/issue1787>.  My request is
for whoever launches the third salvo in this discussion to please
research the past threads (eg. when we introduced the Ignore-this salt
for issue27?) and link them here.

There's also some very interesting future work on patch annotations
<http://bugs.darcs.net/issue1613> for optional metadata.  It may even
be medium-term if we're lucky.

When Ignore-this was first implemented, the medium-term solution of using a full RFC822 email-like header was broached. Of course, RFC822 is full of loopholes and surprisingly hard to parse in reality, but the obvious point that "Ignore-this: xxx" does indeed look like an email header still stands. (I'd like to remain on the record that I'd still prefer a better name than Ignore-this, such as "Patch conflict avoidance hash", by the way.)

I've been thinking on this some, and I think I have a reasonable suggestion that is easier to parse than RFC822, but carries a similar effect: YAML formatted darcs comments.

YAML (yaml.org) is a JSON superset that was designed to be more human-readable/human-editable than JSON. Since long comments are still meant to be examined (and perhaps amended) by us humans, I'm all for keeping markup to a reasonable minimum. However, YAML is still easy to parse, with libraries in many languages.

Here's Ignore-this wrapped in an explicit YAML document:

  %YAML 1.2 # YAML version directive, can be used as indicator
  --- # document start
  Ignore-this: xxx # same as currently, but now in a YAML mapping
  ... # document end

We could argue about the usefulness of the explicit YAML directive and document start (---), but the explicit document end (...) makes a clear separation between any darcs-interesting metadata and a user's actual content, both for simple regex searching and for YAML parsers (which have the concept of "parse the first document" and "parse past the first document"). (Certainly an explicit marker is better than RFC822's implicit marker, which is sometimes difficult to find.)

Of course, the above example doesn't seem too great with just Ignore-this, so here's a better example:

  %YAML 1.2
  ---
  Ignore-this: yyy
  Encoding: UTF-8
  Patch version: 2.0+YAML
  X-Musdex version: 10.03.22
  ...


So, backwards compatibility issues: much the same as with Ignore-this. Patches whose long comments carry YAML headers get those headers printed by versions of darcs prior to the switchover point. This may not be a big problem. For instance, the above example as shown by "darcs changes" in darcs 2.4 seems reasonable:

  %YAML 1.2
  ---
  Encoding: UTF-8
  Patch version: 2.0+YAML
  X-Musdex version: 10.03.22
  ...
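
To make the backwards-compatibility case concrete, here is a minimal Python sketch of what an older darcs appears to do when displaying a long comment: hide any line starting with "Ignore-this:" and print everything else untouched. This is an illustration only, not the actual darcs code, and the function name legacy_visible_comment is made up for this example.

```python
def legacy_visible_comment(long_comment):
    """Approximate what a pre-YAML darcs would show for a long comment.

    Assumption: the only special handling is hiding lines that start
    with "Ignore-this:". Everything else, including the YAML directive
    and the document markers, is shown as-is.
    """
    kept = [line for line in long_comment.splitlines()
            if not line.startswith("Ignore-this:")]
    return "\n".join(kept)


example = "\n".join([
    "%YAML 1.2",
    "---",
    "Ignore-this: yyy",
    "Encoding: UTF-8",
    "Patch version: 2.0+YAML",
    "X-Musdex version: 10.03.22",
    "...",
])

print(legacy_visible_comment(example))
```

Run on the fuller example above, this prints the same six lines as the darcs 2.4 output shown.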

The big gain is forwards compatibility for arbitrary headers, without special-casing each and every one or prefixing them all with the silly "Ignore-this:" tag. It would also presumably be forwards compatible with some nice long-term future version of darcs where arbitrary metadata headers can be moved out of the long comment to somewhere more preferable.

An additional gain is that ignorable header lines now have two strongly consistent ways of being handled by scripts: 1) parse the first YAML document in the long comment to get the headers, or 2) ignore everything up to and including the first line that begins with an ellipsis (...) to get to the user comment. In both cases a first line beginning with %YAML can be used to denote that there is any header at all.
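
As a rough sketch of how a script might apply both approaches, here is some Python using only the standard library. The function names split_long_comment and naive_headers are hypothetical. The header reader is deliberately naive: it only understands flat "key: value" lines and does not strip trailing "# ..." comments, so a real script would hand the header to a proper YAML parser instead.

```python
def split_long_comment(long_comment):
    """Split a long comment into (header_lines, user_comment).

    The header is recognised only if the first line starts with %YAML,
    and it ends at the first line that starts with "...". If either
    marker is missing, the whole text is treated as user comment.
    """
    lines = long_comment.splitlines()
    if not lines or not lines[0].startswith("%YAML"):
        return [], long_comment
    for i, line in enumerate(lines):
        if line.startswith("..."):
            return lines[:i + 1], "\n".join(lines[i + 1:])
    return [], long_comment


def naive_headers(header_lines):
    """Read flat "key: value" lines into a dict (not a real YAML parser)."""
    headers = {}
    for line in header_lines:
        # Skip the %YAML directive and the document start/end markers.
        if line.startswith(("%", "---", "...")):
            continue
        key, sep, value = line.partition(":")
        if sep:
            headers[key.strip()] = value.strip()
    return headers


comment = "\n".join([
    "%YAML 1.2",
    "---",
    "Ignore-this: yyy",
    "Encoding: UTF-8",
    "Patch version: 2.0+YAML",
    "X-Musdex version: 10.03.22",
    "...",
    "Fix the frobnicator",
    "",
    "More detail.",
])

header, body = split_long_comment(comment)
print(naive_headers(header))
print(body)
```

A long comment without the %YAML first line comes back unchanged as the user comment, so existing patches need no special handling.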

So that's my current suggestion. Feel free to tear it apart.

--
--Max Battcher--
http://worldmaker.net
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
