On 09/03/2010 04:39 AM, Eric Kow wrote:
> Is there a way to have both cake (very very easy to parse) and eating
> (sufficiently expressive to do anything Darcs would reasonably want to
> do with machine-readable outputs)?

Short answer: No.

> * "Very very easy to parse" seems like a good feature.
>    And there is nothing easier to parse than simple line-based
>    like the above. Even JSON (with the json library) imposes a
>    little bit of friction...

Just because it is "easy to parse" doesn't mean I want to maintain a parser for it. In my experience "easy to parse" things often turn into long-term maintenance hell. What happens when someone needs more information added to the feed? What happens when data types change? Does darcs need to keep a large zoo of parsers in a myriad of languages and platforms as a regression suite, as a guard against breaking users' tools out "in the wild"? Or would the preference be to rubber-stamp, and thus maintain, "official" parser libraries for a large number of languages/platforms?

The markup/object formats (JSON, YAML, XML, etc) are large, ugly and verbose for a reason: they try their best to support both backwards-compatibility and forwards-compatibility. Schema changes are a lot easier to make backwards-compatible than "parser changes".
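
To make the compatibility point concrete, here is a minimal Python sketch. The record layouts and field names (num_patches, author, format) are made up for illustration and are not actual darcs output. It shows a positional line parser silently misreading a record once a field is added, while a consumer of the JSON form is unaffected:

```python
import json

# Hypothetical record formats for illustration only; not real darcs output.
OLD_LINE = "155|hashed"
NEW_LINE = "155|alice|hashed"  # a later version inserts an author field

OLD_JSON = '{"num_patches": 155, "format": "hashed"}'
NEW_JSON = '{"num_patches": 155, "author": "alice", "format": "hashed"}'


def parse_line(line):
    """Positional parser: the meaning of a value depends on its column."""
    num_patches, fmt = line.split("|")[:2]
    return {"num_patches": int(num_patches), "format": fmt}


def parse_json(text):
    """Keyed parser: reads the fields it knows and ignores the rest."""
    record = json.loads(text)
    return {"num_patches": record["num_patches"], "format": record["format"]}


print(parse_line(OLD_LINE))  # what the tool author expected
print(parse_line(NEW_LINE))  # "format" now holds the author, with no error
print(parse_json(NEW_JSON))  # unchanged by the extra field
```

The line-based consumer does not even fail loudly here, which is the kind of breakage that is hard to guard against without that zoo of parsers.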

Any attempt at a smaller format is eventually going to have to answer the same questions that a markup/object format deals with in its spec, and without a *lot* of planning beforehand (effectively remaking a markup format in the process) is doomed to do so haphazardly and with ever so much "cruft".

I do think the best bet is to pick an existing markup standard with a good specification and support it *well*, even if the answer is just to beef up the current XML output.

> * Human-readable (even if it's machine-oriented) could be a nice
>    minor feature [it lends a sort of transparency]

This seems like a rather low priority. I think you have to assume that any machine-oriented output is primarily destined to be piped directly into tools with little or no human involvement. For human eyes, the regular human-readable output will presumably always be preferable.

If this is considered something of a priority, however, I think YAML is a good candidate: human readability is an explicit goal of the YAML spec. As I've pointed out before, ``darcs show repo`` as it already is, is nearly YAML already [1], and I think it may be worth tweaking it to make it valid YAML.

> * Perhaps another feature would be a sort of uniformity, that all of
>    Darcs machine-readable outputs work the same way.  Can we achieve
>    such a uniformity with just a regular language?

I think that in trying something like that you end up with either N mini-languages that are "mostly similar" or a half-baked markup with a poor specification.

> * As far as I'm concerned, "not-XML" is a feature.
>    I think that's just a silly knee-jerk reaction on my part, though

XML is not the enemy here. We're talking about passing around data that other applications can read and XML is a fine solution for that, particularly when implemented correctly.

I see no problem in picking a markup language with simpler dependencies than XML, or one that is easier to output validly than XML, but I do think it makes sense to stick with a markup language of one sort or another, with existing specifications and existing well-known parsers in the wild, rather than building an arbitrary new one without *strong* reason to do so.

----

[1] Using ``darcs show repo`` from darcs 2.3.0 as an example:

          Type: darcs
        Format: hashed, darcs-2
          Root: /home/worldmaker/repos/darcsforge
      Pristine: HashedPristine
         Cache: thisrepo:/home/worldmaker/repos/darcsforge, cache:/home/worldmaker/.darcs/cache
Default Remote: code.worldmaker.net:repos/pub/darcsforge/main/
   Num Patches: 155

Primarily, YAML barfs on the right-aligned keys (because YAML's more human readable formatting is whitespace-dependent). Reformatted to valid YAML, but preserving the attempted alignment:

Type:           darcs
Format:         [hashed, darcs-2]
Root:           /home/worldmaker/repos/darcsforge
Pristine:       HashedPristine
Cache:
                - thisrepo:/home/worldmaker/repos/darcsforge
                - cache:/home/worldmaker/.darcs/cache
Default Remote: code.worldmaker.net:repos/pub/darcsforge/main/
Num Patches:    155

I think this is just as readable, but now YAML also parses it, with both Format and Cache being (correctly) interpreted as lists and YAML also picks up that 155 is a numerical literal (and thus is an integer in Python, for example). Of course, this example is easy because there aren't any special characters in my paths above that required escaping.
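
For anyone who wants to check this themselves, a few lines of Python will do it. This is only a sketch and assumes the third-party PyYAML package is installed; REPO_INFO and info are names I picked for illustration:

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)

# The reformatted ``darcs show repo`` output from above, verbatim.
REPO_INFO = """\
Type:           darcs
Format:         [hashed, darcs-2]
Root:           /home/worldmaker/repos/darcsforge
Pristine:       HashedPristine
Cache:
                - thisrepo:/home/worldmaker/repos/darcsforge
                - cache:/home/worldmaker/.darcs/cache
Default Remote: code.worldmaker.net:repos/pub/darcsforge/main/
Num Patches:    155
"""

# safe_load builds only plain Python types (dict, list, str, int, ...).
info = yaml.safe_load(REPO_INFO)

print(info["Format"])                # a list, from the [ ... ] flow sequence
print(info["Cache"])                 # a list, from the "- " block sequence
print(type(info["Num Patches"]))     # an integer, not a string
```

The colons inside the paths are fine here because none of them is followed by a space; a path that did contain ": " or other special characters would need quoting.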

--
--Max Battcher--
http://worldmaker.net
_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users
