Hi, I understand your concern, but I think you're overstating the problem (or failing to note the slow direction we're already moving in).

Internally, most of monotone's text formats are converging on a single style: basic_io. Contrary to what you've suggested, whitespace is not significant in parsing basic_io. Whitespace *is* dictated and normalized in basic_io's output, because that output is meant to be hashed, and whitespace is still hashable bytes. We have to use *some* level of custom notation because almost no other text notation normalizes whitespace. Hashing imposes unique requirements.
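To make the "insensitive on input, normalized on output" point concrete, here is a rough Python sketch. It is not monotone's code: the token pattern, the one-symbol-per-line layout, and the names tokens, normalize and content_id are all assumptions made for illustration, and SHA-1 is used only as an example hash. Real basic_io output also aligns and groups stanzas, which this sketch does not attempt.

```python
import hashlib
import re

# Assumed token shapes: a bare symbol, a "quoted string", or a [hex] blob.
TOKEN = re.compile(r'''
    (?P<symbol>[a-z_][a-z0-9_]*)
  | "(?P<string>(?:[^"\\]|\\.)*)"
  | \[(?P<hex>[0-9a-f]*)\]
''', re.VERBOSE)


def tokens(text):
    """Return (kind, raw_value) pairs; whitespace between tokens is skipped."""
    out = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"bad input at offset {pos}")
        out.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return out


def normalize(text):
    """Re-emit the tokens in one fixed layout (simplified: each symbol starts a line)."""
    lines = []
    for kind, value in tokens(text):
        if kind == "symbol":
            lines.append(value)
        elif not lines:
            raise ValueError("input must start with a symbol")
        elif kind == "string":
            lines[-1] += f' "{value}"'
        else:
            lines[-1] += f" [{value}]"
    return "\n".join(lines) + "\n"


def content_id(text):
    """Hash the normalized bytes, so input spacing cannot change the id."""
    return hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()
```

Two inputs that differ only in spacing and line breaks normalize to the same bytes and therefore get the same id; changing any token changes the id.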
The framework for parsing basic_io is LL(1), predictive. As simple as possible. Read a token. The token must be a symbol. If the symbol is one you know, branch into the thing which reads the sequence of tokens (symbols, strings, or hex-blobs) for that symbol. Repeat. It's like s-expressions or JSON, only it causes slightly less gnashing of teeth if you print it out to a user.

Initially, basic_io was JSON-ish, but everyone complained that there were too many braces and colons, and wanted "special formats" for contexts where users see it. So we revised it to look nicer: the whitespace creates visual groupings in the token stream, but is not significant to parsing.

Revisions, MT/options, MT/work, and .mt-attrs use basic_io. Manifests and .mt-attrs are going away, replaced by rosters. Rosters use basic_io. I intend all future formats in monotone to use basic_io.

basic_io was designed to be *relatively tolerable* for humans to read as well as machines. Hence the stanza alignment, stanza-based line breaking, and use of quoting. This was to reduce the need for multiple output formats in the "status" and "commit" commands: we just print the internal, hashed representation to the screen. My intention is to change the "ls certs", "ls keys" and "log" outputs to be basic_io someday as well; probably when certs get their much-needed overhaul. Sooner if someone else helps :)

There is a small, mostly-unused corner of monotone's i/o machinery called "packets", which were intended, long ago, to generate and consume non-whitespace-sensitive transport encodings, for example when sending things through email. This was added back when whitespace actually played a role in some parsing operations. Since all of the packet objects in question are being shifted to basic_io anyways (which normalizes whitespace after parsing whitespace-insensitively), we should probably discard the packet format too. A lot of this is about available effort and time.
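For anyone who wants to see the read-a-symbol, branch, repeat loop spelled out, here is a minimal Python sketch of a predictive parser with one token of lookahead. It is an illustration, not monotone's parser: the lexer rules, the Parser class, and the READERS table with its symbol names (old_revision, add_file, rename) and argument shapes are all hypothetical.

```python
def lex(text):
    """Yield (kind, value) tokens; whitespace only separates tokens."""
    i, n = 0, len(text)
    while i < n:
        c = text[i]
        if c.isspace():
            i += 1
        elif c == '"':
            buf = []
            i += 1
            while i < n and text[i] != '"':
                if text[i] == "\\" and i + 1 < n:
                    i += 1  # keep the escaped character literally
                buf.append(text[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated string")
            i += 1  # skip the closing quote
            yield ("string", "".join(buf))
        elif c == "[":
            end = text.find("]", i)
            if end < 0:
                raise ValueError("unterminated hex blob")
            yield ("hex", text[i + 1:end])
            i = end + 1
        elif c.isalpha() or c == "_":
            start = i
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            yield ("symbol", text[start:i])
        else:
            raise ValueError(f"unexpected character {c!r}")


class Parser:
    def __init__(self, text):
        self._tokens = lex(text)
        self.look = next(self._tokens, None)  # the single lookahead token

    def expect(self, kind):
        """Consume the lookahead token if it has the given kind."""
        if self.look is None or self.look[0] != kind:
            raise ValueError(f"expected {kind}, got {self.look}")
        value = self.look[1]
        self.look = next(self._tokens, None)
        return value


# Hypothetical table: which tokens follow each known symbol.
READERS = {
    "old_revision": lambda p: p.expect("hex"),
    "add_file": lambda p: p.expect("string"),
    "rename": lambda p: (p.expect("string"), p.expect("string")),
}


def parse(text):
    p = Parser(text)
    items = []
    while p.look is not None:
        symbol = p.expect("symbol")        # the token must be a symbol
        reader = READERS.get(symbol)
        if reader is None:
            raise ValueError(f"unknown symbol {symbol!r}")
        items.append((symbol, reader(p)))  # branch on the symbol, then repeat
    return items
```

Because the lexer throws whitespace away, the same token stream parses identically however it is spaced or broken across lines; the only decision point is the symbol at the head of each entry.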
Some commands, as you've noted, still stand out. I can identify three families of commands here:

1. There are commands which produce simple lists of newline-delimited filenames.
2. There are 'automate' commands which produce custom formats.
3. There is a --brief format.

I am fine with #1, in the sense that there is already special command-line treatment for filenames (all non-option, positional arguments are, or should be, treated as filenames), and the "unix tradition" of, for example, piping a list of filenames to xargs is worth supporting.

I did not implement #2 or #3, and honestly I would not have done them the way they're done. But you know, it's not 100% under my control. Not even 10%. I think I was even on vacation when automate happened.

I can tell you what my preference is, though: I'd prefer if the automate commands all pumped out basic_io stanzas. I'd prefer if you could send basic_io stanzas to monotone as command sequences (say, for monotone stdio). I'd prefer if all commands could be invoked via stdio. And I'd prefer, rather than per-command things like --brief, that we do what marcel suggested (and what, if you look, the ROADMAP file has listed for some time), and give lua hooks a chance to control output formatting in general, so that if you *don't* want basic_io, there's something simple and general you can do about it.

-graydon

_______________________________________________
Monotone-devel mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/monotone-devel
