Optional features (was: Markdown Extra Specification (First Draft))
* Sherwood Botsford [EMAIL PROTECTED] [2008-05-07 22:10]:

> THAT said, however, maintaining perfect backward compatibility slows down progress.

I don’t know. It seems to me perfect backward compatibility is not even possible, considering that Markdown.pl is not set in stone (John takes bug reports and writes fixes, every so often) and yet is not formally defined anywhere. As such, there is no way to say what is backward compatible and what isn’t. I think at most, backcompat for the purposes of a spec for Markdown can only be defined as targeting a particular feature set, but not an exact implementation of it. That is, after all, the entire reason for the spec effort in the first place.

> Can markdown extra have a configuration file: The default behaviour is to emulate markdown. The configuration file allows for new features that don't fit well into the old set.

Optional features are dangerous and impede interoperability. Everyone who ever thinks about chipping in on the design of a spec should read [section 5 of RFC 3339][1]. That RFC is a spec for a particular datetime format, but section 5 is largely agnostic of the nature of the format, and lays down the principles according to which the design decisions for this format were made. [Section 5.3][2] is the part with direct relevance to your stipulation, but the entire section is worth reading.

[1]: http://tools.ietf.org/html/rfc3339#section-5
[2]: http://tools.ietf.org/html/rfc3339#section-5.3

One problem is that every new option leads to a geometric increase in the number of feature combinations that have to be tested.

Another issue is that Markdown is a document format. If it has many optional features, what are the chances that a document ostensibly written in Markdown that I send you will work in your implementation of Markdown exactly as it did in mine? You really really don’t want to have to wonder.

This was a major reason why SGML mostly failed, for example, and only gained traction when it was restandardised as XML. SGML had legions of optional author-friendly features that made it an extreme amount of work to implement a parser that correctly implemented the entire spec. The XML working group sat down and basically chucked out 95% of the optional features and made the rest mandatory. The rest is history.

Optional features in a document format are an invitation for interoperability problems. Since the entire point of the Markdown spec effort was to reduce existing interoperability problems, I strongly advise that as little as possible in the spec be made optional. Ideally, nothing would be.

It is, mind, perfectly fine to have two (or maybe three?) specs of which one is a superset of the other, as seems to be Michel’s current thrust with Markdown vs Markdown Extra. Assuming that no feature in either spec is optional, that means you would be able to expect Markdown Extra documents to work in all Markdown Extra processors, and all Markdown documents to work in all Markdown and Markdown Extra processors. The scope of the problem is much smaller in such a scenario, enough so to be perfectly tractable.

Regards,
--
Aristotle Pagaltzis // http://plasmasturm.org/

___
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss
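The remark in the message above about the number of feature combinations can be made concrete with a small sketch. It assumes each optional feature is an independent on/off switch; the option names are hypothetical and chosen only for illustration, not taken from any spec draft.

```python
from itertools import product


def option_combinations(options):
    """Return every on/off assignment for the given optional features."""
    return [
        dict(zip(options, flags))
        for flags in product((False, True), repeat=len(options))
    ]


# Hypothetical option names, used only to illustrate the growth.
OPTIONS = ["footnotes", "tables", "definition_lists", "fenced_code"]

if __name__ == "__main__":
    for n in range(len(OPTIONS) + 1):
        combos = option_combinations(OPTIONS[:n])
        print(f"{n} optional features -> {len(combos)} configurations to test")
```

Under that assumption every added switch doubles the number of configurations a test suite would have to cover, whereas two fixed specs (Markdown and Markdown Extra) leave exactly two.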
Re: Optional features (was: Markdown Extra Specification (First Draft))
On 22 May 2008, at 08:10, Aristotle Pagaltzis wrote:

> [...] Optional features are dangerous and impede interoperability. Everyone who ever thinks about chipping in on the design of a spec should read [section 5 of RFC 3339][1]. [...]

I love how it says:

> [...] A format which includes rarely used options is likely to cause interoperability problems [...] The format defined below includes only one rarely used option: fractions of a second. [...]

Which reminds me of when svn started to report fractions of seconds in their ‘svn log --xml’ output, causing a few log visualizers to break :)
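The kind of breakage described in the message above can be sketched as follows. This is an illustration only: the two format strings and the sample timestamps are assumptions, not the code of any actual log visualizer or the exact output of any svn release.

```python
from datetime import datetime

# Both formats treat the trailing "Z" as a literal character, so the
# resulting datetime objects are naive (no timezone attached).
STRICT_FORMAT = "%Y-%m-%dT%H:%M:%SZ"       # no fractional seconds
FRACTION_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # with fractional seconds


def parse_strict(stamp):
    """Parse a timestamp, assuming the optional fraction never appears."""
    return datetime.strptime(stamp, STRICT_FORMAT)


def parse_tolerant(stamp):
    """Parse a timestamp with or without fractional seconds."""
    for fmt in (FRACTION_FORMAT, STRICT_FORMAT):
        try:
            return datetime.strptime(stamp, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {stamp!r}")


if __name__ == "__main__":
    for stamp in ("2008-05-22T06:10:00Z", "2008-05-22T06:10:00.250000Z"):
        try:
            print("strict  :", parse_strict(stamp))
        except ValueError as error:
            print("strict  : failed:", error)
        print("tolerant:", parse_tolerant(stamp))
```

A consumer written against the first form alone works until a producer starts using the rarely used option, which is the interoperability risk the quoted RFC passage warns about.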
Re: Optional features (was: Markdown Extra Specification (First Draft))
On 2008-05-22 at 2:10, Aristotle Pagaltzis wrote:

> It is, mind, perfectly fine to have two (or maybe three?) specs of which one is a superset of the other, as seems to be Michel’s current thrust with Markdown vs Markdown Extra. Assuming that no feature in either spec is optional, that means you would be able to expect Markdown Extra documents to work in all Markdown Extra processors, and all Markdown documents to work in all Markdown and Markdown Extra processors. The scope of the problem is much smaller in such a scenario, enough so to be perfectly tractable.

I perfectly agree with this by the way: optional features should be kept to a minimum.

It may be interesting to note there are currently two configurable parsing-related[^1] settings in PHP Markdown:

Tab width (default = 4)
:   This one comes from a similar configuration option in Markdown.pl and is essentially the size in spaces for one indent through a Markdown document. When John Gruber says "four spaces or one tab" in his syntax description document, he really means "tab-width spaces or one tab", where tab-width is a configurable parameter defaulting to 4. I'm not aware of anyone changing this parameter, and I'm not even sure of how well it works, but it is clear that changing this will break many documents written with a different tab width in mind.

No markup (default = false)
No entities (default = false)
:   This one prevents the parser from skipping over HTML tags and/or HTML character entities. I was originally opposed to it, and in some way I still am. I decided to add it because there were too many people attempting to disable HTML by preprocessing the input with strip_tags or a substitution regular expression without realizing they were breaking automatic links, code spans and code blocks with HTML in them, and sometimes blockquotes. I'm no fan of this mode, but I feel it was the best way to avoid people breaking the syntax by accident, so I've added it in.

I'm not sure those features should be formally part of the spec. I believe however that if the spec is well written it should be pretty trivial to see what must be changed to achieve them.

[^1]: A parsing-related setting is a setting that changes the interpretation of the document given in output. The opposite is an output-related setting, which changes the HTML output but does not affect the interpretation the parser makes of the document.

Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/
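To illustrate the rationale given above for the "No markup" setting, here is a minimal sketch of what a naive tag-stripping preprocessor does to Markdown source. The regular expression is an assumed stand-in for that kind of preprocessing; it is not PHP's strip_tags and not anything from PHP Markdown, and the sample strings are made up.

```python
import re


def naive_strip_tags(text):
    """Remove anything that looks like an HTML tag, as a crude preprocessor might."""
    return re.sub(r"<[^>]*>", "", text)


# An automatic link and a code span containing HTML.
LINK_AND_CODE = "Visit <http://example.com/> or type `<br>` for a line break."

# A stray "<" followed by a blockquote marker on the next line.
BLOCKQUOTE = "if a < b\n> quoted line"

if __name__ == "__main__":
    print(repr(naive_strip_tags(LINK_AND_CODE)))
    print(repr(naive_strip_tags(BLOCKQUOTE)))
```

In the first sample both the automatic link and the contents of the code span disappear; in the second, the stray `<` pairs up with the `>` that opens the blockquote, so the quote marker is lost. Handling the restriction inside the parser avoids both problems because the parser knows which angle brackets are Markdown syntax.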
Re: Parsing Code Blocks
On 2008-05-16 at 0:31, Yuri Takhteyev wrote:

> Your first two examples are not treated as the same by any implementation. It seems that all implementations interpret this:
>
> ~~~ One Two Three Four Five ~~~
>
> as meaning that One is in a code block, but Two is not. Or did you mean to put a few more spaces in front of Two?

Hum, yes I did, and in fact I had. It just looks like my email client (Mac OS X's Mail) ate the first space on each line that begins with a space... I really wish it wasn't using Web Kit as its text editor when in text-only mode.

> [spec]: http://michelf.com/specs/markdown-extra/#block-element-generator
>
> I think it would help if the spec made it more clear what part of each line of the blockquote is consumed before we go looking for sub-elements, especially as far as consuming initial whitespace goes.

Quoting item 2 of blockquote (at the moment you wrote the above):

> A run of the [block element generator](#block-element-generator) by pushing the following sequence to the <var>context-line-prefix</var> stack:
>
> 1. Zero or one [insignificant-indent](#insignificant-indent)
> 2.
> 3. Zero or one [space](#space)

This means that the block element generator is used as a grammar rule at this point. It matches if it can generate one or more block elements. Since each rule in the block generator first checks for a hard-block-content-line-prefix, you could check for yourself that you can match a hard-block-content-line-prefix prior to calling the generator (this *could* be more performant).

I've added this to the block element generator section:

> The block element generator is used as a parsing rule in the grammar of the document element generator and the block element generator. The block element generator matches if one of the following rules matches and creates an element.

That said, I decided to revamp the blockquote rule so that it no longer uses the block element generator directly. Everything now passes through a rule named block-element-run, matching one or more block elements (using the block-element generator), and the blockquote is first parsed separately in the blockquote rule instead of indirectly from attempting to parse block elements. Does this make it clearer?

By the way, I agree things are not optimal at the moment. They are also way off the tracks of what PHP Markdown and Markdown.pl actually do in many cases. The plan is to start by making something that mostly works. Then I'll compare with the actual regular expressions used in the code and make adjustments as necessary. After that, I'll compare with test cases in MDTest, and with the output given by other implementations in Babelmark. And I might mix the order a bit.

Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/
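The blockquote line prefix quoted in the message above can be sketched as a single pattern. Two things here are assumptions rather than statements from the spec draft: that an insignificant indent means one to three spaces, and that the second item of the sequence (missing from the archived copy) is the `>` marker that Markdown uses for blockquotes. The function name is hypothetical.

```python
import re

# Assumed reading of the prefix sequence:
#   1. zero or one insignificant indent (taken here to be up to three spaces)
#   2. the ">" blockquote marker (assumed; the item is blank in the archive)
#   3. zero or one space
BLOCKQUOTE_PREFIX = re.compile(r"^ {0,3}> ?")


def strip_blockquote_prefix(line):
    """Return the text left after the blockquote prefix, or None if there is none."""
    match = BLOCKQUOTE_PREFIX.match(line)
    if match is None:
        return None
    return line[match.end():]


if __name__ == "__main__":
    for line in ("> text", ">text", "   > text", ">     code", "    > text", "plain"):
        print(repr(line), "->", repr(strip_blockquote_prefix(line)))
```

Because only one space after the marker is consumed, a line such as `>     code` leaves four spaces in front of `code`, which is what lets the nested content be recognised as a code block. That is the question about consumed whitespace raised in the quoted text.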