Jared Williams wrote:
> The problem is the ambiguity with italics, (''italics''). So the
> current parser doesn't really make its final decision on
> what should be bold or what should be italic until it hits a
> newline. If there are an even number of both bold and italics
> then it assumes it interpreted the line correctly.
[SNIP]
> I think this is part of what makes wikitext undescribable
> in a formal grammar.

And he also wrote:
> Problem is quotes are also valid as part of the textual content, so
> could not italics immediately before or after an apostrophe. As in
>
> L'''arc de triomphe''
>
> Which the current parser resolves to L'<i>arc de triomphe</i>

Therein lies one of the main problems with parsing wikitext: it uses a wide range of standard text characters to implement its markup.

In HTML, there are basically two (< and >) plus an escape character (&). Therefore HTML can in theory[1] consist of "Any text you like, with <, > and & replaced by &lt;, &gt; and &amp; respectively" plus two special markup constructs (<markup goes here> and &escaped_entity;). No room for ambiguity there, and only minimal translation is required to convert plain text to a format suitable for use in an HTML document.

In MediaWiki, just taking that single ' character as an example, it could be one of several punctuation symbols (apostrophe, single quote, prime, etc.) or it could be part of an opening italic tag, a closing italic tag, an opening bold tag, or a closing bold tag.

As far as I understand, it is impossible to deal effectively with this massive overloading of the apostrophe character without the kind of special logic we have in place already (as described by Jared).

To take his example one step further, here's something to really throw a formal grammar-based parser, but which our parser handles just fine:

'''Photo of L'''arc de triomphe'' by 'John''''

- Mark Clements (HappyDog)

[1] I'm ignoring all the document-structure requirements, plus character-encoding issues, etc. that complicate things a bit.

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
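
For illustration only, here is a minimal Python sketch of the kind of line-level heuristic Jared describes: collect the apostrophe runs in a line, count bold and italic markers, and only then decide how to read them. It is not MediaWiki's actual PHP code. The names `render_quotes` and `_toggle` are hypothetical, and the rule for picking which bold run to reinterpret (prefer one that follows a single-letter word, then a longer word, then a space) is an assumption loosely modelled on the behaviour discussed in this thread.

```python
import re


def _toggle(stack, out, tag):
    """Open `tag` if it is closed; otherwise close it, keeping nesting valid."""
    if tag in stack:
        idx = stack.index(tag)
        reopened = stack[idx + 1:]          # tags opened inside `tag`
        for t in reversed(stack[idx:]):
            out.append(f"</{t}>")
        del stack[idx:]
        for t in reopened:                  # reopen the inner tags
            out.append(f"<{t}>")
            stack.append(t)
    else:
        out.append(f"<{tag}>")
        stack.append(tag)


def render_quotes(line):
    """Resolve ''italic'' and '''bold''' markers for a single line."""
    # Odd indices hold runs of two or more apostrophes, even indices hold text.
    parts = re.split(r"('{2,})", line)
    runs = range(1, len(parts), 2)

    # Normalise run lengths: surplus apostrophes become literal text.
    for i in runs:
        n = len(parts[i])
        if n == 4:
            parts[i - 1] += "'"
            parts[i] = "'''"
        elif n > 5:
            parts[i - 1] += "'" * (n - 5)
            parts[i] = "'''''"

    num_italic = sum(len(parts[i]) in (2, 5) for i in runs)
    num_bold = sum(len(parts[i]) in (3, 5) for i in runs)

    # Deferred decision: if both counts are odd, one bold marker is
    # reinterpreted as a literal apostrophe followed by an italic marker.
    if num_italic % 2 and num_bold % 2:
        first = {}
        for i in runs:
            if len(parts[i]) != 3:
                continue
            prev = parts[i - 1]
            x1, x2 = prev[-1:], prev[-2:-1]
            if x1 in ("", " "):
                kind = "space"
            elif x2 in ("", " "):
                kind = "single"             # follows a one-letter word, e.g. L
            else:
                kind = "multi"
            first.setdefault(kind, i)
        for kind in ("single", "multi", "space"):   # assumed preference order
            if kind in first:
                i = first[kind]
                parts[i - 1] += "'"
                parts[i] = "''"
                break

    out, stack = [], []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(part)
            continue
        tags = {2: "i", 3: "b", 5: "ib"}[len(part)]
        # Close the innermost open tag first to avoid empty reopen/close pairs.
        order = sorted(tags,
                       key=lambda t: stack.index(t) if t in stack else -1,
                       reverse=True)
        for tag in order:
            _toggle(stack, out, tag)
    for tag in reversed(stack):             # close anything left open at newline
        out.append(f"</{tag}>")
    return "".join(out)


print(render_quotes("L'''arc de triomphe''"))
print(render_quotes("'''Photo of L'''arc de triomphe'' by 'John''''"))
```

Under these assumptions the first call yields L'<i>arc de triomphe</i>, matching the result quoted above, and the second yields <b>Photo of L'<i>arc de triomphe</i> by 'John'</b>. The point of the sketch is that the meaning of a given ''' cannot be settled until the whole line has been counted, which is exactly what a context-free grammar has trouble expressing.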