> On Jul 12, 2014, at 2:52 PM, Michel Fortin <michel.for...@michelf.ca> wrote:
> [snip]
> When you have a question like this, just try it in Babelmark 2:
> http://johnmacfarlane.net/babelmark2/?normalize=1&text=%3Cdiv%3E

Yes, that's what we all do. And to answer your other question, notice that only 
two of the implementations on Babelmark2 failed. Remember, most of these 
implementations were written to be run on web servers. We can't have our web 
servers crashing just because a user submitted invalid markdown. What a parser 
doesn't understand, it just passes through. What it misunderstands, it garbles, 
but it is specifically designed to never choke.

As Michel alluded to, most parsers are simply a series of regular expression 
substitutions which are run in a predetermined order. If a regex never matches 
a part of the text, then that part passes through untouched. Yes, that means 
the HTML is parsed by regex -- which we all know is a bad idea -- but it is not 
really parsed in the way that browsers parse HTML. The regex just finds 
anything surrounded by angle brackets and ignores it. With the exception of the 
limited block level stuff, we don't even care if there are opening and/or 
closing tags. Yes, that can result in improperly nested stuff, but that is the 
author's fault and the parser should not bring the whole server down for that. 
The author can (should?) preview in a browser and fix it before publishing.
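
To make that concrete, here is a rough Python sketch of the approach. It is not 
taken from markdown.pl or any real port; the three rules, the TAG pattern, and 
the convert() name are made up for illustration, and real implementations have 
many more rules and handle block-level HTML separately.

```python
import re

# Hypothetical, heavily simplified rule list, applied in this fixed order.
RULES = [
    (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),
    (re.compile(r"\*(.+?)\*"), r"<em>\1</em>"),
    (re.compile(r"`(.+?)`"), r"<code>\1</code>"),
]

# Anything between angle brackets is treated as opaque; no nesting is checked.
TAG = re.compile(r"<[^>]+>")


def convert(text):
    stash = []

    def hide(match):
        # Swap the raw tag for a placeholder so later rules cannot touch it.
        stash.append(match.group(0))
        return f"\x00{len(stash) - 1}\x00"

    text = TAG.sub(hide, text)
    for pattern, replacement in RULES:
        # A rule that never matches leaves the text exactly as it was.
        text = pattern.sub(replacement, text)
    # Put the stashed tags back unchanged.
    return re.sub(r"\x00(\d+)\x00", lambda m: stash[int(m.group(1))], text)
```

Feeding it a lone `<div>` or a stray `</b>` simply returns the input as-is. 
Nothing in the loop can fail on unbalanced tags, because nothing ever tries to 
balance them.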

However, I should point out that while the above describes most parsers (as 
most are more or less direct ports of markdown.pl, which works this way), 
there are a few that use other methods under the hood. For example, a few 
generate a parse tree which is then fed into a renderer (I believe Pandoc works 
like that, which allows it to output many more formats than just HTML), but 
they are the rare exception.
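
For contrast, a toy version of the parse-tree approach might look something 
like the following. This is only a sketch of the general idea and makes no 
claim about how Pandoc is actually built; the Node class, the parse() and 
render() functions, and the two output formats are all invented for the 
example, and it skips HTML escaping entirely.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str  # "doc", "para", "emph" or "text"
    text: str = ""
    children: list = field(default_factory=list)


def parse_inline(s):
    # Split a paragraph into plain text runs and *emphasis* spans.
    nodes, pos = [], 0
    for m in re.finditer(r"\*(.+?)\*", s):
        if m.start() > pos:
            nodes.append(Node("text", s[pos:m.start()]))
        nodes.append(Node("emph", children=[Node("text", m.group(1))]))
        pos = m.end()
    if pos < len(s):
        nodes.append(Node("text", s[pos:]))
    return nodes


def parse(source):
    # Paragraphs are separated by blank lines.
    paras = [p.strip() for p in re.split(r"\n\s*\n", source) if p.strip()]
    return Node("doc", children=[Node("para", children=parse_inline(p)) for p in paras])


def render(node, fmt):
    # One tree, several writers: fmt is "html" or "latex".
    inner = "".join(render(child, fmt) for child in node.children)
    if node.kind == "text":
        return node.text
    if node.kind == "emph":
        return f"<em>{inner}</em>" if fmt == "html" else f"\\emph{{{inner}}}"
    if node.kind == "para":
        return f"<p>{inner}</p>\n" if fmt == "html" else f"{inner}\n\n"
    return inner
```

The point is that parse() knows nothing about output formats, so adding another 
one only means adding another branch (or another function) on the rendering 
side.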

Waylan
