On 8/27/07, Andrea Censi <[EMAIL PROTECTED]> wrote: > Maruku takes 8 seconds for parsing (on my PowerBook G4 1.5GHz). > (please note that Ruby, per se, is much slower than Perl) > > I guess that if you plot [time for parsing] versus [length of the > document], you get a curve which grows more than linearly for > Markdown.pl and PHP Markdown. > > This is the same behaviour that I observed in Bluecloth (straight port > of Markdown.pl in Ruby) -- if I remember well, time was O(length^2). > By comparison, Maruku, and other real parsers, takes O(length).
One implementation of the current Markdown algorithm can actually beat Maruku's time: [Showdown] is a straight JavaScript port of Markdown.pl, but it converts the TextMate in under 4 seconds (in Firefox, on an ancient machine). [Showdown]: http://www.attacklab.net/showdown-gui.html John Gruber's focus in the reference implementation has been on functionality -- not speed. That's good news, because it means there's a lot of room for improvement. I didn't do any clever optimizations with Showdown; I just benchmarked, then fixed the glaring problems. It's not so easy to speed up the Perl version right now, because the dog-slow text::balanced parser would eclipse any simple fixes. But the PHP port shouldn't be hard too speed up. In fact, a good band-aid for the Perl version might be to backport Michael's variant of the old HTML parsing code -- which builds a big regexp to handle balanced tags up to four levels deep. (My gut tells me we can write a simple HTML parser that's even faster than that, and I'll take a crack at it as soon as I find the time.) Once you have a fast HTML parser, it's easy to optimize the rest. So, big-O aside, Markdown's current algorithm can perform a lot better in the real world than the Perl and PHP implementations indicate; in fact, it sounds like it can keep up with Maruku on reasonably-sized documents. But don't get me wrong; I'm actually strongly in favor of switching to a traditional parsing model -- even though I'd have to throw out all the work I did porting the current algorithm to JavaScript. I'd originally planned to write a Markdown parser from scratch, but quickly realized it would be impossible to get 100% compatibility without doing a direct, line-by-line port. And that's a bad sign. John Fraser _______________________________________________ Markdown-Discuss mailing list Markdown-Discuss@six.pairlist.net http://six.pairlist.net/mailman/listinfo/markdown-discuss