Re: Benchmarks with TextMate's manual

John Fraser Mon, 27 Aug 2007 17:44:01 -0700

On 8/27/07, Andrea Censi <[EMAIL PROTECTED]> wrote:
> Maruku takes 8 seconds for parsing (on my PowerBook G4 1.5GHz).
> (please note that Ruby, per se, is much slower than Perl)
>
> I guess that if you plot [time for parsing] versus [length of the
> document], you get a curve which grows more than linearly for
> Markdown.pl and PHP Markdown.
>
> This is the same behaviour that I observed in Bluecloth (straight port
> of Markdown.pl in Ruby) -- if I remember well, time was O(length^2).
> By comparison, Maruku, and other real parsers, takes O(length).


One implementation of the current Markdown algorithm can actually beat
Maruku's time: [Showdown] is a straight JavaScript port of
Markdown.pl, but it converts the TextMate in under 4 seconds (in
Firefox, on an ancient machine).

  [Showdown]: http://www.attacklab.net/showdown-gui.html

John Gruber's focus in the reference implementation has been on
functionality -- not speed.  That's good news, because it means
there's a lot of room for improvement.  I didn't do any clever
optimizations with Showdown; I just benchmarked, then fixed the
glaring problems. It's not so easy to speed up the Perl version right
now, because the dog-slow text::balanced parser would eclipse any
simple fixes.  But the PHP port shouldn't be hard too speed up.  In
fact, a good band-aid for the Perl version might be to backport
Michael's variant of the old HTML parsing code -- which builds a big
regexp to handle balanced tags up to four levels deep.  (My gut tells
me we can write a simple HTML parser that's even faster than that, and
I'll take a crack at it as soon as I find the time.)  Once you have a
fast HTML parser, it's easy to optimize the rest.

So, big-O aside, Markdown's current algorithm can perform a lot better
in the real world than the Perl and PHP implementations indicate; in
fact, it sounds like it can keep up with Maruku on reasonably-sized
documents.  But don't get me wrong; I'm actually strongly in favor of
switching to a traditional parsing model -- even though I'd have to
throw out all the work I did porting the current algorithm to
JavaScript.  I'd originally planned to write a Markdown parser from
scratch, but quickly realized it would be impossible to get 100%
compatibility without doing a direct, line-by-line port.  And that's a
bad sign.

John Fraser
_______________________________________________
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss

Re: Benchmarks with TextMate's manual

Reply via email to