Well, I have some rough numbers from back in 1998, so they reflect the
HTML::Parser version of that time.

HTML document size: 116,997 bytes
Links: 3398 HREFs (plus other tags that weren't relevant to my numbers at
the time)
Machine: SPARC Ultra 2, 2x167MHz

One pass through the document with HTML::Parser: ~40 seconds
One pass through with the home-grown parser:     ~18 seconds

Sorry these numbers aren't more precise, but we weren't recording serious
benchmarks at the time...
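
For anyone who wants to repeat this kind of comparison, here is a minimal
sketch of the idea, not the code we actually ran. It is written in Python
rather than Perl, with the standard library's html.parser standing in for
HTML::Parser and a deliberately narrow regular expression standing in for
the home-grown parser. The generated test document, the helper names, and
the pattern are all assumptions made for illustration; only the link
count (3398) comes from the numbers above.

```python
import re
import time
from html.parser import HTMLParser

# Hypothetical test document: 3398 simple links, echoing the link count
# quoted above. The real 1998 document is not available.
DOC = "<html><body>\n" + "".join(
    f'<p><a href="/page/{i}.html">link {i}</a></p>\n' for i in range(3398)
) + "</body></html>\n"


class HrefCollector(HTMLParser):
    """General-purpose parser approach: visit every start tag and keep
    the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def hrefs_with_parser(doc):
    collector = HrefCollector()
    collector.feed(doc)
    collector.close()
    return collector.hrefs


# Special-purpose approach: a narrow pattern that only understands a
# quoted href attribute inside an <a ...> tag. It is an assumption that
# the input is regular enough for this to be sufficient.
HREF_RE = re.compile(
    r'<a\b[^>]*?\bhref\s*=\s*(["\'])(.*?)\1',
    re.IGNORECASE | re.DOTALL,
)


def hrefs_with_regex(doc):
    return [match.group(2) for match in HREF_RE.finditer(doc)]


def time_one_pass(func, doc):
    """Return (elapsed seconds, result) for a single pass over doc."""
    start = time.perf_counter()
    result = func(doc)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    for label, func in (
        ("general parser", hrefs_with_parser),
        ("narrow regex", hrefs_with_regex),
    ):
        elapsed, hrefs = time_one_pass(func, DOC)
        print(f"{label}: {len(hrefs)} hrefs in {elapsed:.4f} s")
```

A single pass per approach mirrors how the numbers above were taken, so
treat any timings from this as rough too. The absolute figures will not
be comparable to the 1998 ones, and the regex version only holds up on
markup as regular as the generated document.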

Regards,
Christian

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Dave Hodgkinson
> Sent: Thursday, November 11, 1999 4:31 AM
> To: Gisle Aas; 'ModPerl'
> Subject: Re: Trying not to re-invent the wheel
>
>
>
> Gisle Aas <[EMAIL PROTECTED]> writes:
>
> >
> > "Christian Gilmore" <[EMAIL PROTECTED]> writes:
> >
> > > I found that writing my own parser to fit my specific need was far
> > > and away the fastest thing I could do. It really depends upon your
> > > specific application. HTML::Parser is nice if you want to see the
> > > structure of the document you're parsing but is just too slow to use
> > > for wresting particular tags from a document...
> >
> > True. This was the main reason I started work on a new XS based
> > HTML::Parser a week ago.  It should make much of the performance
> > argument go away.  Still, most of the HTML that I have ever needed
> > to parse or manipulate is regular enough to make perl REs good
> > enough.
>
> Do you have any numbers on speed?
>
> Ta,
>
> Dave
>
>
> --
> David Hodgkinson, Technical Director, Sift PLC
> http://www.sift.co.uk
> Editor, "The Highway Star"
> http://www.deep-purple.com
> Dave endorses Yanagisawa saxes, Apache, Perl, Linux, MySQL,
> emacs, gnus
>
