[htdig-dev] profiles and unicode

Lachlan Andrew Sun, 25 Apr 2004 06:26:34 -0700

Greetings all,

When I profile the code, I get quite different results from Joe, 
presumably because I don't use an external parser, and my host is 
in /etc/hosts.


1. Could anyone confirm that  gethostbyname  is really as expensive as 
Joe's profile suggests?  If so, I'll write a cache for it.  The 
profile looks a bit suspect there, because gethostbyname is reported 
as only being called a handful of times...
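
For illustration only, a minimal sketch of what such a cache could look like. The class name HostCache and the injected resolver callback are hypothetical and not taken from the ht://Dig sources. In practice the callback would presumably wrap gethostbyname; it is injected here so the sketch can be exercised without a network. The sketch has no expiry or size limit.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper, not part of ht://Dig: memoises host lookups so the
// (possibly slow) resolver runs at most once per distinct host name.
class HostCache {
public:
    // Assumption: the resolver maps a host name to some string form of its
    // address. A real one would likely wrap gethostbyname().
    using Resolver = std::function<std::string(const std::string&)>;

    explicit HostCache(Resolver resolver) : resolve_(std::move(resolver)) {}

    // Returns the cached answer, calling the resolver only on a miss.
    const std::string& lookup(const std::string& host) {
        auto it = cache_.find(host);
        if (it == cache_.end())
            it = cache_.emplace(host, resolve_(host)).first;
        return it->second;
    }

    // Number of distinct hosts resolved so far.
    std::size_t size() const { return cache_.size(); }

private:
    Resolver resolve_;
    std::map<std::string, std::string> cache_;
};
```

Whether this helps depends on how often the same host is looked up in one run, which is exactly what the profile would need to confirm.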

2. On my system, about 50% of the time is spent in  HTML::parse().  It 
looks ripe for optimisation.  In particular, does anyone know why two 
passes are made through the document?  The first just seems to strip 
comments/noindex and decode SGML tags.  If I optimise this, the most 
efficient way would be to assume 8-bit characters (including UTF-8).  
Are we still planning to make ht://Dig Unicode compliant?  If so, do 
we plan to use wide characters or UTF-8?
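
To illustrate why a byte-oriented scan is compatible with UTF-8: every byte of a multi-byte UTF-8 sequence has its high bit set, so ASCII delimiters such as "<!--" can never match inside a multi-byte character. The sketch below strips comments in a single pass over the bytes. The function name strip_comments is hypothetical, this is not the existing HTML::parse() code, and it ignores noindex handling and SGML decoding.

```cpp
#include <string>

// Hypothetical helper, not taken from ht://Dig: removes <!-- ... --> spans
// while treating the input purely as 8-bit bytes. Non-ASCII bytes (e.g.
// UTF-8 sequences) are copied through untouched.
std::string strip_comments(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::string::size_type pos = 0;
    while (pos < in.size()) {
        std::string::size_type open = in.find("<!--", pos);
        if (open == std::string::npos) {
            // No more comments: copy the remainder and stop.
            out.append(in, pos, std::string::npos);
            break;
        }
        // Copy the text preceding the comment.
        out.append(in, pos, open - pos);
        std::string::size_type close = in.find("-->", open + 4);
        if (close == std::string::npos)
            break;  // Unterminated comment: drop the rest (a design choice).
        pos = close + 3;
    }
    return out;
}
```

A wide-character design would need the same logic over a different character type, so the choice between wide characters and UTF-8 mainly affects the storage type, not the shape of the scan.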

Cheers,
Lachlan

-- 
[EMAIL PROTECTED]
ht://Dig developer DownUnder  (http://www.htdig.org)



