Hi,
I turned on gprof and ran htdig to index my webserver. This run
covers only 140 documents, but I find the result pretty interesting:
the pitfalls I mentioned don't even show up here. The big slowdowns
are in the word database code (and of course the database itself).
Next I'm going to profile a more representative sample: the entire
htdig.org website including mailing list archives (5000+ for
dev.htdig.org and 9000+ for www.htdig.org).
This was with wordlist_compress turned on, compression_level: 6 and
all other attributes set to their defaults. (Yes, that means "only"
10MB of wordlist cache).
The top offenders:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  9.03      0.96     0.96 11306515     0.00     0.00  WordKey::Info(void)
  7.90      1.80     0.84 16859071     0.00     0.00  WordKeyInfo::Instance(void)
  7.15      2.56     0.76                             __bam_cmp
  7.15      3.32     0.76                             __bam_search
  4.89      3.84     0.52  9384984     0.00     0.00  WordKey::NFields(void)
  4.89      4.36     0.52      136     3.82    39.15  HTML::parse(Retriever &, URL &)
  4.89      4.88     0.52                             memp_fget
  3.39      5.24     0.36                             memp_fput
  3.10      5.57     0.33  1041808     0.00     0.00  WordKey::Clear(void)
  3.01      5.89     0.32   886453     0.00     0.00  Object::~Object(void)
  2.45      6.15     0.26   736310     0.00     0.00  WordKey::CopyFrom(WordKey const &)
  2.16      6.38     0.23  2794518     0.00     0.00  WordKey::Set(int, unsigned int)
  2.16      6.61     0.23   146334     0.00     0.04  Retriever::got_word(char const *, int, int)
  2.16      6.84     0.23                             __bam_c_put
  1.79      7.03     0.19  2489038     0.00     0.00  WordKey::Get(int) const
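
To make the "word database code vs. the database itself" split concrete, here is a rough sketch that adds up the self seconds from the rows above by symbol prefix and works out a per-call cost. The ProfileRow struct and the helper names (top_offenders, self_seconds_for, ns_per_call) are made up for illustration and are not part of htdig or gprof. Treating the __bam_ and memp_ symbols as "the database" is my grouping, and a calls value of 0 just means gprof printed no call count for that row.

```cpp
#include <string>
#include <vector>

// One row of the flat profile above (hypothetical helper type).
struct ProfileRow {
    double self_seconds;  // "self seconds" column
    long calls;           // "calls" column; 0 where gprof printed none
    std::string name;     // symbol name
};

// The fifteen rows listed above, copied by hand.
inline std::vector<ProfileRow> top_offenders() {
    return {
        {0.96, 11306515, "WordKey::Info(void)"},
        {0.84, 16859071, "WordKeyInfo::Instance(void)"},
        {0.76, 0, "__bam_cmp"},
        {0.76, 0, "__bam_search"},
        {0.52, 9384984, "WordKey::NFields(void)"},
        {0.52, 136, "HTML::parse(Retriever &, URL &)"},
        {0.52, 0, "memp_fget"},
        {0.36, 0, "memp_fput"},
        {0.33, 1041808, "WordKey::Clear(void)"},
        {0.32, 886453, "Object::~Object(void)"},
        {0.26, 736310, "WordKey::CopyFrom(WordKey const &)"},
        {0.23, 2794518, "WordKey::Set(int, unsigned int)"},
        {0.23, 146334, "Retriever::got_word(char const *, int, int)"},
        {0.23, 0, "__bam_c_put"},
        {0.19, 2489038, "WordKey::Get(int) const"},
    };
}

// Sum self seconds over every row whose symbol starts with one of the prefixes.
inline double self_seconds_for(const std::vector<ProfileRow>& rows,
                               const std::vector<std::string>& prefixes) {
    double total = 0.0;
    for (const auto& row : rows) {
        for (const auto& prefix : prefixes) {
            if (row.name.compare(0, prefix.size(), prefix) == 0) {
                total += row.self_seconds;
                break;  // count each row at most once
            }
        }
    }
    return total;
}

// Average self time per call in nanoseconds; 0.0 when no call count is known.
inline double ns_per_call(const ProfileRow& row) {
    if (row.calls <= 0) return 0.0;
    return row.self_seconds / static_cast<double>(row.calls) * 1e9;
}
```

Calling self_seconds_for(top_offenders(), {"WordKey"}) gives about 3.33 seconds for the WordKey and WordKeyInfo methods, and self_seconds_for(top_offenders(), {"__bam_", "memp_"}) gives about 2.63 seconds for the database routines, out of the 7.03 seconds covered by these rows. ns_per_call on the first row comes to roughly 85 ns, which is why the ms/call columns read 0.00: each call is cheap, there are just millions of them.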
-Geoff
gprof.out.gz