Hiya,

I am having some major performance problems with htdig
and I am looking for a bit of guidance.

I have an email archive within my company which has 
about 300 mailing lists being archived to it via 
MHonArc. We have approximately 110k emails in the 
archive each being its own html file. Probably 
99% of these emails are just standard size 4-10k
emails. This archive has been running for 4 months
now. htdig was working great for the first month
or so: 1-2 hours tops. After 4 months it's now 
up to 52 hours to do either an updatedig or a 
rundig. The files it is generating are only ~300 MB.
Searches work fine (when the dig finally finishes). 
Something seems quite wrong in my mind :) 
The archive and htdig both run on the same 
system, which is a dual-processor Sun Ultra 2 running Solaris 2.7 
with 1.5 GB of RAM. I have Apache running on the same 
host with 20+ (up to 256 max) servers started by 
default. When watching the Apache logs while it 
digs, it seems to be going awfully slow: one 
or two queries every 30 seconds. I am running 
version 3.1.5 of htdig. The system is not taxed 
at all while the dig is going on. 

Should 110k web pages really take this long? I would 
think a few hours tops is more like what it should 
be. 
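
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. The function name is just for illustration, and it assumes every
one of the ~110k pages is fetched once per dig, which may not hold for an
updatedig:

```python
# Rough throughput estimate: how many pages per second a dig must
# sustain to finish within a given number of hours.
# Assumption: all pages are fetched exactly once per dig.

PAGES = 110_000  # approximate number of archived emails


def pages_per_second(pages, hours):
    """Average fetch rate needed to process `pages` in `hours`."""
    return pages / (hours * 3600)


for label, hours in [
    ("current dig (52 h)", 52),
    ("target (under 24 h)", 24),
    ("first month (about 2 h)", 2),
]:
    print(f"{label}: {pages_per_second(PAGES, hours):.2f} pages/sec")
```

So even the 24-hour target only needs a bit over one page per second on
average, which does not seem like much to ask of this hardware.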

Anyone have any ideas? I am really stumped and 
need to get this dig well below 24 hours. 

Thanks.. Mike


------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.
List archives:  <http://www.htdig.org/mail/menu.html>
FAQ:            <http://www.htdig.org/FAQ.html>
