On Sat, 2004-01-24 at 03:23, Lachlan Andrew wrote:
> Given that several people commented on how slow digging now is, should 
> we look at reducing the feature set for future releases?  That will:
> a) Increase the speed
> b) Reduce memory consumption
> c) Reduce the testing required
> d) Reduce the amount of code which must be maintained
> e) Reduce the potential for conflicts of combinations of features.

 Well, as I'm finishing up our new search tool, I did my first index
over HTTP today (most of what I've been working on so far involved
indexing small local files). I was surprised at how slow the
spidering/indexing really was (well, still is; it hasn't finished yet).

 It has taken about 11 hours to index 10k pages so far. In my last dig
under 3.1.6, I did 30k+ pages in 1 hour and 41 minutes! So, I'm starting
to suspect that I did something really wrong and/or my config is broken.
I noticed that it was slow right away and tried adding
'wordlist_cache_size: 100000000' to my config (although, if it was slow
out of the gate, caching shouldn't make that much of a difference).
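
For context, the relevant part of my conf looks roughly like the sketch
below (the URL and paths are placeholders rather than my actual setup;
the wordlist_cache_size line is the only thing I added today):

    # htdig.conf (sketch; values are placeholders)
    database_dir:        /var/lib/htdig/db
    start_url:           http://www.example.com/
    limit_urls_to:       ${start_url}
    # added today, trying to speed up word list updates
    wordlist_cache_size: 100000000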

This is currently running on a RedHat 8.0 machine. I'm going to test
again later tonight under RedHat Enterprise AS 3.0 (including a
re-compile). Any quick tips/optimizations anyone can think of that I
might try before I continue?

Cheers,

Chris

-- 
Christopher Murtagh
Enterprise Systems Administrator
ISR / Web Communications Group 
McGill University
Montreal, Quebec
Canada

Tel.: (514) 398-3122
Fax:  (514) 398-2017

