On Sat, 2004-01-24 at 03:23, Lachlan Andrew wrote:
> Given that several people commented on how slow digging now is, should
> we look at reducing the feature set for future releases? That will:
> a) Increase the speed
> b) Reduce memory consumption
> c) Reduce the testing required
> d) Reduce the amount of code which must be maintained
> e) Reduce the potential for conflicts of combinations of features.
Well, as I'm finishing up our new search tool, I just did my first index over http today (most of what I had been working on involved indexing small local files). I was surprised at how slow the spidering/indexing really was (well, still is; it hasn't finished yet). It has taken about 11 hours to index 10k pages so far. In my last dig under 3.1.6, I did 30k+ pages in 1 hour and 41 minutes! So, I'm starting to suspect that I did something really wrong and/or my config is broken.

I noticed that it was slow right away and tried adding 'wordlist_cache_size: 100000000' to my config (although, since it was slow right out of the gate, a bigger cache shouldn't make that much of a difference).

This is currently running on a RedHat 8.0 machine. I'm going to test again later tonight under RedHat Enterprise AS 3.0 (including a re-compile). Any quick tips/optimizations anyone can think of that I might try before I continue?

Cheers,

Chris

--
Christopher Murtagh
Enterprise Systems Administrator
ISR / Web Communications Group
McGill University
Montreal, Quebec
Canada

Tel.: (514) 398-3122
Fax:  (514) 398-2017
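For reference, wordlist_cache_size is a byte count set in the htdig.conf used for the dig. A minimal sketch of how that part of a config might look (the attribute names are the standard 3.2 ones; the paths and URL below are placeholders only, not anyone's actual setup):

    # htdig.conf excerpt -- illustrative only; paths and URL are placeholders
    database_dir:         /var/lib/htdig/db
    start_url:            http://www.example.org/
    limit_urls_to:        ${start_url}
    # In-memory cache (in bytes) for the word database while digging;
    # a larger cache means fewer intermediate writes to disk.
    wordlist_cache_size:  100000000

A larger cache mainly helps once many words have been collected, so it would not be expected to change how fast the dig starts out.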
