Greetings,
Preface:
I've been using htdig on two sites (http://www.brama.com/ and http://postup.brama.com) for the last four years with considerable success. Back in 2000 our goal was to make sure that we could search Ukrainian content in CP1251 encoding as well as English-language content.
Basic specs of the machine were a P3 800MHz CPU, 500MB RAM, and ~14 GB for web space and htdig databases. We used htdig 3.1.5 and Apache 1.3.2x.
In the last month we've upgraded both the machine (P4 2.4GHz, 1GB RAM, even more disk space, very little of it used) and the distro (from RH6.2 to Fedora Core 2). That means going from htdig 3.1.5 to 3.2.0b5-7 (both the htdig and htdig-web RPMs) and to Apache 2.0.50.
The process required some re-learning but was well worth it. After reviewing several documents we've managed to build a searchable database.
I have several questions, but I'll send them out in separate emails.
Here's the first:
Our primary site comes in at ~ 65,000 indexable items. Under htdig 3.1.5 the site could be indexed in an hour or two, depending on the load imposed by other services.
The speed of indexing (using htdig) in 3.2.0b5 is substantially affected by the setting of 'word_cache_size'. I started and stopped htdig several times, making a large increase each time: from 10000000 to 40000000 to 80000000. Finally, when I let htdig run to completion, it took ~6 1/2 hours. With word_cache_size set to 80000000 it flew until about 20,000 items were indexed, and then the speed began to deteriorate.
So, how high can word_cache_size be set without red-lining?
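To put the numbers above side by side, here is a rough back-of-the-envelope sketch in Python. It only uses figures quoted in this message (~65,000 items, an hour or two under 3.1.5, ~6 1/2 hours under 3.2.0b5). The helper names are made up for illustration, and treating the cache setting as a byte count is an assumption on my part, not something I have confirmed.

```python
# Figures quoted in this message; nothing here is measured by the script.
ITEMS = 65_000                 # approximate number of indexable items
HOURS_3_2 = 6.5                # full run under htdig 3.2.0b5
HOURS_3_1_RANGE = (1.0, 2.0)   # "an hour or two" under htdig 3.1.5


def items_per_hour(items, hours):
    """Average indexing throughput over a whole run."""
    return items / hours


def slowdown(new_hours, old_hours):
    """How many times longer the new run took than the old one."""
    return new_hours / old_hours


def bytes_to_mib(n):
    """Convert a byte count to MiB (ASSUMES the setting is in bytes)."""
    return n / (1024 * 1024)


if __name__ == "__main__":
    print("3.2.0b5 throughput:", items_per_hour(ITEMS, HOURS_3_2), "items/hour")
    for old in HOURS_3_1_RANGE:
        print("slowdown vs a", old, "hour run:", slowdown(HOURS_3_2, old))
    for setting in (10_000_000, 40_000_000, 80_000_000):
        print(setting, "is about", round(bytes_to_mib(setting), 1), "MiB")
```

If the byte assumption holds, even the largest value tried is well under a tenth of the 1GB of RAM in the new machine.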
Thanks!
Max Pyziur [EMAIL PROTECTED]
_______________________________________________
ht://Dig general mailing list: <[EMAIL PROTECTED]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.) https://lists.sourceforge.net/lists/listinfo/htdig-general

