Thanks for the answers to my question below. But I'm still not clear
on something. I attempted to index a remote site, in this case lotus.com.
Now, I have no idea how many pages that is. But I let the index process run 
for three days and by the end of three days, Linux was page-swapping like a 
banshee and was becoming substantially unresponsive. Given that that was 
only one site, and I'm thinking about indexing a lot more, I've been trying 
to figure out what I need to do to make the hardware/software able to 
handle it. Right now, I'm thinking the process is too big. Can htdig and/or 
htmerge running on a 256MB or 384MB machine handle indexing/merging sites
like lotus.com or other large sites, or is this beyond the scope of this 
tool? And, if we don't know the size of external sites, how can I go about 
thinking through this issue?
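
One thing I can think of, short of knowing a site's size in advance, is to
cap the crawl itself in htdig.conf. A rough sketch, assuming the standard
attributes behave as documented; the www.lotus.com URL and all the numbers
are illustrative guesses on my part, not tested values:

```
# Sketch only: values are untested guesses, not recommendations.

# Assumed entry point for the site.
start_url:      http://www.lotus.com/

# Stay on the one site rather than following links elsewhere.
limit_urls_to:  ${start_url}

# Stop following links this many hops away from start_url.
max_hop_count:  4

# Read at most this many bytes of any one document.
max_doc_size:   100000

# Skip URLs matching these patterns (dynamic pages).
exclude_urls:   /cgi-bin/ .cgi
```

That wouldn't index the whole site, but it would at least bound how far a
single dig can grow.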

-- DG


 >So, I'm finishing up pre-deployment testing and I seem to have run into
 >limits of the system. I'm running htdig on a 256MB PIII, Mandrake 7.2
 >system. When I just index our own sites, digging is fast and the system
 >seems quite responsive. But, ideally, I'd like to dig 40-60 sites per topic
 >(say, Lotus Domino sites) and then maybe 3 or more topics. But it seems
 >that although this box has a large amount of RAM (it maxes at 384M) and a
 >40GB disk, the digging process is just too memory intensive and everything
 >slows down to a crawl.
 >
 >So, here's the question: can I index large sites (like, say, lotus.com)? Or
 >are we just going to run into machine limits and I'm best off using htdig
 >for my own sites and leave the dream of indexing outside sites to a later
 >project?
 >
 >If I'm missing something, or there's an ideal configuration for attempting
 >this approach, please enlighten me.
 >
 >Thanks!
 >


