On Fri, 18 Oct 2002 at 11:09:43 -0500, Searcher wrote:

> >../index -N 80 -R 64
> >
> >will index handle all this? Will index eat up all the available
> >memory (2GB) trying to load all these URLs in memory? I've had problems with
> 
> My first run ever with aspseek was 'index -N 100' and it lasted three days, then 
> died. It had plenty of memory, yet I can't see why it died. I then ran 'index -D' 
> and, once done, restarted with 'index -N 25'; it's been running non-stop for 
> over a week now with 25GB of indexed material.

Kir's question has a point.  How many individual sites are you indexing?
Are your 10,000+ URLs individual sites, or are they URLs of a single site?

Actually, Kir, I've been working on tidying up mutex locking around
calls to GetNextLink(), as on larger databases (~20,000,000 URLs) it
seems index can get locked in queueing mode for significant periods of
time (depending on the distribution of URLs over time).  I now lock
within AddUrls() around the iteration through the CIntSet of URLs to
go into the queue.  It provides a window where idle threads can pop
the next document regardless of the number of queued sites, and seems
to help quite a bit.
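
In case it helps to see the shape of the change, here is a minimal
sketch of that locking pattern.  It is not the actual ASPseek source:
the UrlQueue class, the member names, and the use of std::mutex and
std::condition_variable are my own illustration of holding the lock
only around the iteration that feeds the queue, so idle threads can
slip in and pop the next document before and after the batch.

// Sketch only -- illustrative names, not ASPseek code.
#include <condition_variable>
#include <mutex>
#include <queue>
#include <set>

class UrlQueue {
public:
    // Producer side: the mutex is held only around the loop that moves
    // a batch of URL ids into the queue, then released immediately.
    void AddUrls(const std::set<int>& urls) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            for (int id : urls)
                m_queue.push(id);
        }   // lock released here -- idle threads get their window
        m_cond.notify_all();
    }

    // Consumer side: an idle indexing thread pops the next document id,
    // holding the mutex only for the pop itself.
    int PopNext() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [this] { return !m_queue.empty(); });
        int id = m_queue.front();
        m_queue.pop();
        return id;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::queue<int> m_queue;
};

The point is simply that the lock is never held across any network or
disk work, only across the queue manipulation itself, so consumers are
blocked for at most the duration of one batch insert.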


Matt.
