I suggest that you also choose a small value for max_hop_count, as you
probably don't need a complete index of all those 950 sites, but only their
principal pages.

For our "local community" index we use a max_hop_count as low as 1 to index
just the home pages of ~1000 servers.  We do a second dig with a
max_hop_count of 2 for a slightly deeper index of a chosen ~250 sites and
then merge the two.
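A rough sketch of how the two digs might be set up (max_hop_count,
start_url, database_dir and exclude_urls are real ht://Dig attributes; the
paths, URL-list files and merge invocation below are just placeholders, and
the exact merge procedure depends on your ht://Dig version):

```
# homepages.conf -- shallow dig: home pages only (hypothetical paths)
database_dir:   /opt/htdig/db/homepages
start_url:      `/opt/htdig/conf/all-servers.url`
max_hop_count:  1
exclude_urls:   /cgi-bin/ .cgi

# deep.conf -- slightly deeper dig of the chosen sites
database_dir:   /opt/htdig/db/deep
start_url:      `/opt/htdig/conf/chosen-sites.url`
max_hop_count:  2
exclude_urls:   /cgi-bin/ .cgi
```

Run htdig once with each config file, then combine the two database sets
with htmerge (recent 3.1.x releases support merging a second database via
the -m option, e.g. `htmerge -c deep.conf -m homepages.conf` -- check the
htmerge documentation for your version).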

--
David Adams
Computing Services
Southampton University


----- Original Message -----
From: "Clint Gilders" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, February 09, 2001 4:25 PM
Subject: Re: [htdig] saving data from looping dig that I killed


> > Given (b), I'd just say you should reindex, being careful to stay away
> > from nasty CGIs. A dig of 950 URLs should not take very long at all.
>
> I have added several patterns to exclude_urls: and I'll give re-indexing
> a try.
>
> It's not actually 950 URLs; it's 950 different sites that have asked to
> be in our search engine, so there are thousands and thousands of
> individual URLs.
>
> Thanks for the help
> Clint
>
> --
> Clint Gilders
> Servermaster Onlinehobbyist Inc.
> [EMAIL PROTECTED]
>
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> Information: http://lists.sourceforge.net/lists/listinfo/htdig-general
> FAQ: http://htdig.sourceforge.net/FAQ.html
>

