> Damon Tkoch wrote:
>
> Hi,
> I'm trying to use mnogosearch as a link validator for a large number
> of sites, but I ran into a serious problem.
>
> Here's my configuration, in its simplest form:
>
> DBAddr ...
> DeleteBad no
> Index no
> CheckOnly NoMatch Regex ^http://barracuda\.enhydra\.org/.*\.html$
> Realm *
> URL http://barracuda.enhydra.org/index.html
>
> This works beautifully, checking the existence of links outside
> barracuda.enhydra.org but not following them. Except when indexer gets
> to this link, it follows it and starts indexing the other site:
>
> <A href="http://www.sys-con.com/java/readerschoice2001/">
>
> So now indexer is following through that page, all of its links, etc,
> and suddenly indexer is trying to check the whole world, ignoring the
> CheckOnly parameter.
>
> I've tried different versions of the CheckOnly, with or without regex,
> splitting it into multiple lines, etc... nothing seems to help. And
> indexer doesn't ignore the CheckOnly for all sites, just a few.
>
> Any ideas?
>
> (I first tried a Server-based method,
>
> DBAddr ..
> DeleteBad no
> Index no
> Follow site
> Server http://barracuda.enhydra.org/index.html
>
> but this does not validate links from this site to another.)

I think it should look like this (untested):

# do not build words index
Index no
# The site itself
Server http://barracuda.enhydra.org/index.html
# Other pages referenced from the site should be checked
# but we don't want to follow further from them
Follow no
Realm NoMatch http://barracuda.enhydra.org/*
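
To rule out the pattern itself, the regex from your CheckOnly line can be
tried outside of indexer. The Python sketch below is only my own
illustration, not part of mnogosearch: Python's re module stands in for
whatever regex library indexer uses, and the helper name is_check_only is
made up. It assumes that "NoMatch" means the rule applies to URLs that do
NOT match the pattern.

```python
import re

# Pattern copied from the CheckOnly line in the quoted configuration.
PATTERN = re.compile(r"^http://barracuda\.enhydra\.org/.*\.html$")


def is_check_only(url: str) -> bool:
    """Hypothetical helper: True when the URL does not match the pattern,
    i.e. the case a 'NoMatch' rule is assumed to apply to."""
    return PATTERN.search(url) is None


if __name__ == "__main__":
    urls = [
        "http://barracuda.enhydra.org/index.html",
        "http://barracuda.enhydra.org/",
        "http://www.sys-con.com/java/readerschoice2001/",
    ]
    for url in urls:
        label = "check only" if is_check_only(url) else "fetch and parse"
        print(f"{label:16} {url}")
```

If the sys-con URL comes out as "check only" here, the pattern is probably
not what lets indexer wander off, and the cause is more likely somewhere
else in the configuration.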