> Damon Tkoch wrote:
> 
> Hi,
> I'm trying to use mnogosearch as a link validator for a large number
> of sites, but I ran into a serious problem.
> 
> Here's my configuration, in its simplest form:
> 
> DBAddr ...
> DeleteBad no
> Index no
> CheckOnly NoMatch Regex ^http://barracuda\.enhydra\.org/.*\.html$
> Realm *
> URL http://barracuda.enhydra.org/index.html
> 
> This works beautifully, checking the existence of links outside
> barracuda.enhydra.org but not following them.  Except when indexer gets
> to this link, it follows it and starts indexing the other site.
> 
> <A href="http://www.sys-con.com/java/readerschoice2001/">
> 
> So now indexer is following through that page, all of its links, etc,
> and suddenly indexer is trying to check the whole world, ignoring the
> CheckOnly parameter.
> 
> I've tried different versions of the CheckOnly, with or without regex,
> splitting it into multiple lines, etc... nothing seems to help.  And
> indexer doesn't ignore the CheckOnly for all sites, just a few.
> 
> Any ideas?
> 
> (I first tried a Server-based method,
> 
> DBAddr ..
> DeleteBad no
> Index no
> Follow site
> Server http://barracuda.enhydra.org/index.html
> 
> but this does not validate links from this site to another.)


I think it should look like this (but I haven't tested it):



# do not build words index
Index no

# The site itself
Server http://barracuda.enhydra.org/index.html

# Other pages referenced from the site should be checked
# but we don't want to follow further from them

Follow no
Realm NoMatch http://barracuda.enhydra.org/*
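
The intent of the two rules above can be sketched outside of mnoGoSearch.
The Python snippet below is only an illustration of the policy (follow
links on the site, check everything else without following); it is not
how indexer evaluates its configuration. The names SITE_PATTERN and
classify are made up for this example.

```python
import fnmatch

# Wildcard taken from the Realm line above; "*" matches any characters.
SITE_PATTERN = "http://barracuda.enhydra.org/*"


def classify(url: str) -> str:
    """Return 'follow' for pages on the site, 'check-only' for all others."""
    if fnmatch.fnmatchcase(url, SITE_PATTERN):
        return "follow"      # on-site page: fetch it and follow its links
    return "check-only"      # off-site page: verify it exists, go no further


if __name__ == "__main__":
    for url in (
        "http://barracuda.enhydra.org/index.html",
        "http://www.sys-con.com/java/readerschoice2001/",
    ):
        print(url, "->", classify(url))
```

Under this policy the sys-con.com link from the original report falls
into the check-only group, which is the behaviour being asked for.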