According to Malcolm Austen:
> On Fri, 4 May 2001, ROLLE WALRAVEN wrote:
> + Hi guys, i wanted to know whether the limit_urls_to supported some sort of
> + "regular expression" type value. I am using htdig on a large web portal and
> + people have written their links between www.xyz.com and xyz.com depending on
> + department standard ... the problem is I can't use either one without one or the
> + other being excluded ... basically is there a way to say "limit_urls_to:
> + www.xyz.com OR xyz.com"? Any other suggestions for using htdig on a large site?
>
> limit_urls_to takes a list of strings, not just one string, but in your
> case you only need
>
> limit_urls_to: xyz.com
>
> since www.xyz.com also matches that string and will be indexed.

This is actually a fairly simple case, where one substring matches
all the hostnames required. In more complex cases, you can specify several
substrings, separated by spaces, and a URL is allowed if it matches any
one of them.
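
As a sketch of the multi-substring form applied to the question above (the
http:// prefixes and trailing slashes are my assumption about how the links
on the site are written):

```
# Allow a URL if it contains either of these substrings.
limit_urls_to: http://www.xyz.com/ http://xyz.com/
```

Including the scheme and the trailing slash keeps a host such as notxyz.com
from matching by accident, which the bare substring xyz.com would allow.
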
It's actually pretty rare that you'd need to resort to regular expressions,
but for those cases, they are supported in the 3.2.0b3 beta.
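
For reference, here is a sketch of what the regular-expression form might
look like in 3.2.0b3. I am assuming that a pattern is marked by enclosing it
in square brackets within the list, as I recall from the 3.2 attribute
documentation; check attrs.html for your version before relying on it.

```
# Assumed 3.2 syntax: [ ] marks a regular expression in the list.
limit_urls_to: [^http://(www\.)?xyz\.com/]
```
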
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
FAQ: http://htdig.sourceforge.net/FAQ.html