Hi!
I've been using ht://Dig for some time, but I just joined this mailing
list today.
I have a question regarding using regular expressions in "exclude". I
searched the mailing list archives and found a conversation from August
1998 where this was discussed.
Thu, 27 Aug 1998 10:06:39 -0600, Gordon Hopper ([EMAIL PROTECTED]) wrote:
>
> Maren S. Leizaola wrote:
> > I've not tried this myself but you must enter a regular
> > expression for the exclusion.
>
>
> I wondered whether it was a regular expression, because it
> doesn't say anything about it in the documentation.
>
>
> Also, I don't believe '/' is a special character in regex,
> unless it's used as the delimiter.
>
> Gordon
This suggests that regular expressions should work. However, I could
not get them to work on version 3.0.8b2. Then I found this:
On Thu, 27 Aug 1998 14:38:23 +0200 (MET DST), J. op den Brouw
([EMAIL PROTECTED]) wrote:
>
> On Wed, 26 Aug 1998, Gordon Hopper wrote:
>
> > htdig version 3.0.8b2
> >
> > Exclude doesn't seem to work at all.
> >
> > (exclude specifies a url, right?) so something like restrict=/~
> > exclude=/~ should return nothing, right? I want to be able to
> > exclude user home pages (which begin with a tilde) from my searches.
>
>
> Do you have a clean version of htdig 3.0.8b2? If so, the exclude
> is not working properly. There is a patch available at the
> htdig patch site (don't know it right now).
>
>
> --jesse
OK, after reading this I upgraded to version 3.1.0b4 today. "exclude"
now works, but not with regular expressions. I can get htsearch to
exclude a literal string anywhere in the URL, but it doesn't understand
regexps as far as I can tell.
Here's what I want to do:
I want to exclude all directory indices. In other words I do not want
the following document to be returned even if it does contain one or
more of the search words:
http://www.mydomain.com/archives/199808/
but I _do_ want all documents below that directory containing any of the
search words to be returned. For example, this document should be
returned:
http://www.mydomain.com/archives/199808/msg00003.html
I tried setting "exclude" to "/$" and "\/$" (the latter shouldn't really
be necessary, should it?) and ".*/$" with no effect. Directory indices
were still returned.
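To make the expectation concrete, here is a minimal Python sketch of the
two behaviours. It is not htsearch code: the function names are
hypothetical, and the literal matcher only models the "literal string
anywhere in the URL" behaviour I observed with 3.1.0b4. The regexp
matcher shows what I hoped "/$" would do.

```python
import re

# The two example URLs from this message.
URLS = [
    "http://www.mydomain.com/archives/199808/",
    "http://www.mydomain.com/archives/199808/msg00003.html",
]


def excluded_by_literal(url, pattern):
    """Assumed model of the observed behaviour: plain substring test."""
    return pattern in url


def excluded_by_regex(url, pattern):
    """What I expected: the pattern is searched as a regular expression."""
    return re.search(pattern, url) is not None


def filter_results(urls, pattern, matcher):
    """Keep only the URLs that the exclude pattern does not match."""
    return [u for u in urls if not matcher(u, pattern)]


if __name__ == "__main__":
    for pattern in ("/$", r"\/$", ".*/$"):
        print(pattern, "literal:", filter_results(URLS, pattern, excluded_by_literal))
        print(pattern, "regexp: ", filter_results(URLS, pattern, excluded_by_regex))
```

Under the substring model none of the three patterns excludes anything,
since no URL contains a literal "$", which matches what I see. Under the
regexp model all three drop the directory index and keep msg00003.html.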
Now what? Clues, hints, pointers and help needed!
Gunnar
--
Gunnar Helliesen | Bergen IT Consult AS | NetBSD/VAX on a uVAX II
Systems Consultant | Bergen, Norway | '86 Jaguar Sovereign 4.2
[EMAIL PROTECTED] | http://www.bitcon.no/ | '73 Mercedes 280 (240D)