Hi!

I've been using ht://Dig for some time, but I just joined this mailing
list today.

I have a question regarding using regular expressions in "exclude". I
searched the mailing list archives and found a conversation from August
1998 where this was discussed.


Thu, 27 Aug 1998 10:06:39 -0600, Gordon Hopper ([EMAIL PROTECTED]) wrote:
> 
> Maren S. Leizaola wrote: 
> > I've not tried this myself but you must enter a regular 
> > expression for the exclusion. 
> 
> 
> I wondered whether it was a regular expression, because it 
> doesn't say anything about it in the documentation. 
> 
> 
> Also, I don't believe '/' is a special character in regex, 
> unless it's used as the delimiter. 
> 
> Gordon 


This suggests that regular expressions should work. However, I could not
get "exclude" to work at all on version 3.0.8b2. Then I found this:


On Thu, 27 Aug 1998 14:38:23 +0200 (MET DST), J. op den Brouw
([EMAIL PROTECTED]) wrote:
> 
> On Wed, 26 Aug 1998, Gordon Hopper wrote: 
> 
> > htdig version 3.0.8b2 
> > 
> > Exclude doesn't seem to work at all. 
> > 
> > (exclude specifies a url, right?) so something like restrict=/~ 
> > exclude=/~ should return nothing, right? I want to be able to
> > exclude user home pages (which begin with a tilde) from my searches.
> 
> 
> Do you have a clean version of htdig 3.0.8b2? If so, the exclude
> is not working properly. There is a patch available at the 
> htdig patch site (don't know it right now). 
> 
> 
> --jesse 


OK, after reading this I upgraded to version 3.1.0b4 today. "exclude"
now works, but not with regular expressions. I can get htsearch to
exclude a literal string anywhere in the URL, but it doesn't understand
regexps as far as I can tell.

Here's what I want to do:

I want to exclude all directory indices. In other words I do not want
the following document to be returned even if it does contain one or
more of the search words:

http://www.mydomain.com/archives/199808/

but I _do_ want all documents below that directory containing any of the
search words to be returned. For example, this document should be
returned:

http://www.mydomain.com/archives/199808/msg00003.html

I tried setting "exclude" to "/$" and "\/$" (the latter shouldn't really
be necessary, should it?) and ".*/$" with no effect. Directory indices
were still returned.
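
To make the difference concrete, here is a small Python sketch (not
ht://Dig code; the function names are made up for illustration, and I am
only assuming from what I observed that "exclude" does a plain substring
test). It compares a literal substring match with a regular-expression
match on the two example URLs above:

```python
import re

# The two example URLs from this message.
URLS = [
    "http://www.mydomain.com/archives/199808/",
    "http://www.mydomain.com/archives/199808/msg00003.html",
]


def excluded_literal(url, pattern):
    """Substring test: what "exclude" appears to do (an assumption)."""
    return pattern in url


def excluded_regex(url, pattern):
    """Regex test: the behaviour I was hoping for."""
    return re.search(pattern, url) is not None


if __name__ == "__main__":
    for url in URLS:
        # Show, per URL, whether each style of matching would exclude it.
        print(url)
        print("  literal '/$':", excluded_literal(url, "/$"))
        print("  regex   '/$':", excluded_regex(url, "/$"))
```

If "exclude" really is a substring test, then "/$" would only match a
URL that contains a slash followed by a literal dollar sign, which would
explain why the directory index was never excluded.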

Now what? Clues, hints, pointers and help needed!

Gunnar

--
Gunnar Helliesen   | Bergen IT Consult AS  | NetBSD/VAX on a uVAX II
Systems Consultant | Bergen, Norway        | '86 Jaguar Sovereign 4.2
[EMAIL PROTECTED]   | http://www.bitcon.no/ | '73 Mercedes 280 (240D)