According to Joe R. Jah:
> On Tue, 7 Dec 1999, Gilles Detillieux wrote:
> > I can't understand why you didn't run into this with htdig 3.1.3 - the
> > problem definitely was there then and in previous releases. Did you
> > add .shtml/ to exclude_urls in the config for 3.1.3, but not 3.1.4?
>
> I can't understand it either. No, I never had .shtml/ in my exclude_urls.
Is it possible that you were getting the extra .shtml/ stuff, but just
weren't detecting it in your searches, or are you sure they never came up?
> This brings up an interesting point: Under 3.1.3 a search for the word
> Majordomo on my site would report 28 results; under 3.1.4 without .shtml/
> in exclude_urls it would report 54, including the SSI-mangled URLs.
> Under 3.1.4 with .shtml/ in exclude_urls it reports 47 results;-/
>
> That means 3.1.4 finds 19 more unique files than 3.1.3 in this particular
> search, one of which is the mangled .shtml file;) I jumped to a
> conclusion and reported duplicates in the results;(
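To pin down exactly which documents make up the difference (47 - 28 = 19),
it may help to save the result URLs from each run and compare them as sets.
The following is only a rough sketch: the URL lists are hypothetical
stand-ins, and how you capture them from the search output is up to you.

```python
def compare_results(old_urls, new_urls):
    """Return (only_in_old, only_in_new, common) as sorted lists."""
    old, new = set(old_urls), set(new_urls)
    return sorted(old - new), sorted(new - old), sorted(old & new)


# Hypothetical stand-ins for the URL lists saved from each run.
results_313 = ["http://example.org/a.html", "http://example.org/b.html"]
results_314 = ["http://example.org/a.html", "http://example.org/b.html",
               "http://example.org/c.html"]

gone, added, common = compare_results(results_313, results_314)
print(len(gone), len(added), len(common))
```

The "added" list is the set of documents worth inspecting by hand.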
Where does the word appear in these 19 extra documents? If it's in img
alt text, or immediately after a bare ampersand (&), that would explain
why htdig 3.1.3 or earlier failed to index that word in these documents.
If it appears elsewhere, I'd be very curious to know why htdig 3.1.3
missed it, and if it doesn't appear anywhere in the document or in
descriptions of hyperlinks to the documents, I'd like to know why htdig
3.1.4 is putting it in the index. Please look into this further, if you
can, and get back to me ASAP. We'd like to release 3.1.4 tomorrow, but
not if it's putting incorrect entries in the index.
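If checking the 19 documents by hand is tedious, something like the
following could classify where the word shows up in each one. It is a
minimal sketch, not anything from the htdig sources: the names WordLocator
and locate_word are made up for illustration, and the test for a "bare"
ampersand (an & that does not start an entity, optionally followed by
whitespace) is my assumption about what counts.

```python
import re
from html.parser import HTMLParser


class WordLocator(HTMLParser):
    """Record whether a word occurs in img alt text or in ordinary text."""

    def __init__(self, word):
        super().__init__()
        self.pattern = re.compile(r"\b%s\b" % re.escape(word), re.IGNORECASE)
        self.places = set()

    def handle_starttag(self, tag, attrs):
        # Only the alt attribute of <img> tags is examined here.
        if tag == "img":
            alt = dict(attrs).get("alt") or ""
            if self.pattern.search(alt):
                self.places.add("img alt")

    def handle_data(self, data):
        # Any text between tags counts as ordinary text.
        if self.pattern.search(data):
            self.places.add("text")


def locate_word(html, word):
    """Return the set of places where `word` occurs in `html`."""
    locator = WordLocator(word)
    locator.feed(html)
    locator.close()
    # Assumption: a bare ampersand is one that does not begin an entity
    # such as &amp; or &#38;, optionally followed by whitespace.
    bare_amp = re.compile(r"&(?!#?\w+;)\s*%s\b" % re.escape(word),
                          re.IGNORECASE)
    if bare_amp.search(html):
        locator.places.add("after bare &")
    return locator.places
```

An empty set for a document would suggest the word is not in the page
itself. This sketch does not look at descriptions of hyperlinks pointing
to the document, so those would still need to be checked separately.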
> > It's not included in the releases, because it's considered too much of a
> > hack, I assume. I think at one point, it was added to the 3.2 source
> > tree, but taken out again. I've ported the patch to 3.1.4, and I
> > suppose I can do likewise to 3.2.0b1 when it comes out, although you
> > should be able to do this yourself pretty easily. Just apply the code
> > to Retriever.cc, which you'll likely have to do manually for 3.2 as
> > Need2Get() has changed, then "diff -up Retriever.cc.orig Retriever.cc".
>
> Understood. I have placed it in the 3.1.4 folder on the patch site.
Wow, patches to 3.1.4 before it's even released! :)
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930