On Tue, 7 Dec 1999, Geoff Hutchison wrote:
> Date: Tue, 7 Dec 1999 14:39:30 -0600
> From: Geoff Hutchison <[EMAIL PROTECTED]>
> To: "Joe R. Jah" <[EMAIL PROTECTED]>
> Cc: htdig3-dev <[EMAIL PROTECTED]>
> Subject: Re: [htdig3-dev] Re: htdig-3.1.4 prerelease
>
> At 12:05 PM -0800 12/7/99, Joe R. Jah wrote:
> >everything worked except my old local duplicate suppressor patch:
> >ftp://sol.ccsf.cc.ca.us/htdig-patches/3.0.8b2/Retriever.cc.0
> >did not quite do its job.
>
> It would probably need some tinkering to work. We changed how local
> documents are indexed slightly, so it would need to be "ported."
All the tinkering I did was in Retriever::Need2Get(char *u).
I applied the old patch and changed:

    String *local_filename = IsLocal(u);

to

    String *local_filename = GetLocal(u);

and added

    url.lowercase();

which was missing in 3.1.4.

What other changes need to be made?
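
For anyone following along, here is roughly the idea behind the local duplicate check, as a standalone sketch rather than the actual patch. The names local_path_for and LocalDuplicateFilter are made up for illustration and are not ht://Dig functions; the URL prefix and document root are passed in by hand, and the key is simply the lowercased local path, which only makes sense if URLs on the server are treated case-insensitively. A sturdier version would probably compare the files themselves (for example by device and inode) instead of path strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical stand-in for a URL-to-file mapping (not ht://Dig's GetLocal()).
// Returns an empty string when the URL does not start with url_prefix.
std::string local_path_for(const std::string &url,
                           const std::string &url_prefix,
                           const std::string &doc_root)
{
    // An empty prefix would match every URL, so treat it as a caller error.
    assert(!url_prefix.empty());
    if (url.compare(0, url_prefix.size(), url_prefix) != 0)
        return std::string();
    return doc_root + url.substr(url_prefix.size());
}

// Remembers which local files have been requested already.
class LocalDuplicateFilter {
public:
    // url_prefix is expected in lowercase, because the URL is lowercased
    // before it is compared (mirroring the url.lowercase() call above).
    LocalDuplicateFilter(const std::string &url_prefix,
                         const std::string &doc_root)
        : url_prefix_(url_prefix), doc_root_(doc_root) {}

    // Returns true if the URL should be fetched: either it is not local,
    // or its local file has not been seen before.
    bool need_to_get(std::string url)
    {
        std::transform(url.begin(), url.end(), url.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        const std::string path = local_path_for(url, url_prefix_, doc_root_);
        if (path.empty())
            return true;                    // not a local document
        return seen_.insert(path).second;   // false if already recorded
    }

private:
    std::string url_prefix_;
    std::string doc_root_;
    std::set<std::string> seen_;
};
```

With a filter built for one prefix and document root, the first request for a local page returns true and any later request that resolves to the same path returns false; URLs outside the prefix always return true.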
> >As you see, database sizes do not vary too much, but the results pages
> >point to the same URL MULTIPLE times in the 3.1.4 case; baffling ;-/
>
> You mean something with exactly the same string? Can you give us an example?
Here are some examples of the results URLs:
http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml/f96_p6.shtml
http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml/
http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml
http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml/f96_p4.shtml
http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml/f96_p5.shtml
http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml/int_fall96.shtml
http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml/f96_p2.shtml
http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml/f96_p3.shtml
As you can see, all of those point to the f96_p4.shtml file, but many of the
results have an extra path component appended after the file name.
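
To make the pattern concrete: every one of those URLs is f96_p4.shtml followed by at most one trailing path component. The sketch below is a hypothetical helper (strip_trailing_path is my own name, not part of ht://Dig) that cuts a URL off right after the first path segment ending in a given extension. It is only meant as a way to check how many distinct documents are really behind such results; it assumes that nothing meaningful follows the file name, and it ignores query strings.

```cpp
#include <string>

// True if s ends with suffix.
static bool ends_with(const std::string &s, const std::string &suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: truncate the URL after the first path segment
// that ends in ext. URLs without such a segment are returned unchanged.
std::string strip_trailing_path(const std::string &url,
                                const std::string &ext = ".shtml")
{
    const std::string::size_type npos = std::string::npos;

    // Skip the scheme ("http://") so the host name is not scanned.
    std::string::size_type start = url.find("://");
    start = (start == npos) ? 0 : start + 3;

    // seg always points at the '/' that opens the current path segment.
    std::string::size_type seg = url.find('/', start);
    while (seg != npos) {
        const std::string::size_type next = url.find('/', seg + 1);
        const std::string segment =
            url.substr(seg + 1, next == npos ? npos : next - seg - 1);
        if (ends_with(segment, ext))
            return url.substr(0, next);  // substr(0, npos) keeps the whole URL
        seg = next;
    }
    return url;
}
```

Run over the examples above, every URL collapses to http://www.ccsf.cc.ca.us/Resources/Title3/intech/f96_p4.shtml, which is what I would have expected to see once in the results.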
> >That reminds me; has the _promised_ duplicate suppression feature been
> >placed in 3.2.x yet?
>
> Alas no. Some of you may remember the post to the htdig list about 2
> months ago from someone saying they were working on a number of
> projects (including duplicate elimination). Alas, they seem to have
> disappeared again. Hence, 3.2.0b1 will go out the door without it.
>
> However, that doesn't mean it's dead yet. ;-)
That statement doesn't warm my heart very much ;) I guess I'll have to
live with the above tinkering for the foreseeable future. Would you
consider porting this little patch with each future release, even if you
wouldn't include it in the release itself? I am sure there are other users
who would appreciate having at least local duplicate suppression.
Best regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah [EMAIL PROTECTED]