On Tue, 15 Jan 2002, Gilles Detillieux wrote:

> > Would it be possible to remove duplicate pages from the search results 
> > before they are output to the html page?  This is obviously something that 
> > htmerge would do if the databases were to be combined into one.
> 
> Good point.  This would appear to be a bug in the collections support
> (not the only one!).  This should go on our to-do list for 3.2, until
> we can get back to actively developing it.

Yes, as Greg discovered, the collections code just loops over the
databases--it makes no attempt to check that URLs aren't duplicates. While
it will take some work to fix this, culling the duplicates should also
speed up the results (if you do it at the right point, you won't need to
score the duplicates, etc.).
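
To illustrate the idea of culling before scoring, here is a minimal sketch
in Python. It is not the htsearch code: the names unique_matches and score,
and the dict-of-lists layout for the collections, are hypothetical and
chosen only for illustration. It assumes duplicates can be detected by
exact URL string comparison.

```python
from typing import Dict, Iterator, List, Tuple


def unique_matches(collections: Dict[str, List[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (collection, url) pairs, keeping only the first occurrence of each URL."""
    seen = set()
    for name, urls in collections.items():
        for url in urls:
            if url in seen:
                continue  # duplicate from an earlier database: skipped before scoring
            seen.add(url)
            yield name, url


def score(url: str) -> int:
    """Hypothetical placeholder for the real ranking step."""
    return len(url)


if __name__ == "__main__":
    # Two hypothetical databases that both indexed the same page.
    collections = {
        "db1": ["http://example.com/a", "http://example.com/b"],
        "db2": ["http://example.com/b", "http://example.com/c"],
    }
    # Only the surviving URLs are ever passed to score().
    results = [(url, score(url)) for _, url in unique_matches(collections)]
    print(results)
```

Because the check happens while looping over the databases, a duplicate is
dropped before any scoring work is done on it. URLs that differ only in
form (trailing slash, case of the host name) would still count as distinct
here unless they are normalized first.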

-Geoff


_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
