Hi,

> >Looking at documentation, it does not appear that there is any option in
> >either the conf file or the parameters passed to htsearch, to limit the
> >number of matches which are located and sorted.  If "several thousand"
> >documents match the specified words, all of these have to participate in
> >sorting; there's no way to limit the number which participate.
> 
> This has been requested in the past. The biggest problem is that it's
> a bit of a chicken-and-egg problem. You want to cut out the documents
> before scoring and sorting (preferably before even looking them up in
> the document DB). But before you have a ranking, you don't know which
> ones you want to cut exactly. After all, you don't want to cut out
> the best-ranked documents!
But for single-word searches, one could sort the documents by score at
dig time, i.e. when the index is built. A B+tree lookup in the words
database would then return the best results first, and very quickly.
Since Berkeley DB lets you supply your own sort criterion (just a
comparison function), this should be fairly easy to implement. (One
needs to set the DB_DUP and DB_DUPSORT flags so that the duplicate
entries for each word are kept in sorted order.)

I am not sure how this would help for multi-word searches, though.
Any thoughts about this?
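
To make the single-word case concrete, here is a rough sketch in
Python. It is not htdig code and it does not use Berkeley DB: a sorted
in-memory list per word stands in for a B+tree with sorted duplicates,
and the names ScoreOrderedIndex, add and top are made up for
illustration. The only point is that once the entries for a word are
kept in descending score order, the best N come from reading the first
N entries, with no sorting at search time.

```python
import bisect
from collections import defaultdict
from itertools import islice


class ScoreOrderedIndex:
    """Toy stand-in for a word database with sorted duplicates.

    Hypothetical illustration only; not htdig's or Berkeley DB's API.
    """

    def __init__(self):
        # word -> list of (-score, doc_id), kept sorted at all times
        self._postings = defaultdict(list)

    def add(self, word, doc_id, score):
        # Storing the negated score means ascending tuple order is
        # descending score order, so the best document comes first.
        bisect.insort(self._postings[word], (-score, doc_id))

    def top(self, word, limit):
        # Read only the first `limit` entries; nothing is sorted here.
        entries = self._postings.get(word, [])
        return [(doc_id, -neg) for neg, doc_id in islice(entries, limit)]


index = ScoreOrderedIndex()
for doc_id, score in [(1, 20), (2, 90), (3, 50), (4, 70)]:
    index.add("htdig", doc_id, score)

print(index.top("htdig", 2))  # [(2, 90), (4, 70)]
```

The cost moves to dig time, where every insert has to find its place
in the per-word ordering. It also assumes the score is known when the
word entry is written.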

Yours, mentos 

--
Mentos Hoffmann, Roonstr.17, D-76137 Karlsruhe, Germany
email: [EMAIL PROTECTED]
