Good point about FuzzyQuery: it has already mostly solved the "too
many clauses" problem anyway.  I also think the idf should go.

There are two different use cases:
 1) relevancy: give the most relevant, closest matches first; I don't
care if I get 100% of the matches.
 2) matching: must give all matches, but I don't really care about
relevance (it's more like a filter).

Range queries are normally only used in case 2.
Prefix queries are used in both cases, but since stemming handles word
variants, I think more people use them for case 2.
Fuzzy queries are normally only used in case 1, I think.

Now it gets a little confusing because queries in the "matching"
category may still rely on the field boost (but probably don't want
any other relevancy factor).  An example of this is boosting more
recent documents when building an index.  There are alternate ways to
solve this (some of them more flexible, like the FunctionQuery I'm
refactoring now).
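
To illustrate the "matching" case with a boost, here is a minimal
sketch in plain Java. It is not the Lucene API: the class, the Doc
type, the date encoding and the boost values are all made up for
illustration. Every document in the range matches, and the only thing
that feeds the score is the boost (no tf, idf or length norm).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConstantScoreSketch {

    /** Hypothetical document: an id, a date as yyyymmdd, and an index-time boost. */
    static final class Doc {
        final String id;
        final int date;
        final float boost;

        Doc(String id, int date, float boost) {
            this.id = id;
            this.date = date;
            this.boost = boost;
        }
    }

    /**
     * Returns every document whose date lies in [lower, upper].
     * The score of a hit is simply its boost; no other relevancy
     * factor is applied.
     */
    static Map<String, Float> rangeMatch(List<Doc> docs, int lower, int upper) {
        Map<String, Float> hits = new LinkedHashMap<>();
        for (Doc d : docs) {
            if (d.date >= lower && d.date <= upper) {
                hits.put(d.id, d.boost);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("a", 20050101, 1.0f),
            new Doc("b", 20051101, 2.0f),   // more recent, so boosted (assumed policy)
            new Doc("c", 20040101, 1.0f));  // outside the range below
        System.out.println(rangeMatch(docs, 20050101, 20051231));
    }
}
```

The point of the sketch is only that a filter-like query can return all
matches and still order them by boost.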

I'd still argue for making ConstantScoreRangeQuery the default for
range queries in the QueryParser.


On 11/15/05, mark harwood <[EMAIL PROTECTED]> wrote:
> > That would use more memory, but still permit ranked
> > searches.  Worth it?
>
> Not sure. I expect FuzzyQuery results would suffer if
> the edit distance could no longer be factored in. At
> least there's a quality threshold to limit the more
> tenuous matches but all matches below the threshold
> would be scored equally. I've certainly had no use for
> the idf in fuzzy queries (it favours rare
> mis-spellings) so happy to see that go.  I'm not sure
> what the lack of edit-distance would do without seeing
> some example results.
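
The "favours rare mis-spellings" effect can be seen with a small
calculation. The sketch below assumes the classic idf formula of the
form ln(numDocs / (docFreq + 1)) + 1, which is how Lucene's default
similarity is generally described; the index size and document
frequencies are invented numbers, not measurements.

```java
public class IdfSketch {

    /** idf of the assumed form ln(numDocs / (docFreq + 1)) + 1. */
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        int numDocs = 1_000_000;                 // made-up index size
        double correct = idf(50_000, numDocs);   // correctly spelled term (hypothetical docFreq)
        double misspelt = idf(3, numDocs);       // rare misspelling (hypothetical docFreq)

        // The rarer term gets the larger idf, so with idf in the score
        // a fuzzy expansion ranks the misspelling above the correct spelling.
        System.out.println("idf(correct)  = " + correct);
        System.out.println("idf(misspelt) = " + misspelt);
        System.out.println("misspelling favoured: " + (misspelt > correct));
    }
}
```

Dropping idf removes this bias; whether edit distance should still
contribute to the score is a separate question.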

-Yonik