Sounds like you've been tackling a number of the
issues I was concerned about with "fuzzy" searching.
It's essentially the same problem: the user types one
word and the engine searches for several variants.

The FuzzyLikeThisQuery class in the "queries" module
of the contrib area in SVN contains similar code. It
addresses idf and coord issues introduced with fuzzy
variants. 

It's probably worth considering having one
implementation for generically scoring variants
whether they are produced by fuzzy algorithms or
synonyms or any other means. In either case there
could be a "cost" factor associated with variants
which could be based on the fuzzy edit distance from
the root term or synonym "relatedness" to the root
term.
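
As a rough illustration of the "cost" idea (a sketch only, not
taken from FuzzyLikeThisQuery or any existing Lucene class; the
names Variant, fuzzy_variant, synonym_variant and score_root are
all hypothetical, and the cost formulas are assumptions), fuzzy
and synonym variants could share one scoring path like this:

```python
from dataclasses import dataclass


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


@dataclass(frozen=True)
class Variant:
    """A term variant plus its cost (hypothetical type).

    cost is 0.0 for the root term itself and approaches 1.0 for
    variants that are only loosely related to it.
    """
    term: str
    cost: float


def fuzzy_variant(root: str, term: str) -> Variant:
    # Assumption: cost is the edit distance normalised by the longer term.
    dist = edit_distance(root, term)
    return Variant(term, min(1.0, dist / max(len(root), len(term), 1)))


def synonym_variant(term: str, relatedness: float) -> Variant:
    # Assumption: relatedness is already a number in [0, 1].
    return Variant(term, 1.0 - relatedness)


def score_root(variants, doc_term_freqs) -> float:
    """Score one root term against a document.

    Only the best-matching variant contributes, so a document that
    matches several variants of one word is not rewarded as if it
    had matched several distinct query terms.
    """
    return max(
        ((1.0 - v.cost) * doc_term_freqs.get(v.term, 0) for v in variants),
        default=0.0,
    )
```

The scoring function never needs to know how a variant was
produced, only what it costs, so the fuzzy and synonym cases
would go through the same code.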

I'll have a look at your implementation with this in
mind when I have a bit more time.

Cheers,
Mark



                