add c# version of FuzzyLikeThisQuery.java to contrib section
------------------------------------------------------------

                 Key: LUCENENET-23
                 URL: http://issues.apache.org/jira/browse/LUCENENET-23
             Project: Lucene.Net
          Issue Type: New Feature
         Environment: n/a
            Reporter: Marco Dissel


I've converted FuzzyLikeThisQuery.java to C#. Maybe George can add it 
to the contrib section?

The original file is stored at:
http://svn.apache.org/viewvc/lucene/java/trunk/contrib/queries/src/java/org/apache/lucene/search/FuzzyLikeThisQuery.java?revision=413732&view=markup

Fuzzifies ALL terms provided as strings and then picks the best n 
differentiating terms.
In effect this mixes the behaviour of FuzzyQuery and MoreLikeThis, but with 
special consideration of fuzzy scoring factors.
This generally produces good results for queries where users may provide 
details in a number of fields, have no knowledge of boolean query syntax, 
and want both a degree of fuzzy matching and a fast query.
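
The "picks the best n" step can be sketched roughly as below. This is only an 
illustration, not the code from FuzzyLikeThisQuery.java: the class 
TopTermsSketch, the ScoredTerm type and the way a score is assigned are all 
hypothetical, and the sketch simply assumes every fuzzy variant already 
carries a numeric score.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: keep only the n highest-scoring candidate terms.
public class TopTermsSketch {

    // Hypothetical holder for a candidate term and its (assumed) score.
    public static final class ScoredTerm {
        public final String text;
        public final double score;

        public ScoredTerm(String text, double score) {
            this.text = text;
            this.score = score;
        }
    }

    // Returns the n best candidates, highest score first.
    public static List<ScoredTerm> bestN(List<ScoredTerm> candidates, int n) {
        Comparator<ScoredTerm> byScore =
                Comparator.comparingDouble((ScoredTerm t) -> t.score);
        // Min-heap: the weakest retained term sits at the head.
        PriorityQueue<ScoredTerm> heap = new PriorityQueue<>(byScore);
        for (ScoredTerm t : candidates) {
            heap.offer(t);
            if (heap.size() > n) {
                heap.poll(); // drop the current weakest term
            }
        }
        List<ScoredTerm> result = new ArrayList<>(heap);
        result.sort(byScore.reversed());
        return result;
    }

    public static void main(String[] args) {
        List<ScoredTerm> candidates = new ArrayList<>();
        candidates.add(new ScoredTerm("lucene", 0.9));
        candidates.add(new ScoredTerm("lucine", 0.4));
        candidates.add(new ScoredTerm("lucent", 0.6));
        for (ScoredTerm t : bestN(candidates, 2)) {
            System.out.println(t.text + " " + t.score);
        }
    }
}
```

A bounded min-heap keeps memory at n entries however many variants are 
enumerated, which fits the "fast query" goal mentioned above.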

For each source term, the fuzzy variants are held in a BooleanQuery with no 
coord factor (because we are not looking for matches on multiple variants in 
any one doc). Additionally, a specialized TermQuery is used for variants; it 
does not use the variant term's own IDF, because this would favour rarer 
terms, e.g. misspellings. Instead, all variants use the same IDF ranking (the 
one for the source query term), and this is factored into the variant's 
boost. If the source query term does not exist in the index, the average IDF 
of the variants is used.

@author maharwood
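
The shared-IDF rule described above might look like the following minimal 
sketch. It is not taken from the Java source: the class SharedIdfSketch and 
its method names are hypothetical, and the IDF formula used here 
(ln(numDocs / (docFreq + 1)) + 1, the classic Lucene-style definition) is an 
assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of "all variants share one IDF value".
public class SharedIdfSketch {

    // Assumed IDF formula (classic Lucene-style); used for illustration only.
    public static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Returns the single IDF applied to every variant of one source term.
    // sourceDocFreq is 0 when the source term is absent from the index.
    public static double sharedIdf(int sourceDocFreq,
                                   Map<String, Integer> variantDocFreqs,
                                   int numDocs) {
        if (sourceDocFreq > 0) {
            // Source term exists: its IDF is used for all variants.
            return idf(sourceDocFreq, numDocs);
        }
        if (variantDocFreqs.isEmpty()) {
            return 0.0; // nothing to average
        }
        // Source term missing: fall back to the average IDF of the variants.
        double sum = 0.0;
        for (int df : variantDocFreqs.values()) {
            sum += idf(df, numDocs);
        }
        return sum / variantDocFreqs.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> variants = new LinkedHashMap<>();
        variants.put("lucene", 50); // common, correctly spelled variant
        variants.put("lucine", 1);  // rare misspelling
        System.out.println(sharedIdf(50, variants, 1000));
        System.out.println(sharedIdf(0, variants, 1000));
    }
}
```

Because the rare misspelling no longer contributes its own high IDF, it 
cannot outrank the common, correctly spelled variant on rarity alone.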

P.S. There's no Java test class...

Thanks

Marco

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
