Hello,
We are currently migrating from Lucene 2.4.1 to 2.9.0, and we've noticed
some changes in the results of fuzzy queries.
We have put together this small test case:
********
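// Build a small in-memory index. addDoc(...) is a helper from the attached
// test case; it indexes the given string in the "title" field.
StandardAnalyzer analyzer = new StandardAnalyzer();
Directory index = new RAMDirectory();
IndexWriter w = new IndexWriter(index, analyzer, true,
    IndexWriter.MaxFieldLength.UNLIMITED);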
addDoc(w, "Lucene in Action");
addDoc(w, "Lucene for Dummies");
addDoc(w, "Giga byte");
addDoc(w, "ManagingGigabytesManagingGigabyte");
addDoc(w, "ManagingGigabytesManagingGigabytes");
addDoc(w, "The Art of Computer Science");
addDoc(w, "J. K. Rowling");
addDoc(w, "JK Rowling");
addDoc(w, "Joanne K Roling");
addDoc(w, "Bruce Willis");
addDoc(w, "Willis bruce");
addDoc(w, "Brute willis");
addDoc(w, "B. willis");
w.close();
***************
Here's the problem:
We would expect the query
Query q = new QueryParser("title", analyzer).parse( "giga~0.9" );
to match at least "Giga byte".
With Lucene 2.4.1 it returns:
1. Giga byte with score: 1.7948763
With 2.9.0 there are no matches at all; we have to lower the minimum
similarity to something like 0.7 ("giga~0.7") to get any matches.
Could this be a regression?
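
To show why we expect a hit, here is a minimal stand-alone sketch of the
similarity we would expect between the query term and the indexed terms. It
assumes the fuzzy similarity is 1 - editDistance / (length of the shorter
term), with no prefix; that is only our understanding of the 2.x behaviour,
not something taken from the 2.9.0 source. The class and method names are
made up for illustration, and no Lucene classes are used.

```java
// Hypothetical sketch: approximates the fuzzy similarity we assume Lucene uses.
public class FuzzySimilaritySketch {

    // Classic Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed formula: 1 - distance / length of the shorter term.
    // The result can be negative when the terms are very different.
    static float similarity(String a, String b) {
        int shorter = Math.min(a.length(), b.length());
        if (shorter == 0) {
            return a.length() == b.length() ? 1.0f : 0.0f;
        }
        return 1.0f - (float) editDistance(a, b) / shorter;
    }

    public static void main(String[] args) {
        float minSimilarity = 0.9f; // the value after "~" in "giga~0.9"
        // "Giga byte" should be indexed as the two terms "giga" and "byte".
        for (String term : new String[] {"giga", "byte"}) {
            float s = similarity("giga", term);
            System.out.println("giga vs " + term + ": " + s
                    + (s > minSimilarity ? " (match)" : " (no match)"));
        }
    }
}
```

Under that assumption the indexed term "giga" is identical to the query term
(edit distance 0, similarity 1.0), so it should pass any minimum similarity
below 1.0, including 0.9.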
Simple test case (1 file): http://www.nabble.com/file/p25924689/FirstShot.java