I have an active lucene implementation that has been in place for a
couple years and was recently upgraded to the 3.02 branch. We are now
occasionally seeing documents returned from searches that should not be
returned. I have reduced the code and indexes to the smallest set
possible where I can still repeat the issue.
My test cases uses 2 indexes. These indexes have been rebuilt/optimized
using Luke 1.0.1 to make them the smallest possible. One index has 1
document, which is being returned by the query but should not. The
other index has 1000 documents, none of which match the search criteria.
The query should bring back 0 results, but brings back 1. I can zip and
mail the indexes if it would aid in helping track down this issue.
public class LuceneTest {
static public void main(String[] args) {
try {
QueryParser queryParser = new QueryParser(Version.LUCENE_30,
"author", new KeywordAnalyzer());
Query query = queryParser.parse("author:bentalcella AND NOT
publish_date:[20100601 TO 20100630]");
Searchable[] searchables = new Searchable[2];
searchables[0] = new IndexSearcher(new NIOFSDirectory(new
File("/home/dfertig/testIndexes/b1")), true);
searchables[1] = new IndexSearcher(new NIOFSDirectory(new
File("/home/dfertig/testIndexes/m1")), true);
Searcher searcher = new MultiSearcher(searchables);
System.out.println("Query: " + query.toString());
TopDocs topDocs = searcher.search(query, 10);
System.out.println("Results: " + topDocs.totalHits);
for (int in = 0; in < topDocs.totalHits; in++) {
Document document =
searcher.doc(topDocs.scoreDocs[in].doc);
System.out.println("publish_date: " +
document.get("publish_date"));
}
searcher.close();
} catch (Exception e) {
System.out.println(e.getMessage());
e.printStackTrace();
}
}
}