I just migrated a ton of code from Lucene 4.10 to 5.4: lots of custom collectors, analyzers, queries, etc. I have migrated other code bases between Lucene versions before (2->3, 3->4), and I always had one issue I could not eyeball!
When using a custom query, I get the same document twice in the result set. The changes I made for the upgrade had to do with the query/weight API change. Without getting into the custom code, here is the simple test case:

    @BeforeClass
    public static void buildIndex() throws IOException {
        ANALYZER = new StandardAnalyzer();
        IndexWriterConfig config = new IndexWriterConfig(ANALYZER);
        DIRECTORY = new RAMDirectory();
        try (IndexWriter writer = new IndexWriter(DIRECTORY, config)) {
            // removed for brevity
            // repeated five times with different values
            Document doc = new Document();
            doc.add(...);
            writer.addDocument(doc);
        }
    }

    @Test
    public void testQuery() throws IOException {
        try (IndexReader reader = DirectoryReader.open(DIRECTORY)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            PriorityQuery query = new PriorityQuery();
            query.add(new TermQuery(new Term("foo", "xyz")));
            query.add(new TermQuery(new Term("bar", "xyz")));
            query.add(new TermQuery(new Term("baz", "xyz")));
            CheckHits.checkDocIds("Invalid docs", new int[] {4, 2, 0, 3}, result.scoreDocs);
        }
    }

There should be four unique results out of five, since the second document (docId 1) does not contain the term xyz. Instead, the results contain five documents, with the first one appearing twice at the start:

    [doc=4 score=1.1976817 shardIndex=0,
     doc=4 score=1.1976817 shardIndex=0,
     doc=2 score=0.63170385 shardIndex=0,
     doc=0 score=0.37223506 shardIndex=0,
     doc=3 score=0.34156355 shardIndex=0]

When using a BooleanQuery, the results are correct, so obviously the custom Query is failing somehow. In all my years of Lucene, I have never had the same document come back twice. :)

Without boring everyone with the custom code, what should I be looking for? I just cannot quite spot it.

Cheers,

Ivan