Here is the complete code: https://gist.github.com/brusic/e3018a2e403f5707fa3e
The code is not originally mine, so I do not take responsibility. Once I get things to perform correctly, I will do another pass with improvements. Much of the custom code needs to be rethought. The scorer is one class that I did not need to update, so I did not focus on it. Will do so now.

Ivan

On Mon, Dec 28, 2015 at 4:58 PM, Adrien Grand <jpou...@gmail.com> wrote:

> Hi Ivan,
>
> It looks like your scorer is emitting the same document twice. Maybe you
> could try to use AssertingIndexSearcher in your test case; this is the kind
> of thing that it should catch.
>
> The only related Lucene 5 change that I can think of is that Lucene now
> requires docs to be collected in order. Did this scorer use to collect docs
> out of order in Lucene 4?
>
> If that still doesn't help and if you can share the code of your scorer, I
> could give it a quick look.
>
> On Mon, Dec 28, 2015 at 22:18, Ivan Brusic <i...@brusic.com> wrote:
>
> > I just migrated a ton of code from Lucene 4.10 to 5.4. Lots of custom
> > collectors, analyzers, queries, etc. I have migrated other code bases
> > between Lucene versions before (2->3, 3->4) and I always had one issue
> > I could not eyeball!
> >
> > When using a custom query, I get the same document twice in the result
> > set. The changes I made for the upgrade had to do with the query/weight
> > API change.
> >
> > Without getting into the custom code, here is the simple test case:
> >
> > @BeforeClass
> > public static void buildIndex() throws IOException {
> >   ANALYZER = new StandardAnalyzer();
> >   IndexWriterConfig config = new IndexWriterConfig(ANALYZER);
> >   DIRECTORY = new RAMDirectory();
> >   try (IndexWriter writer = new IndexWriter(DIRECTORY, config)) {
> >     // removed for brevity
> >     // repeated five times with different values
> >     Document doc = new Document();
> >     doc.add(...);
> >     writer.addDocument(doc);
> >   }
> > }
> >
> > @Test
> > public void testQuery() throws IOException {
> >   try (IndexReader reader = DirectoryReader.open(DIRECTORY)) {
> >     IndexSearcher searcher = new IndexSearcher(reader);
> >
> >     PriorityQuery query = new PriorityQuery();
> >     query.add(new TermQuery(new Term("foo", "xyz")));
> >     query.add(new TermQuery(new Term("bar", "xyz")));
> >     query.add(new TermQuery(new Term("baz", "xyz")));
> >
> >     // search call omitted in the original post; the limit of 10 is assumed
> >     TopDocs result = searcher.search(query, 10);
> >
> >     CheckHits.checkDocIds("Invalid docs", new int[] {4, 2, 0, 3},
> >         result.scoreDocs);
> >   }
> > }
> >
> > There should be four unique results out of five since the second
> > document (docId 1) does not contain the term xyz. The results instead
> > contain five documents, with the first one appearing twice at the start:
> >
> > [doc=4 score=1.1976817 shardIndex=0, doc=4 score=1.1976817
> > shardIndex=0, doc=2 score=0.63170385 shardIndex=0, doc=0
> > score=0.37223506 shardIndex=0, doc=3 score=0.34156355 shardIndex=0]
> >
> > When using a BooleanQuery, the results are correct, so obviously the
> > custom Query is failing somehow. In all my years of Lucene, I never
> > had the same document twice. :) Without boring everyone with the
> > custom code, what should I be looking for? Just cannot quite spot it.
> >
> > Cheers,
> >
> > Ivan
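
The in-order requirement raised in the reply can be checked without Lucene on the classpath. What follows is a minimal sketch, assuming the doc IDs returned by the custom scorer's iterator have been recorded into an int array during a test run. `DocIdOrderCheck` and `firstViolation` are hypothetical names, not Lucene APIs, and the sample sequences are illustrative rather than output from the code in the gist. The check treats any doc ID that is not strictly greater than the previous one as a violation, which covers both a repeated document and out-of-order emission.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Lucene: checks the order of recorded doc IDs. */
final class DocIdOrderCheck {

    private DocIdOrderCheck() {}

    /**
     * Returns the index of the first doc ID that is not strictly greater than
     * its predecessor, or -1 if the whole sequence is strictly increasing.
     */
    static int firstViolation(int[] emittedDocIds) {
        for (int i = 1; i < emittedDocIds.length; i++) {
            if (emittedDocIds[i] <= emittedDocIds[i - 1]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Illustrative sequences only: one well-behaved, one with a repeated doc ID.
        int[] healthy = {0, 2, 3, 4};
        int[] repeated = {0, 2, 3, 4, 4};
        System.out.println(Arrays.toString(healthy) + " -> " + firstViolation(healthy));
        System.out.println(Arrays.toString(repeated) + " -> " + firstViolation(repeated));
    }
}
```

A non-negative return value points at the position where the scorer repeated or went backwards, which may help narrow down which sub-scorer advance is responsible. Inside a Lucene test, AssertingIndexSearcher, as suggested in the reply, is likely the more thorough option.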