Ivan, I can't find the BaseScorer class in the gist. Maybe you forgot to git add it?
On Mon, Dec 28, 2015 at 11:07 PM, Ivan Brusic <i...@brusic.com> wrote:

> Here is the complete code:
> https://gist.github.com/brusic/e3018a2e403f5707fa3e
>
> The code is not originally mine, so I do not take responsibility. Once I
> get things to perform correctly, I will do another pass with improvements.
> Much of the custom code needs to be re-thought.
>
> The scorer is one class that I did not need to update, so I did not focus
> on it. Will do so now.
>
> Ivan
>
> On Mon, Dec 28, 2015 at 4:58 PM, Adrien Grand <jpou...@gmail.com> wrote:
>
> > Hi Ivan,
> >
> > It looks like your scorer is emitting the same document twice. Maybe you
> > could try to use AssertingIndexSearcher in your test case; this is the
> > kind of thing that it should catch.
> >
> > The only related Lucene 5 change that I can think of is that Lucene now
> > requires docs to be collected in order. Did this scorer use to collect
> > docs out of order in Lucene 4?
> >
> > If that still doesn't help and if you can share the code of your scorer,
> > I could give it a quick look.
> >
> > On Mon, Dec 28, 2015 at 10:18 PM, Ivan Brusic <i...@brusic.com> wrote:
> >
> > > I just migrated a ton of code from Lucene 4.10 to 5.4. Lots of custom
> > > collectors, analyzers, queries, etc. I have migrated other code bases
> > > from Lucene before (2->3, 3->4) and I always had one issue I could not
> > > eyeball!
> > >
> > > When using a custom query, I get the same document twice in the result
> > > set. The changes I made for the upgrade had to do with the query/weight
> > > API change.
> > >
> > > Without getting into the custom code, here is the simple test case:
> > >
> > >     @BeforeClass
> > >     public static void buildIndex() throws IOException {
> > >         ANALYZER = new StandardAnalyzer();
> > >         IndexWriterConfig config = new IndexWriterConfig(ANALYZER);
> > >         DIRECTORY = new RAMDirectory();
> > >         try (IndexWriter writer = new IndexWriter(DIRECTORY, config)) {
> > >             // removed for brevity
> > >             // repeated five times with different values
> > >             Document doc = new Document();
> > >             doc.add(...);
> > >             writer.addDocument(doc);
> > >         }
> > >     }
> > >
> > >     @Test
> > >     public void testQuery() throws IOException {
> > >         try (IndexReader reader = DirectoryReader.open(DIRECTORY)) {
> > >             IndexSearcher searcher = new IndexSearcher(reader);
> > >
> > >             PriorityQuery query = new PriorityQuery();
> > >             query.add(new TermQuery(new Term("foo", "xyz")));
> > >             query.add(new TermQuery(new Term("bar", "xyz")));
> > >             query.add(new TermQuery(new Term("baz", "xyz")));
> > >
> > >             // The search call was not shown in the original message;
> > >             // a top-10 search is assumed here.
> > >             TopDocs result = searcher.search(query, 10);
> > >
> > >             CheckHits.checkDocIds("Invalid docs", new int[] {4, 2, 0, 3},
> > >                 result.scoreDocs);
> > >         }
> > >     }
> > >
> > > There should be four unique results out of five, since the second
> > > document (docId 1) does not contain the term xyz. The results instead
> > > contain five documents, with the first one appearing twice at the start:
> > >
> > > [doc=4 score=1.1976817 shardIndex=0, doc=4 score=1.1976817
> > > shardIndex=0, doc=2 score=0.63170385 shardIndex=0, doc=0
> > > score=0.37223506 shardIndex=0, doc=3 score=0.34156355 shardIndex=0]
> > >
> > > When using a BooleanQuery, the results are correct, so obviously the
> > > custom Query is failing somehow. In all my years of Lucene, I never
> > > had the same document twice. :) Without boring everyone with the
> > > custom code, what should I be looking for? I just cannot quite spot it.
> > >
> > > Cheers,
> > >
> > > Ivan
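
A note on the in-order requirement mentioned in the thread: the idea can be illustrated without Lucene at all. The sketch below is only an illustration, not the check that AssertingIndexSearcher actually performs. The class DocIdOrderCheck and the method findViolations are hypothetical names, and the input array stands in for the sequence of doc IDs a scorer would emit for one segment. It flags any doc ID that repeats or goes backwards, which is the symptom described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lucene: checks that emitted doc IDs strictly increase. */
final class DocIdOrderCheck {

    /** End-of-iteration sentinel; Lucene's DocIdSetIterator.NO_MORE_DOCS is Integer.MAX_VALUE. */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /**
     * Walks the doc IDs in the order a scorer emitted them and reports every
     * position where the ID repeats or moves backwards.
     */
    static List<String> findViolations(int[] emittedDocIds) {
        List<String> violations = new ArrayList<>();
        int previous = -1; // doc IDs are non-negative, so -1 means "nothing emitted yet"
        for (int i = 0; i < emittedDocIds.length; i++) {
            int doc = emittedDocIds[i];
            if (doc == NO_MORE_DOCS) {
                break; // iterator exhausted
            }
            if (doc == previous) {
                violations.add("duplicate doc " + doc + " at position " + i);
            } else if (doc < previous) {
                violations.add("out-of-order doc " + doc + " after " + previous
                        + " at position " + i);
            }
            previous = doc;
        }
        return violations;
    }

    public static void main(String[] args) {
        // Made-up sequence for illustration: doc 4 is emitted twice in a row.
        int[] emitted = {0, 2, 3, 4, 4, NO_MORE_DOCS};
        for (String violation : findViolations(emitted)) {
            System.out.println(violation);
        }
    }
}
```

Wrapping a custom scorer's iteration in a check like this, or running the test with AssertingIndexSearcher from the Lucene test framework as suggested, should point at the call where the same doc is returned twice.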