Thanks Adrien. I added the BaseScorer to the gist, but what I was hoping to get was a sense of which direction to go in to debug this issue. I was not focusing on the scorers since I did not need to upgrade them, and I do not think I have ever written my own Scorer in Lucene. I am taking the next few days off, so I will get back to looking into it soon.
Ivan

On Mon, Dec 28, 2015 at 5:41 PM, Adrien Grand <jpou...@gmail.com> wrote:

> Ivan, I can't find the BaseScorer class in the gist. Maybe you forgot to
> git add it?
>
> On Mon, Dec 28, 2015 at 23:07, Ivan Brusic <i...@brusic.com> wrote:
>
> > Here is the complete code:
> > https://gist.github.com/brusic/e3018a2e403f5707fa3e
> >
> > The code is not originally mine, so I do not take responsibility. Once I
> > get things to perform correctly, I will do another pass with improvements.
> > Much of the custom code needs to be re-thought.
> >
> > The scorer is one class that I did not need to update, so I did not focus
> > on it. Will do so now.
> >
> > Ivan
> >
> > On Mon, Dec 28, 2015 at 4:58 PM, Adrien Grand <jpou...@gmail.com> wrote:
> >
> > > Hi Ivan,
> > >
> > > It looks like your scorer is emitting the same document twice. Maybe you
> > > could try to use AssertingIndexSearcher in your test case; this is the
> > > kind of thing that it should catch.
> > >
> > > The only related Lucene 5 change that I can think of is that Lucene now
> > > requires docs to be collected in order. Did this scorer use to collect
> > > docs out of order in Lucene 4?
> > >
> > > If that still doesn't help and if you can share the code of your scorer,
> > > I could give it a quick look.
> > >
> > > On Mon, Dec 28, 2015 at 22:18, Ivan Brusic <i...@brusic.com> wrote:
> > >
> > > > I just migrated a ton of code from Lucene 4.10 to 5.4: lots of custom
> > > > collectors, analyzers, queries, etc. I have migrated other code bases
> > > > across Lucene versions before (2->3, 3->4) and I always had one issue
> > > > I could not eyeball!
> > > >
> > > > When using a custom query, I get the same document twice in the result
> > > > set. The changes I made for the upgrade had to do with the query/weight
> > > > API change.
> > > >
> > > > Without getting into the custom code, here is the simple test case:
> > > >
> > > > @BeforeClass
> > > > public static void buildIndex() throws IOException {
> > > >     ANALYZER = new StandardAnalyzer();
> > > >     IndexWriterConfig config = new IndexWriterConfig(ANALYZER);
> > > >     DIRECTORY = new RAMDirectory();
> > > >     try (IndexWriter writer = new IndexWriter(DIRECTORY, config)) {
> > > >         // removed for brevity
> > > >         // repeated five times with different values
> > > >         Document doc = new Document();
> > > >         doc.add(...);
> > > >         writer.addDocument(doc);
> > > >     }
> > > > }
> > > >
> > > > @Test
> > > > public void testQuery() throws IOException {
> > > >     try (IndexReader reader = DirectoryReader.open(DIRECTORY)) {
> > > >         IndexSearcher searcher = new IndexSearcher(reader);
> > > >
> > > >         PriorityQuery query = new PriorityQuery();
> > > >         query.add(new TermQuery(new Term("foo", "xyz")));
> > > >         query.add(new TermQuery(new Term("bar", "xyz")));
> > > >         query.add(new TermQuery(new Term("baz", "xyz")));
> > > >
> > > >         // (the search call that assigns `result` was left out of the post)
> > > >         CheckHits.checkDocIds("Invalid docs", new int[] {4, 2, 0, 3},
> > > >             result.scoreDocs);
> > > >     }
> > > > }
> > > >
> > > > There should be four unique results out of five, since the second
> > > > document (docId 1) does not contain the term xyz. The results instead
> > > > contain 5 documents, with the first one appearing twice at the start:
> > > >
> > > > [doc=4 score=1.1976817 shardIndex=0, doc=4 score=1.1976817
> > > > shardIndex=0, doc=2 score=0.63170385 shardIndex=0, doc=0
> > > > score=0.37223506 shardIndex=0, doc=3 score=0.34156355 shardIndex=0]
> > > >
> > > > When using a BooleanQuery, the results are correct, so obviously the
> > > > custom Query is failing somehow. In all my years of Lucene, I have
> > > > never had the same document twice. :) Without boring everyone with the
> > > > custom code, what should I be looking for? I just cannot quite spot it.
> > > >
> > > > Cheers,
> > > >
> > > > Ivan
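
For anyone following the thread: Adrien's point about in-order collection can be checked by hand without touching Lucene. The sketch below is a plain-Java helper, not a Lucene API; the class name DocIdOrderCheck and the method firstOutOfOrder are made up for illustration. It assumes you have logged the doc IDs in the order the custom scorer hands them to the collector (for example from its nextDoc() calls), and it reports the first position where an ID repeats or goes backwards. Note that the hit list quoted above is sorted by score, so it cannot be fed to this check directly.

```java
/**
 * Hypothetical debugging helper (not part of Lucene): verifies that a logged
 * sequence of doc IDs is strictly increasing, which is what in-order
 * collection implies for a single segment.
 */
public final class DocIdOrderCheck {

    private DocIdOrderCheck() {
        // static helper only
    }

    /**
     * Returns the index of the first doc ID that is not strictly greater
     * than the one before it (a duplicate or a step backwards), or -1 if
     * the whole sequence is strictly increasing.
     */
    public static int firstOutOfOrder(int[] emittedDocIds) {
        for (int i = 1; i < emittedDocIds.length; i++) {
            if (emittedDocIds[i] <= emittedDocIds[i - 1]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up sequences for illustration, not taken from the gist.
        int[] inOrder = {0, 2, 3, 4};
        int[] withDuplicate = {0, 2, 2, 3, 4};

        System.out.println(firstOutOfOrder(inOrder));       // prints -1
        System.out.println(firstOutOfOrder(withDuplicate)); // prints 2
    }
}
```

If the logged sequence fails this check, the scorer's iteration is the place to look; if it passes, the duplicate is more likely introduced elsewhere, such as in the collector or the weight.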