Hello,

With the new Lucene 2.9.0, on a newly built index of approx. 30 million documents, BooleanQueries that contain a PhraseQuery return far fewer hits than expected. I've verified this on both optimized and unoptimized versions of the index.
For example:

lucli> count field1:"john doe"
Searching for: field1:"john doe"
496 total documents

lucli> count +(field1:"john doe")
Searching for: +field1:"john doe"
496 total documents

lucli> count +(field1:"john doe" field1:"john doe")
Searching for: +(field1:"john doe" field1:"john doe")
5 total documents

lucli> count +(+field1:"john doe" field1:"john doe")
Searching for: +(+field1:"john doe" field1:"john doe")
496 total documents

lucli> count +(field1:"john doe" field2:UnmatchedValue)
Searching for: +(field1:"john doe" field2:UnmatchedValue)
5 total documents

lucli> count +(+field1:"john doe" field2:UnmatchedValue)
Searching for: +(+field1:"john doe" field2:UnmatchedValue)
496 total documents

I could also reproduce this when searching with TopScoreDocCollector.create(N, true|false): the call with docsScoredInOrder=false produced the incorrect results.

While debugging I noticed that a BooleanQuery containing at least one MUST clause is scored by BooleanScorer2, which produces the correct number of results. For a BooleanQuery without any MUST clause, BooleanScorer.score(Collector, int, int) collects up to a certain number of docs and then exits prematurely.

Is this behaviour normal? This used to work in Lucene 2.4.x.

Another user has mentioned similar behaviour (http://mail-archives.apache.org/mod_mbox/lucene-java-user/200910.mbox/%3c20091008121147.107a8...@pc-4176.kl.dfki.de%3e), but in my case it's a newly built index, not one that was migrated from 2.4 to 2.9.

Thanks,
Ionut