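"Why would it matter...top 5 matches"

Because Lucene has to calculate the score of every matching document in order to ensure that it returns the right 5. What if the very last document scored was the most relevant? Your OR query matches about 8 million documents while your AND query matches only about 5000, so the OR query has far more scoring to do before it can pick its top 5.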
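To make that concrete, here is a minimal sketch in plain Java. It is not Lucene's actual code: the class name TopKSketch, the method topK, and the array of precomputed scores are all made up for illustration. It only shows the general idea of keeping the best k hits in a small heap while still having to visit every match.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical illustration only; these names are not part of Lucene.
public class TopKSketch {

    // Returns the k highest scores in descending order.
    // Each entry of matchScores stands in for "score one matching document".
    static float[] topK(float[] matchScores, int k) {
        if (k <= 0) {
            return new float[0];
        }
        // Min-heap: the weakest of the current top k sits at the head.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        for (float score : matchScores) {
            // The loop runs once per match, whatever k is.
            if (heap.size() < k) {
                heap.offer(score);
            } else if (score > heap.peek()) {
                heap.poll();        // evict the weakest hit kept so far
                heap.offer(score);
            }
        }
        // Drain the heap from weakest to strongest, filling from the back.
        float[] top = new float[heap.size()];
        for (int i = top.length - 1; i >= 0; i--) {
            top[i] = heap.poll();
        }
        return top;
    }

    public static void main(String[] args) {
        // The best score is deliberately the last one visited.
        float[] scores = {0.2f, 0.7f, 0.1f, 0.4f, 0.9f};
        System.out.println(Arrays.toString(topK(scores, 2)));
    }
}
```

The heap never holds more than k entries, so asking for 5 hits keeps memory small, but the loop still runs once per match. With roughly 8 million matches instead of 5000, that loop is where the extra time goes.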
Best
Erick

On Sun, Oct 23, 2011 at 3:06 PM, sol myr <solmy...@yahoo.com> wrote:
> Hi,
>
> We've noticed some Lucene performance phenomenon, and would appreciate an
> explanation from anyone familiar with Lucene internals
> (I know Lucene as a user, but haven't looked under its hood).
>
> We have a Lucene index of about 30 million records.
> We ran 2 queries: "AND" and "OR" ("+john +doe" versus "john doe").
> The AND query had much better performance (AND takes about 500 millis, while
> OR takes about 2000 millis).
>
> We wondered whether this has anything to do with the number of potential
> matches?
> Our AND has only about 5000 matches (5000 documents contain *both* "john" and
> "doe").
> Our OR has about 8 million matches (8 million documents contain *either*
> "john" or "doe").
>
> Does this explain the performance difference?
> But why would it matter, as long as we take only the top 5 matches
> (indexSearcher.search(query, 5))...?
> Is there any other explanation?
>
> Thanks :)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org