On Thu, Jun 18, 2009 at 3:54 PM, Teruhiko Kurosaka<k...@basistech.com> wrote:
> Because the number of hits was proportional to the number
> of Documents in the index in my previous test, I came
> to the wrong conclusion that the search time is proportional
> to the index size. If only one Document matches
> the query, the search time remains constant no
> matter how large the index is.

Right. An inverted index contains a list of documents that match each
term, so ignoring other overhead and effects, search time is
proportional to the number of documents matching the various clauses
of the query.

-Yonik
http://www.lucidimagination.com

> -kuro
>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: Thursday, June 18, 2009 12:44 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: Lucene performance: is search time linear to the
>> index size?
>>
>> Opening a searcher and doing the first query incurs a
>> significant amount of overhead, cache loading, etc. Inferring
>> search times relative to index size with a program like you
>> describe is unreliable.
>>
>> Try firing a few queries at the index without measuring,
>> *then* measure the time it takes for subsequent queries and
>> you'll get a much better picture of actual response time.
>>
>> The fact that a program that fires a single query at a newly
>> opened reader has near-linear performance isn't as surprising
>> as all that. I'd be more concerned if, say, queries 10
>> through 100 *on the same underlying reader* displayed this behavior.
>>
>> See:
>>
>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed?highlight=(warming)
>>
>> especially the questions around:
>> *When measuring performance, disregard the first query*
>>
>> Best
>> Erick
>>
>> On Thu, Jun 18, 2009 at 12:49 AM, Teruhiko Kurosaka
>> <k...@basistech.com> wrote:
>>
>> > I've written a test program that uses the simplest form of search,
>> > TermQuery, and measures the time it takes to search for a term in a
>> > field on indices of various sizes.
>> >
>> > The result is a very linear growth of search time vs. the index size
>> > in terms of # of Documents, not # of unique terms in that field.
>> >
>> > -kuro
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: java-user-h...@lucene.apache.org
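
For anyone reading this thread later, here is a minimal sketch of the two points made above: a term lookup only walks the list of documents matching that term, and timings should be taken after a few unmeasured warm-up queries. It is a toy in plain Java, not Lucene code. The class PostingsDemo and the methods build, postingsVisited and timeWarm are hypothetical names made up for illustration, and a real index adds overhead that this model ignores.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy inverted index (hypothetical, not Lucene): term -> list of matching doc ids. */
class PostingsDemo {
    private final Map<String, List<Integer>> postings = new HashMap<>();
    private int docCount = 0;

    /** Adds a document consisting of the given terms and returns its id. */
    int addDocument(String... terms) {
        int docId = docCount++;
        for (String term : terms) {
            List<Integer> list = postings.computeIfAbsent(term, t -> new ArrayList<>());
            // Record each document at most once per term.
            if (list.isEmpty() || list.get(list.size() - 1) != docId) {
                list.add(docId);
            }
        }
        return docId;
    }

    /** Returns the postings list for a term (empty if the term is unknown). */
    List<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.emptyList());
    }

    /** Builds n documents: every one contains "common", only document 0 contains "rare". */
    static PostingsDemo build(int n) {
        PostingsDemo index = new PostingsDemo();
        for (int i = 0; i < n; i++) {
            if (i == 0) {
                index.addDocument("common", "rare");
            } else {
                index.addDocument("common");
            }
        }
        return index;
    }

    /** Counts the postings a term lookup has to walk: the "work" in this toy model. */
    static int postingsVisited(PostingsDemo index, String term) {
        int visited = 0;
        for (int docId : index.search(term)) {
            visited++;
        }
        return visited;
    }

    /** Runs warm-up lookups unmeasured, then returns mean nanoseconds per measured lookup. */
    static long timeWarm(PostingsDemo index, String term, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            postingsVisited(index, term);
        }
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += postingsVisited(index, term);
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("unreachable; keeps the loop result in use");
        }
        return elapsed / Math.max(runs, 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 10_000, 100_000}) {
            PostingsDemo index = build(n);
            System.out.printf("docs=%d common=%d rare=%d%n",
                    n, postingsVisited(index, "common"), postingsVisited(index, "rare"));
        }
    }
}
```

In this model the count for "common" grows with the number of documents while the count for "rare" stays at 1, which matches the behaviour kuro describes. timeWarm only illustrates the "disregard the first query" advice; the numbers it returns depend on the JVM and machine and should not be read as Lucene timings.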