Re: Lucene performance: is search time linear to the index size?

2009-06-17 Thread Ian Lea
It depends on lots of things, but the time to execute a search would not typically grow linearly with the number of documents. But the time to retrieve data from all the hits might, if the number of hits is growing in line with the number of documents. Are you doing that by any chance, as opposed

Re: Lucene performance: is search time linear to the index size?

2009-06-17 Thread Erick Erickson
Are you measuring search time *only* or are you measuring total response time including assembling whatever you assemble? If you're measuring total response time, everything from network latency to what you're doing with each hit may affect response time. This is especially true if you're iteratin
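
One way to separate the two measurements is to time the query call and the hit handling as distinct phases. The following is a minimal sketch, not code from this thread: `SearchPhaseTimer`, `Timings`, and the stand-in lambdas are hypothetical names, and the real search and hit-processing calls would be supplied by the test program.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Times a search step and a hit-processing step separately (illustrative only). */
final class SearchPhaseTimer {

    /** Elapsed nanoseconds for each phase, plus whatever the processing step returned. */
    static final class Timings<R> {
        final long searchNanos;
        final long processNanos;
        final R result;

        Timings(long searchNanos, long processNanos, R result) {
            this.searchNanos = searchNanos;
            this.processNanos = processNanos;
            this.result = result;
        }
    }

    static <H, R> Timings<R> time(Supplier<H> search, Function<H, R> process) {
        long t0 = System.nanoTime();
        H hits = search.get();              // phase 1: run the query only
        long t1 = System.nanoTime();
        R result = process.apply(hits);     // phase 2: load or iterate over the hits
        long t2 = System.nanoTime();
        return new Timings<>(t1 - t0, t2 - t1, result);
    }

    public static void main(String[] args) {
        // Stand-ins (assumptions): the "search" returns fake doc ids, the "processing" counts them.
        Supplier<int[]> search = () -> new int[] {1, 2, 3};
        Function<int[], Integer> process = hits -> hits.length;
        Timings<Integer> t = time(search, process);
        System.out.println("search ns: " + t.searchNanos + ", processing ns: " + t.processNanos);
    }
}
```

If the second number grows with the index while the first stays roughly flat, the growth is coming from hit handling rather than from the search itself.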

RE: Lucene performance: is search time linear to the index size?

2009-06-17 Thread Teruhiko Kurosaka
June 17, 2009 9:09 AM > To: java-user@lucene.apache.org > Subject: Re: Lucene performance: is search time linear to the > index size? > > Are you measuring search time *only* or are you measuring > total response time including assembling whatever you > assemble? If y

Re: Lucene performance: is search time linear to the index size?

2009-06-17 Thread Peter Keegan
:erickerick...@gmail.com] > > Sent: Wednesday, June 17, 2009 9:09 AM > > To: java-user@lucene.apache.org > > Subject: Re: Lucene performance: is search time linear to the > > index size? > > > > Are you measuring search time *only* or are you measuring > >

RE: Lucene performance: is search time linear to the index size?

2009-06-17 Thread Teruhiko Kurosaka
I've written a test program that uses the simplest form of search, TermQuery, and measures the time it takes to search for a term in a field on indices of various sizes. The result is very linear growth of search time versus index size in terms of the number of documents, not the number of unique terms in that field.
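
A claim of "very linear" growth can be quantified rather than eyeballed, for example with the Pearson correlation between document counts and measured times. This is only a sketch under assumptions: `LinearityCheck` is a hypothetical helper, and the numbers in `main` are placeholders, not measurements from this thread.

```java
/** Pearson correlation as a rough linearity check (illustrative only). */
final class LinearityCheck {

    /** Returns r in [-1, 1]; values near 1 suggest a linear increase. NaN if either series is constant. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Placeholder inputs (assumptions): document counts and search times in milliseconds.
        double[] docs = {100_000, 200_000, 400_000, 800_000};
        double[] millis = {1.0, 2.1, 3.9, 8.2};
        System.out.println("r = " + pearson(docs, millis));
    }
}
```

A high correlation only says the measured totals scale with index size; it does not say which part of the measured code path is responsible.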

Re: Lucene performance: is search time linear to the index size?

2009-06-18 Thread Erick Erickson
Opening a searcher and doing the first query incurs a significant amount of overhead, cache loading, etc. Inferring search times relative to index size with a program like you describe is unreliable. Try firing a few queries at the index without measuring, *then* measure the time it takes for subs
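
The warm-up advice can be sketched as a small harness that runs the query a few times untimed and only then starts the clock. This is an assumption-laden illustration, not the poster's program: `WarmedUpBenchmark` and `averageNanos` are hypothetical names, and the `IntSupplier` stands in for whatever call executes the query and returns a hit count.

```java
import java.util.function.IntSupplier;

/** Runs untimed warm-up queries before measuring (illustrative only). */
final class WarmedUpBenchmark {

    /** Executes the query warmups times untimed, then returns the mean nanoseconds over measured runs. */
    static double averageNanos(IntSupplier query, int warmups, int measured) {
        if (measured <= 0) {
            throw new IllegalArgumentException("measured must be positive");
        }
        int sink = 0; // accumulate results so the calls cannot be optimized away
        for (int i = 0; i < warmups; i++) {
            sink += query.getAsInt();   // pays for cache loading and JIT compilation
        }
        long start = System.nanoTime();
        for (int i = 0; i < measured; i++) {
            sink += query.getAsInt();   // only these runs are timed
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Integer.MIN_VALUE) {
            System.out.print("");       // keeps sink observable; effectively never taken
        }
        return (double) elapsed / measured;
    }

    public static void main(String[] args) {
        // Stand-in query (assumption): replace with the real search call returning a hit count.
        IntSupplier fakeQuery = () -> 42;
        System.out.println("avg ns per query: " + averageNanos(fakeQuery, 10, 100));
    }
}
```

Running the same harness against each index size, with the searcher opened once outside the timed loop, keeps start-up cost out of the comparison.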

RE: Lucene performance: is search time linear to the index size?

2009-06-18 Thread Teruhiko Kurosaka
Subject: Re: Lucene performance: is search time linear to the > index size? > > Opening a searcher and doing the first query incurs a > significant amount of overhead, cache loading, etc. Inferring > search times relative to index size with a program like you > describe is u

RE: Lucene performance: is search time linear to the index size?

2009-06-18 Thread Jay Booth
disk, not the search time. -----Original Message----- From: Teruhiko Kurosaka [mailto:k...@basistech.com] Sent: Thursday, June 18, 2009 2:55 PM To: java-user@lucene.apache.org Subject: RE: Lucene performance: is search time linear to the index size? Erik, The way I test this program is by is

Re: Lucene performance: is search time linear to the index size?

2009-06-18 Thread Yonik Seeley
ous clauses of the query. -Yonik http://www.lucidimagination.com > -kuro > >> -----Original Message----- >> From: Erick Erickson [mailto:erickerick...@gmail.com] >> Sent: Thursday, June 18, 2009 12:44 AM >> To: java-user@lucene.apache.org >> Subject: Re: Lucene p

RE: Lucene performance: is search time linear to the index size?

2009-06-18 Thread Teruhiko Kurosaka
> From: Jay Booth [mailto:jbo...@wgen.net] > Are you fetching all of the results for your search? No, I'm not doing anything on the search results. This is essentially what I do: searcher = new IndexSearcher(IndexReader.open(indexFileDir)); query = new TermQuery(new Term(fieldNam

Re: Lucene performance: is search time linear to the index size?

2009-06-19 Thread Joel Halbert
> >> Sent: Thursday, June 18, 2009 12:44 AM > >> To: java-user@lucene.apache.org > >> Subject: Re: Lucene performance: is search time linear to the > >> index size? > >> > >> Opening a searcher and doing the first query incurs a > >> significan