RE: Search performance with one index vs. many indexes

2005-02-28 Thread Runde, Kevin
Hi All,

Sorry about that; please disregard my last email. I must not be fully
awake yet.

Sorry,
Kevin Runde 

-Original Message-
From: Runde, Kevin [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 28, 2005 7:34 AM
To: Lucene Users List
Subject: RE: Search performance with one index vs. many indexes

Follow Up to the article from Friday 

-Original Message-
From: Morus Walter [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 28, 2005 1:30 AM
To: Lucene Users List
Subject: Re: Search performance with one index vs. many indexes

Jochen Franke writes:
> Topic: Search performance with large numbers of indexes vs. one large index
> 
> 
> My questions are:
> 
> - Is the size of the "wordlist" the problem?
> - Would we be a lot faster, when we have a smaller number
> of files per index?

Sure.
Look:
Index lookup of a word is O(ln(n)), where n is the number of words.
Index lookup of a word in k indexes having m words each is O(k ln(m)).
In the best case all word lists are distinct (purely theoretical),
that is, n = k*m or m = n/k.
For n = 15 million, k = 800:
ln(n) = 16.5
k*ln(n/k) = 7871
In a realistic case, m is much bigger since word lists won't be
distinct.
But it's the linear factor k that bites you.
In the worst case (all words in all indices) you have
k*ln(n) = 13218.8

HTH
Morus

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





RE: Search performance with one index vs. many indexes

2005-02-28 Thread Runde, Kevin
Follow Up to the article from Friday 

-Original Message-
From: Morus Walter [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 28, 2005 1:30 AM
To: Lucene Users List
Subject: Re: Search performance with one index vs. many indexes

Jochen Franke writes:
> Topic: Search performance with large numbers of indexes vs. one large index
> 
> 
> My questions are:
> 
> - Is the size of the "wordlist" the problem?
> - Would we be a lot faster, when we have a smaller number
> of files per index?

Sure.
Look:
Index lookup of a word is O(ln(n)), where n is the number of words.
Index lookup of a word in k indexes having m words each is O(k ln(m)).
In the best case all word lists are distinct (purely theoretical),
that is, n = k*m or m = n/k.
For n = 15 million, k = 800:
ln(n) = 16.5
k*ln(n/k) = 7871
In a realistic case, m is much bigger since word lists won't be
distinct.
But it's the linear factor k that bites you.
In the worst case (all words in all indices) you have
k*ln(n) = 13218.8

HTH
Morus




Re: Search performance with one index vs. many indexes

2005-02-27 Thread Morus Walter
Jochen Franke writes:
> Topic: Search performance with large numbers of indexes vs. one large index
> 
> 
> My questions are:
> 
> - Is the size of the "wordlist" the problem?
> - Would we be a lot faster, when we have a smaller number
> of files per index?

Sure.
Look:
Index lookup of a word is O(ln(n)), where n is the number of words.
Index lookup of a word in k indexes having m words each is O(k ln(m)).
In the best case all word lists are distinct (purely theoretical),
that is, n = k*m or m = n/k.
For n = 15 million, k = 800:
ln(n) = 16.5
k*ln(n/k) = 7871
In a realistic case, m is much bigger since word lists won't be distinct.
But it's the linear factor k that bites you.
In the worst case (all words in all indices) you have
k*ln(n) = 13218.8
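
The figures above can be reproduced with a few lines of Java. This is only a
sketch of the cost model used in this post (one term lookup costs ln of the
number of terms); the class and method names are made up for illustration,
and the model says nothing about I/O or about merging results.

```java
public class LookupCost {

    /** Cost of one term lookup in a single index holding n terms. */
    static double singleIndex(double n) {
        return Math.log(n);
    }

    /** Cost of looking a term up in k indexes holding m terms each. */
    static double manyIndexes(int k, double m) {
        return k * Math.log(m);
    }

    public static void main(String[] args) {
        double n = 15_000_000; // distinct terms overall
        int k = 800;           // number of indexes

        System.out.printf("one index:  %.1f%n", singleIndex(n));
        // Best case: the k word lists are disjoint, so m = n / k.
        System.out.printf("best case:  %.1f%n", manyIndexes(k, n / k));
        // Worst case: every word occurs in every index, so m = n.
        System.out.printf("worst case: %.1f%n", manyIndexes(k, n));
    }
}
```

Whatever the overlap between the word lists, the result stays between the
best-case and worst-case values, and both carry the factor k.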

HTH
Morus




Search performance with one index vs. many indexes

2005-02-25 Thread Jochen Franke
Topic: Search performance with large numbers of indexes vs. one large index

Hello,

we are experiencing a performance problem when using large
numbers of indexes.

We have an application with about
- 6 million documents
- one index of about 7 GB
- probably 10 to 15 million different words in that index.

Creating the index from one DB (where the documents come from)
on two processors takes about 20 hours.

For several reasons (e.g. parallelizing the index creation), we
created several indexes by splitting the documents into logical groups.

We first created an artificial benchmark:
- 10 million documents
- 500 indexes (with about 3 files per index)
- 10 GB of index altogether
- about 5,000 randomly selected words

Querying these 500 indexes took about 0.4s per query, only
twice the time of querying a single index, which was fine for us.
We ran the same test against one index merged out of the 500 indexes.
The Lucene search performance was fine here as well (about 0.2s per
query on our machine).

We then implemented the "real thing", which is:
- 6 million documents
- 800 indexes (with about 28 files per index)
- about 7 GB index size
- probably 10 to 15 million different words in the indexes.

We now have a query performance of 4-8 seconds per query.
The test with the real data in one index has not been finished
so far.

My questions are:
- Is the size of the "wordlist" the problem?
- Would we be a lot faster if we had a smaller number
  of files per index?
- Is 500-1000 still a reasonable number of indexes?
- Is there a more or less linear relationship between
  the number of indexes and the execution time of the query
  (as all indexes have to be checked and the results have
  to be merged)?
- Are there any parameters that could be configured for
  that use case?
- Should we implement any specialized classes specific to our use case?

Thanks,
Jochen Franke


Re: Search Performance

2005-02-19 Thread sergiu gordea
Michael Celona wrote:
> My index is changing in real time constantly... in this case I guess this
> will not work for me any suggestions...

Using a singleton pattern for your index searcher makes sense anyway.
I don't think that you change the index after each search. The computing
effort is insignificant, but the gain is not.

How often do you optimize your index?
Run your JMeter tests before and after optimization!
What is the value of your merge factor?
Try 2 or 3 and run the tests again.
I think it would be useful for the Lucene community if you shared the
results of your tests.

Best,
 Sergiu
Michael
-Original Message-
From: David Townsend [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 11:50 AM
To: Lucene Users List
Subject: RE: Search Performance

IndexSearchers are thread safe, so you can use the same object on multiple
requests.  If the index is static and not constantly updating, just keep one
IndexSearcher for the life of the app.  If the index changes and you need
that instantly reflected in the results, you need to check if the index has
changed and, if it has, create a new cached IndexSearcher.  To check for changes
you'll need to monitor the version number of the index obtained via
IndexReader.getCurrentVersion(Index Name)
David
-Original Message-
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 16:15
To: Lucene Users List
Subject: Re: Search Performance
Try a singleton pattern or a static field.
Stefan
Michael Celona wrote:
 

I am creating new IndexSearchers... how do I cache my IndexSearcher...
Michael
-Original Message-
From: David Townsend [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 11:00 AM
To: Lucene Users List
Subject: RE: Search Performance

Are you creating new IndexSearchers or IndexReaders on each search?  Caching
your IndexSearchers has a dramatic effect on speed.
David Townsend
-Original Message-
From: Michael Celona [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 15:55
To: Lucene Users List
Subject: Search Performance
What is single handedly the best way to improve search performance?  I have
an index in the 2G range stored on the local file system of the searcher.
Under a load test of 5 simultaneous users my average search time is ~4700
ms.  Under a load test of 10 simultaneous users my average search time is
~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
Xeons.  Any ideas?


Michael


Re: Search Performance

2005-02-18 Thread Otis Gospodnetic
Yes, until it's cleaned up.  As soon as the last client is done with its
Hits, the originating IndexSearcher is eligible for cleanup, provided nobody
else is holding a reference to it.  You can also close it explicitly, as you
are doing; there is no harm in that.

Otis

--- Chris Lamprecht <[EMAIL PROTECTED]> wrote:

> Wouldn't this leave open file handles?   I had a problem where there
> were lots of open file handles for deleted index files, because the
> old searchers were not being closed.
> 
> On Fri, 18 Feb 2005 13:41:37 -0800 (PST), Otis Gospodnetic
> <[EMAIL PROTECTED]> wrote:
> > Or you could just open a new IndexSearcher, forget the old one, and
> > have GC collect it when everyone is done with it.
> > 
> > Otis
> >
> 



Re: Search Performance

2005-02-18 Thread Chris Lamprecht
Wouldn't this leave open file handles?   I had a problem where there
were lots of open file handles for deleted index files, because the
old searchers were not being closed.

On Fri, 18 Feb 2005 13:41:37 -0800 (PST), Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:
> Or you could just open a new IndexSearcher, forget the old one, and
> have GC collect it when everyone is done with it.
> 
> Otis
>




Re: Search Performance

2005-02-18 Thread David Spencer
Michael Celona wrote:
> Just tried that... works like a charm... thanks...

Could you clarify what the problem was - just the overhead of opening
IndexSearchers?

> Michael
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 4:42 PM
To: Lucene Users List; Chris Lamprecht
Subject: Re: Search Performance

Or you could just open a new IndexSearcher, forget the old one, and
have GC collect it when everyone is done with it.
Otis
--- Chris Lamprecht <[EMAIL PROTECTED]> wrote:

I should have mentioned, the reason for not doing this the obvious,
simple way (just close the Searcher and reopen it if a new version is
available) is because some threads could be in the middle of
iterating
through the search Hits.  If you close the Searcher they get a Bad
file descriptor IOException.  As I found out the hard way :)
On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht
<[EMAIL PROTECTED]> wrote:
I recently dealt with the issue of re-using a Searcher with an
index
that changes often.  I wrote a class that allows my searching
classes
to "check out" a lucene Searcher, perform a search, and then return
the Searcher.  It's similar to a database connection pool, except
that


RE: Search Performance

2005-02-18 Thread Michael Celona
Just tried that... works like a charm... thanks...

Michael

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 4:42 PM
To: Lucene Users List; Chris Lamprecht
Subject: Re: Search Performance

Or you could just open a new IndexSearcher, forget the old one, and
have GC collect it when everyone is done with it.

Otis

--- Chris Lamprecht <[EMAIL PROTECTED]> wrote:

> I should have mentioned, the reason for not doing this the obvious,
> simple way (just close the Searcher and reopen it if a new version is
> available) is because some threads could be in the middle of
> iterating
> through the search Hits.  If you close the Searcher they get a Bad
> file descriptor IOException.  As I found out the hard way :)
> 
> 
> On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht
> <[EMAIL PROTECTED]> wrote:
> > I recently dealt with the issue of re-using a Searcher with an
> index
> > that changes often.  I wrote a class that allows my searching
> classes
> > to "check out" a lucene Searcher, perform a search, and then return
> > the Searcher.  It's similar to a database connection pool, except
> that
> 



Re: Search Performance

2005-02-18 Thread Otis Gospodnetic
Or you could just open a new IndexSearcher, forget the old one, and
have GC collect it when everyone is done with it.

Otis

--- Chris Lamprecht <[EMAIL PROTECTED]> wrote:

> I should have mentioned, the reason for not doing this the obvious,
> simple way (just close the Searcher and reopen it if a new version is
> available) is because some threads could be in the middle of
> iterating
> through the search Hits.  If you close the Searcher they get a Bad
> file descriptor IOException.  As I found out the hard way :)
> 
> 
> On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht
> <[EMAIL PROTECTED]> wrote:
> > I recently dealt with the issue of re-using a Searcher with an
> index
> > that changes often.  I wrote a class that allows my searching
> classes
> > to "check out" a lucene Searcher, perform a search, and then return
> > the Searcher.  It's similar to a database connection pool, except
> that
> 



RE: Search Performance

2005-02-18 Thread Michael Celona
Thanks... I am seeing this problem right now.  Has anyone implemented a
better solution?

Michael

-Original Message-
From: Chris Lamprecht [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 4:14 PM
To: Lucene Users List
Subject: Re: Search Performance

I should have mentioned, the reason for not doing this the obvious,
simple way (just close the Searcher and reopen it if a new version is
available) is because some threads could be in the middle of iterating
through the search Hits.  If you close the Searcher they get a Bad
file descriptor IOException.  As I found out the hard way :)


On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht
<[EMAIL PROTECTED]> wrote:
> I recently dealt with the issue of re-using a Searcher with an index
> that changes often.  I wrote a class that allows my searching classes
> to "check out" a lucene Searcher, perform a search, and then return
> the Searcher.  It's similar to a database connection pool, except that




Re: Search Performance

2005-02-18 Thread Chris Lamprecht
I should have mentioned, the reason for not doing this the obvious,
simple way (just close the Searcher and reopen it if a new version is
available) is because some threads could be in the middle of iterating
through the search Hits.  If you close the Searcher they get a Bad
file descriptor IOException.  As I found out the hard way :)


On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht
<[EMAIL PROTECTED]> wrote:
> I recently dealt with the issue of re-using a Searcher with an index
> that changes often.  I wrote a class that allows my searching classes
> to "check out" a lucene Searcher, perform a search, and then return
> the Searcher.  It's similar to a database connection pool, except that




Re: Search Performance

2005-02-18 Thread Chris Lamprecht
I recently dealt with the issue of re-using a Searcher with an index
that changes often.  I wrote a class that allows my searching classes
to "check out" a lucene Searcher, perform a search, and then return
the Searcher.  It's similar to a database connection pool, except that
all clients can share the same Searcher (I don't think there is any
benefit to keeping a true "pool" and giving a different Searcher to
each client -- someone let me know if this is incorrect).

So I just keep a reference count to my Searcher, which gets
incremented at checkout and decremented at checkin.  So the logic is
approximately:

initialize lastVersion to -1

checkout:
if (lucene index version != lastVersion) {
   create a new IndexSearcher and update lastVersion
} 
refcount++;
return the searcher

And on checkin:
refcount--;
if (refcount ==0 and there is a newer lucene index version) {
   close the searcher being checked in
}


Of course there are some more details to keep info on the open
searchers, make it thread-safe, etc.  I also plan to only check for a
new index if some minimum time threshold has passed (5 minutes or so).
 I'd be interested in hearing others' solutions/patterns for this.
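
The check-out/check-in logic above could be fleshed out roughly as follows.
This is a sketch under stated assumptions, not the actual class described in
this message: SharedSearcherPool, indexVersion and opener are hypothetical
names, the resource is any Closeable standing in for a Lucene searcher, and
the version check is a plain callback rather than a Lucene call. Unlike the
pseudocode, it also closes a stale searcher right away when nobody has it
checked out at the moment a newer index version is detected.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class SharedSearcherPool<R extends Closeable> {

    /** One open resource, the index version it was opened on, and its users. */
    private static final class Entry<R> {
        final R resource;
        final long version;
        int refCount;

        Entry(R resource, long version) {
            this.resource = resource;
            this.version = version;
        }
    }

    private final LongSupplier indexVersion; // hypothetical: reports the current index version
    private final Supplier<R> opener;        // hypothetical: opens a fresh searcher
    private final Map<R, Entry<R>> open = new IdentityHashMap<>();
    private Entry<R> current;

    public SharedSearcherPool(LongSupplier indexVersion, Supplier<R> opener) {
        this.indexVersion = indexVersion;
        this.opener = opener;
    }

    /** Hands out the shared resource, opening a new one if the index changed. */
    public synchronized R checkout() {
        long version = indexVersion.getAsLong();
        if (current == null || current.version != version) {
            Entry<R> stale = current;
            current = new Entry<>(opener.get(), version);
            open.put(current.resource, current);
            if (stale != null && stale.refCount == 0) {
                release(stale); // nobody is using the old one, close it now
            }
        }
        current.refCount++;
        return current.resource;
    }

    /** Returns a resource; a stale one is closed once its last user is done. */
    public synchronized void checkin(R resource) {
        Entry<R> entry = open.get(resource);
        if (entry == null) {
            throw new IllegalArgumentException("resource was not checked out here");
        }
        entry.refCount--;
        if (entry.refCount == 0 && entry != current) {
            release(entry);
        }
    }

    private void release(Entry<R> entry) {
        open.remove(entry.resource);
        try {
            entry.resource.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Callers would pair checkout() with checkin() in a try/finally block so that
a failed search still releases its reference. The time threshold mentioned
above is not part of this sketch.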

-Chris

On Fri, 18 Feb 2005 11:57:32 -0500, Michael Celona
<[EMAIL PROTECTED]> wrote:
> My index is changing in real time constantly... in this case I guess this
> will not work for me any suggestions...
> 
> Michael
> 
> -Original Message-
> From: David Townsend [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 18, 2005 11:50 AM
> To: Lucene Users List
> Subject: RE: Search Performance
> 
> IndexSearchers are thread safe, so you can use the same object on multiple
> requests.  If the index is static and not constantly updating, just keep one
> IndexSearcher for the life of the app.  If the index changes and you need
> that instantly reflected in the results, you need to check if the index has
> changed and, if it has, create a new cached IndexSearcher.  To check for changes
> you'll need to monitor the version number of the index obtained via
> 
> IndexReader.getCurrentVersion(Index Name)
> 
> David
> 
> -Original Message-
> From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
> Sent: 18 February 2005 16:15
> To: Lucene Users List
> Subject: Re: Search Performance
> 
> Try a singleton pattern or a static field.
> 
> Stefan
> 
> Michael Celona wrote:
> 
> >I am creating new IndexSearchers... how do I cache my IndexSearcher...
> >
> >Michael
> >
> >-Original Message-
> >From: David Townsend [mailto:[EMAIL PROTECTED]
> >Sent: Friday, February 18, 2005 11:00 AM
> >To: Lucene Users List
> >Subject: RE: Search Performance
> >
> >Are you creating new IndexSearchers or IndexReaders on each search?  Caching
> >your IndexSearchers has a dramatic effect on speed.
> >
> >David Townsend
> >
> >-Original Message-
> >From: Michael Celona [mailto:[EMAIL PROTECTED]
> >Sent: 18 February 2005 15:55
> >To: Lucene Users List
> >Subject: Search Performance
> >
> >
> >What is single handedly the best way to improve search performance?  I have
> >an index in the 2G range stored on the local file system of the searcher.
> >Under a load test of 5 simultaneous users my average search time is ~4700
> >ms.  Under a load test of 10 simultaneous users my average search time is
> >~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
> >Xeons.  Any ideas?
> >
> >
> >
> >Michael



RE: Search Performance

2005-02-18 Thread Michael Celona
I am using the highlighter... does this matter?



-Original Message-
From: David Spencer [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 2:05 PM
To: Lucene Users List
Subject: Re: Search Performance

Are you using the highlighter or doing anything non-trivial in 
displaying the results?

Are the pages being compressed (mod_gzip or some servlet equivalent)? 
This definitely helps, though to see the effect you may have to make 
sure your simulated users are "remote".

Also consider caching search results if it's reasonable to assume users 
may search for the same things.

I made some measurements on caching on my site:

http://www.searchmorph.com/weblog/index.php?id=41
http://www.searchmorph.com/weblog/index.php?id=40

And I use OSCache:

http://www.searchmorph.com/weblog/index.php?id=38
http://www.opensymphony.com/oscache/





Michael Celona wrote:

> What is single handedly the best way to improve search performance?  I have
> an index in the 2G range stored on the local file system of the searcher.
> Under a load test of 5 simultaneous users my average search time is ~4700
> ms.  Under a load test of 10 simultaneous users my average search time is
> ~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
> Xeons.  Any ideas?
> 
>  
> 
> Michael
> 
> 





Re: Search Performance

2005-02-18 Thread David Spencer
Are you using the highlighter or doing anything non-trivial in 
displaying the results?

Are the pages being compressed (mod_gzip or some servlet equivalent)? 
This definitely helps, though to see the effect you may have to make 
sure your simulated users are "remote".

Also consider caching search results if it's reasonable to assume users 
may search for the same things.
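
As a rough illustration of result caching, here is a small least-recently-used
cache built only on the JDK's LinkedHashMap. It is a stand-in sketch and does
not use or imitate the OSCache API; the class name ResultCache and the idea of
keying on the query string are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ResultCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public ResultCache(int maxEntries) {
        // accessOrder = true: entries are ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; evicts the eldest entry when full. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

LinkedHashMap is not thread safe, so in a servlet container the cache would
need to be wrapped, for example with Collections.synchronizedMap, and cached
entries would have to be dropped whenever the index changes.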

I made some measurements on caching on my site:
http://www.searchmorph.com/weblog/index.php?id=41
http://www.searchmorph.com/weblog/index.php?id=40
And I use OSCache:
http://www.searchmorph.com/weblog/index.php?id=38
http://www.opensymphony.com/oscache/


Michael Celona wrote:
What is single handedly the best way to improve search performance?  I have
an index in the 2G range stored on the local file system of the searcher.
Under a load test of 5 simultaneous users my average search time is ~4700
ms.  Under a load test of 10 simultaneous users my average search time is
~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
Xeons.  Any ideas?

 

Michael




Re: Search Performance

2005-02-18 Thread David Spencer
No one has mentioned JVM options yet.
[a] -server
[b] -XX:CompileThreshold=1000
[c] Raise the -Xms value if you haven't done so (-Xms...)
I think by default the VM runs with "-client", but -server makes more
sense for web containers (Tomcat etc).
[b] tells the HotSpot compiler to compile methods sooner. You can lower
the 1000 further: '2', say, makes it compile methods after they've executed 2
times. I had trouble once lowering this to 1, for some reason.


Also, even though you're not supposed to need to do this, I've found it 
helpful to force gc() periodically e.g. every minute via this idiom:

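// Forces a collection; returns the change in used heap in bytes
// (negative when memory was freed).
public static long gc()
{
    long bef = mem();
    System.gc();
    sleep(100);
    System.runFinalization();
    sleep(100);
    System.gc();
    long aft = mem();
    return aft - bef;
}
// mem() and sleep() were not shown in the post; these are assumed versions.
static long mem()
{
    Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory(); // heap currently in use
}
static void sleep(long ms)
{
    try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
}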
Michael Celona wrote:
What is single handedly the best way to improve search performance?  I have
an index in the 2G range stored on the local file system of the searcher.
Under a load test of 5 simultaneous users my average search time is ~4700
ms.  Under a load test of 10 simultaneous users my average search time is
~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
Xeons.  Any ideas?

 

Michael




Re[2]: Search Performance

2005-02-18 Thread Yura Smolsky
Hello, Michael.

btw, you can recreate the IndexSearcher every 5|10|30|60|X minutes

MC> My index is changing in real time constantly... in this case I guess this
MC> will not work for me any suggestions...

MC> Michael

MC> -Original Message-
MC> From: David Townsend [mailto:[EMAIL PROTECTED] 
MC> Sent: Friday, February 18, 2005 11:50 AM
MC> To: Lucene Users List
MC> Subject: RE: Search Performance

MC> IndexSearchers are thread safe, so you can use the same object on multiple
MC> requests.  If the index is static and not constantly updating, just keep one
MC> IndexSearcher for the life of the app.  If the index changes and you need
MC> that instantly reflected in the results, you need to check if the index has
MC> changed and, if it has, create a new cached IndexSearcher.  To check for changes
MC> you'll need to monitor the version number of the index obtained via

MC> IndexReader.getCurrentVersion(Index Name)

MC> David

MC> -Original Message-
MC> From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
MC> Sent: 18 February 2005 16:15
MC> To: Lucene Users List
MC> Subject: Re: Search Performance


MC> Try a singleton pattern or a static field.

MC> Stefan

MC> Michael Celona wrote:

>>I am creating new IndexSearchers... how do I cache my IndexSearcher...
>>
>>Michael
>>
>>-Original Message-
>>From: David Townsend [mailto:[EMAIL PROTECTED] 
>>Sent: Friday, February 18, 2005 11:00 AM
>>To: Lucene Users List
>>Subject: RE: Search Performance
>>
>>Are you creating new IndexSearchers or IndexReaders on each search?  Caching
>>your IndexSearchers has a dramatic effect on speed.
>>
>>David Townsend
>>
>>-Original Message-
>>From: Michael Celona [mailto:[EMAIL PROTECTED]
>>Sent: 18 February 2005 15:55
>>To: Lucene Users List
>>Subject: Search Performance
>>
>>
>>What is single handedly the best way to improve search performance?  I have
>>an index in the 2G range stored on the local file system of the searcher.
>>Under a load test of 5 simultaneous users my average search time is ~4700
>>ms.  Under a load test of 10 simultaneous users my average search time is
>>~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
>>Xeons.  Any ideas?
>>
>> 
>>
>>Michael
>>
Yura Smolsky.






RE: Search Performance

2005-02-18 Thread Michael Celona
My index is changing in real time constantly... in this case I guess this
will not work for me.  Any suggestions?

Michael

-Original Message-
From: David Townsend [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 11:50 AM
To: Lucene Users List
Subject: RE: Search Performance

IndexSearchers are thread safe, so you can use the same object on multiple
requests.  If the index is static and not constantly updating, just keep one
IndexSearcher for the life of the app.  If the index changes and you need
that instantly reflected in the results, you need to check if the index has
changed and, if it has, create a new cached IndexSearcher.  To check for changes
you'll need to monitor the version number of the index obtained via

IndexReader.getCurrentVersion(Index Name)

David

-Original Message-
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 16:15
To: Lucene Users List
Subject: Re: Search Performance


Try a singleton pattern or a static field.

Stefan

Michael Celona wrote:

>I am creating new IndexSearchers... how do I cache my IndexSearcher...
>
>Michael
>
>-Original Message-
>From: David Townsend [mailto:[EMAIL PROTECTED] 
>Sent: Friday, February 18, 2005 11:00 AM
>To: Lucene Users List
>Subject: RE: Search Performance
>
>Are you creating new IndexSearchers or IndexReaders on each search?  Caching
>your IndexSearchers has a dramatic effect on speed.
>
>David Townsend
>
>-Original Message-
>From: Michael Celona [mailto:[EMAIL PROTECTED]
>Sent: 18 February 2005 15:55
>To: Lucene Users List
>Subject: Search Performance
>
>
>What is single handedly the best way to improve search performance?  I have
>an index in the 2G range stored on the local file system of the searcher.
>Under a load test of 5 simultaneous users my average search time is ~4700
>ms.  Under a load test of 10 simultaneous users my average search time is
>~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
>Xeons.  Any ideas?
>
> 
>
>Michael



RE: Search Performance

2005-02-18 Thread David Townsend
IndexSearchers are thread safe, so you can use the same object for multiple
requests.  If the index is static and not constantly updating, just keep one
IndexSearcher for the life of the app.  If the index changes and you need that
instantly reflected in the results, you need to check whether the index has
changed and, if it has, create a new cached IndexSearcher.  To check for changes
you'll need to monitor the version number of the index obtained via

IndexReader.getCurrentVersion(Index Name)
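
A minimal sketch of that version check, assuming two callbacks supplied by
the application: one that reports the current index version and one that
opens a searcher. CachedSearcher, indexVersion and opener are hypothetical
names, and no Lucene classes are used; in a real application the version
callback would presumably wrap the call named above.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class CachedSearcher<S> {

    private final LongSupplier indexVersion; // hypothetical: reports the current index version
    private final Supplier<S> opener;        // hypothetical: opens a new searcher
    private S cached;
    private long cachedVersion;

    public CachedSearcher(LongSupplier indexVersion, Supplier<S> opener) {
        this.indexVersion = indexVersion;
        this.opener = opener;
    }

    /** Returns the cached searcher, replacing it first if the index version changed. */
    public synchronized S get() {
        long version = indexVersion.getAsLong();
        if (cached == null || version != cachedVersion) {
            cached = opener.get();
            cachedVersion = version;
        }
        return cached;
    }
}
```

The previous searcher is simply dropped here, not closed, so threads that are
still reading results from it are not interrupted.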

David

-Original Message-
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 16:15
To: Lucene Users List
Subject: Re: Search Performance


Try a singleton pattern or a static field.

Stefan

Michael Celona wrote:

>I am creating new IndexSearchers... how do I cache my IndexSearcher...
>
>Michael
>
>-Original Message-
>From: David Townsend [mailto:[EMAIL PROTECTED] 
>Sent: Friday, February 18, 2005 11:00 AM
>To: Lucene Users List
>Subject: RE: Search Performance
>
>Are you creating new IndexSearchers or IndexReaders on each search?  Caching
>your IndexSearchers has a dramatic effect on speed.
>
>David Townsend
>
>-Original Message-
>From: Michael Celona [mailto:[EMAIL PROTECTED]
>Sent: 18 February 2005 15:55
>To: Lucene Users List
>Subject: Search Performance
>
>
>What is single handedly the best way to improve search performance?  I have
>an index in the 2G range stored on the local file system of the searcher.
>Under a load test of 5 simultaneous users my average search time is ~4700
>ms.  Under a load test of 10 simultaneous users my average search time is
>~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
>Xeons.  Any ideas?
>
> 
>
>Michael
>
>



Re: Search Performance

2005-02-18 Thread Stefan Groschupf
Try a singleton pattern or a static field.
Stefan
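
One way the singleton suggestion could look, sketched in plain Java with the initialization-on-demand holder idiom. ExpensiveSearcher is a hypothetical stand-in for a thread-safe object that is costly to open (such as an IndexSearcher), and SearcherHolder is likewise an illustrative name; neither is a Lucene class.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an expensive, thread-safe searcher object. */
final class ExpensiveSearcher {
    // Counts constructions so the "created once" property is observable.
    static final AtomicInteger CREATED = new AtomicInteger();

    ExpensiveSearcher() {
        CREATED.incrementAndGet();
    }
}

/** Singleton access point: every caller shares the same instance. */
final class SearcherHolder {
    private SearcherHolder() {
    }

    // The nested class is initialized by the JVM on first access to INSTANCE,
    // exactly once and in a thread-safe way.
    private static final class Lazy {
        static final ExpensiveSearcher INSTANCE = new ExpensiveSearcher();
    }

    static ExpensiveSearcher get() {
        return Lazy.INSTANCE;
    }
}
```

Each request handler would call SearcherHolder.get() instead of constructing its own searcher. This form suits an index that does not change while the application is running, since the instance is never replaced.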

Michael Celona wrote:

>I am creating new IndexSearchers... how do I cache my IndexSearcher...
>
>Michael
>
>-Original Message-
>From: David Townsend [mailto:[EMAIL PROTECTED] 
>Sent: Friday, February 18, 2005 11:00 AM
>To: Lucene Users List
>Subject: RE: Search Performance
>
>Are you creating new IndexSearchers or IndexReaders on each search?  Caching
>your IndexSearchers has a dramatic effect on speed.
>
>David Townsend
>
>-Original Message-
>From: Michael Celona [mailto:[EMAIL PROTECTED]
>Sent: 18 February 2005 15:55
>To: Lucene Users List
>Subject: Search Performance
>
>What is single handedly the best way to improve search performance?  I have
>an index in the 2G range stored on the local file system of the searcher.
>Under a load test of 5 simultaneous users my average search time is ~4700
>ms.  Under a load test of 10 simultaneous users my average search time is
>~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
>Xeons.  Any ideas?
>
>Michael


RE: Search Performance

2005-02-18 Thread Michael Celona
I am creating new IndexSearchers... how do I cache my IndexSearcher...

Michael

-Original Message-
From: David Townsend [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 18, 2005 11:00 AM
To: Lucene Users List
Subject: RE: Search Performance

Are you creating new IndexSearchers or IndexReaders on each search?  Caching
your IndexSearchers has a dramatic effect on speed.

David Townsend

-Original Message-
From: Michael Celona [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 15:55
To: Lucene Users List
Subject: Search Performance


What is single handedly the best way to improve search performance?  I have
an index in the 2G range stored on the local file system of the searcher.
Under a load test of 5 simultaneous users my average search time is ~4700
ms.  Under a load test of 10 simultaneous users my average search time is
~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
Xeons.  Any ideas?

 

Michael





RE: Search Performance

2005-02-18 Thread David Townsend
Are you creating new IndexSearchers or IndexReaders on each search?  Caching 
your IndexSearchers has a dramatic effect on speed.

David Townsend

-Original Message-
From: Michael Celona [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 15:55
To: Lucene Users List
Subject: Search Performance


What is single handedly the best way to improve search performance?  I have
an index in the 2G range stored on the local file system of the searcher.
Under a load test of 5 simultaneous users my average search time is ~4700
ms.  Under a load test of 10 simultaneous users my average search time is
~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
Xeons.  Any ideas?

 

Michael





Search Performance

2005-02-18 Thread Michael Celona
What is single handedly the best way to improve search performance?  I have
an index in the 2G range stored on the local file system of the searcher.
Under a load test of 5 simultaneous users my average search time is ~4700
ms.  Under a load test of 10 simultaneous users my average search time is
~1 ms.  I have given the JVM 2G of memory and am using dual 3GHz
Xeons.  Any ideas?

 

Michael