RE: Search performance with one index vs. many indexes
Hi All,

Sorry about that, please disregard that last email. I must not be fully awake yet.

Sorry,
Kevin Runde

-----Original Message-----
From: Runde, Kevin [mailto:[EMAIL PROTECTED]
Sent: Monday, February 28, 2005 7:34 AM
To: Lucene Users List
Subject: RE: Search performance with one index vs. many indexes

Follow Up to the article from Friday

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: Search performance with one index vs. many indexes
Follow Up to the article from Friday

-----Original Message-----
From: Morus Walter [mailto:[EMAIL PROTECTED]
Sent: Monday, February 28, 2005 1:30 AM
To: Lucene Users List
Subject: Re: Search performance with one index vs. many indexes

Jochen Franke writes:
> My questions are:
>
> - Is the size of the "wordlist" the problem?
> - Would we be a lot faster, when we have a smaller number
>   of files per index?

Sure. Look:
Index lookup of a word is O(ln(n)), where n is the number of words.
Index lookup of a word in k indexes having m words each is O(k ln(m)).
In the best case all word lists are distinct (purely theoretical), that is,
n = k*m or m = n/k.
For n = 15 million, k = 800:
  ln(n) = 16.5
  k*ln(n/k) = 7871
In a realistic case, m is much bigger, since word lists won't be distinct.
But it's the linear factor k that bites you.
In the worst case (all words in all indices) you have
  k*ln(n) = 13218.8

HTH
Morus
Re: Search performance with one index vs. many indexes
Jochen Franke writes:
> Topic: Search performance with large numbers of indexes vs. one large index
>
> My questions are:
>
> - Is the size of the "wordlist" the problem?
> - Would we be a lot faster, when we have a smaller number
>   of files per index?

Sure. Look:
Index lookup of a word is O(ln(n)), where n is the number of words.
Index lookup of a word in k indexes having m words each is O(k ln(m)).
In the best case all word lists are distinct (purely theoretical), that is,
n = k*m or m = n/k.
For n = 15 million, k = 800:
  ln(n) = 16.5
  k*ln(n/k) = 7871
In a realistic case, m is much bigger, since word lists won't be distinct.
But it's the linear factor k that bites you.
In the worst case (all words in all indices) you have
  k*ln(n) = 13218.8

HTH
Morus
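For anyone who wants to check the arithmetic in Morus's message, here is a minimal Java sketch of the same cost model. The class and method names are made up for illustration; the model itself (ln(n) for one index, k*ln(m) for k indexes of m words each) is taken from the message and is only a rough estimate of term-lookup work, not a measurement.

```java
public class LookupCost {
    /** Cost model for a lookup in one index with n distinct words: ln(n). */
    public static double oneIndex(double n) {
        return Math.log(n);
    }

    /** Cost model for a lookup across k indexes with m distinct words each: k * ln(m). */
    public static double manyIndexes(int k, double m) {
        return k * Math.log(m);
    }

    public static void main(String[] args) {
        double n = 15_000_000; // total number of distinct words (figure from the message)
        int k = 800;           // number of indexes (figure from the message)

        System.out.printf("one index:            %.1f%n", oneIndex(n));
        // Best case: word lists are disjoint, so each index holds m = n/k words.
        System.out.printf("best case  (m = n/k): %.1f%n", manyIndexes(k, n / k));
        // Worst case: every word occurs in every index, so m = n.
        System.out.printf("worst case (m = n):   %.1f%n", manyIndexes(k, n));
    }
}
```

Whatever m turns out to be, the result is dominated by the factor k, which is the point of the message: with 800 indexes the modelled lookup cost is several hundred times that of a single merged index.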
Search performance with one index vs. many indexes
Topic: Search performance with large numbers of indexes vs. one large index

Hello,

we are experiencing a performance problem when using large numbers of indexes.

We have an application with:

  - about 6 million documents
  - one index of about 7 GB
  - probably 10 to 15 million different words in that index

The creation of the index out of one DB (where the documents are coming from) with two processors takes about 20 hours.

For several reasons (e.g. parallelizing the index creation), we created several indexes by splitting the documents into logical groups.

We first created an artificial benchmark:

  - 10 million documents
  - 500 indexes (with about 3 files per index)
  - 10 GB of index altogether
  - about 5,000 randomly selected words

Querying these indexes took about 0.4s per query, so it was only twice the time of querying one index, which was fine for us.

We did the same with one index merged out of the 500 indexes. The Lucene search performance was fine here as well (about 0.2s per query on our machine).

We then implemented the "real thing", which is:

  - 6 million documents
  - 800 indexes (with about 28 files per index)
  - about 7 GB index size
  - probably 10 to 15 million different words in that index

We now have a query performance of 4-8 seconds per query.

The test with the real data in one index has not been finished so far.

My questions are:

  - Is the size of the "wordlist" the problem?
  - Would we be a lot faster if we had a smaller number of files per index?
  - Is 500-1000 still a reasonable number of indexes?
  - Is there a more or less linear relationship between the number of indexes and the execution time of the query (as all indexes have to be checked and the results have to be merged)?
  - Are there any parameters that could be configured for this use case?
  - Should we implement any specialized classes specific to our use case?

Thanks,

Jochen Franke
Re: Search Performance
Michael Celona wrote:
> My index is changing in real time constantly... in this case I guess this
> will not work for me any suggestions...

Using a singleton pattern for your index searcher makes sense anyway... I don't think that you change the index after each search. The computing effort is insignificant, but the gain is not.

How often do you optimize your index? Run your JMeter tests before and after optimization!

What is the value of your merge factor? Try to use 2 or 3 and run the tests again.

I think it would be useful for the Lucene community if you provided the results of your tests.

Best,

Sergiu
Re: Search Performance
Yes, until it's cleaned up. As soon as the last client is done with Hits, the originating IndexSearcher is ready for cleanup, provided nobody else is holding references to it. You can also close it explicitly, as you are doing; no harm.

Otis

--- Chris Lamprecht <[EMAIL PROTECTED]> wrote:
> Wouldn't this leave open file handles? I had a problem where there
> were lots of open file handles for deleted index files, because the
> old searchers were not being closed.
>
> On Fri, 18 Feb 2005 13:41:37 -0800 (PST), Otis Gospodnetic
> <[EMAIL PROTECTED]> wrote:
> > Or you could just open a new IndexSearcher, forget the old one, and
> > have GC collect it when everyone is done with it.
> >
> > Otis
Re: Search Performance
Wouldn't this leave open file handles? I had a problem where there were lots of open file handles for deleted index files, because the old searchers were not being closed.

On Fri, 18 Feb 2005 13:41:37 -0800 (PST), Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Or you could just open a new IndexSearcher, forget the old one, and
> have GC collect it when everyone is done with it.
>
> Otis
Re: Search Performance
Michael Celona wrote:
> Just tried that... works like a charm... thanks...

Could you clarify what the problem was? Was it just the overhead of opening IndexSearchers?
RE: Search Performance
Just tried that... works like a charm... thanks...

Michael

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Friday, February 18, 2005 4:42 PM
To: Lucene Users List; Chris Lamprecht
Subject: Re: Search Performance

Or you could just open a new IndexSearcher, forget the old one, and have GC collect it when everyone is done with it.

Otis
Re: Search Performance
Or you could just open a new IndexSearcher, forget the old one, and have GC collect it when everyone is done with it.

Otis

--- Chris Lamprecht <[EMAIL PROTECTED]> wrote:
> I should have mentioned, the reason for not doing this the obvious,
> simple way (just close the Searcher and reopen it if a new version is
> available) is because some threads could be in the middle of iterating
> through the search Hits. If you close the Searcher they get a Bad
> file descriptor IOException. As I found out the hard way :)
RE: Search Performance
Thanks... I am seeing this problem right now. Has anyone implemented a better solution...?

Michael

-----Original Message-----
From: Chris Lamprecht [mailto:[EMAIL PROTECTED]
Sent: Friday, February 18, 2005 4:14 PM
To: Lucene Users List
Subject: Re: Search Performance

I should have mentioned, the reason for not doing this the obvious, simple way (just close the Searcher and reopen it if a new version is available) is because some threads could be in the middle of iterating through the search Hits. If you close the Searcher they get a Bad file descriptor IOException. As I found out the hard way :)
Re: Search Performance
I should have mentioned: the reason for not doing this the obvious, simple way (just close the Searcher and reopen it if a new version is available) is that some threads could be in the middle of iterating through the search Hits. If you close the Searcher, they get a "Bad file descriptor" IOException. As I found out the hard way :)

On Fri, 18 Feb 2005 15:03:29 -0600, Chris Lamprecht <[EMAIL PROTECTED]> wrote:
> I recently dealt with the issue of re-using a Searcher with an index
> that changes often. I wrote a class that allows my searching classes
> to "check out" a lucene Searcher, perform a search, and then return
> the Searcher. It's similar to a database connection pool, except that
Re: Search Performance
I recently dealt with the issue of re-using a Searcher with an index that changes often. I wrote a class that allows my searching classes to "check out" a Lucene Searcher, perform a search, and then return the Searcher. It's similar to a database connection pool, except that all clients can share the same Searcher (I don't think there is any benefit to keeping a true "pool" and giving a different Searcher to each client -- someone let me know if this is incorrect). So I just keep a reference count to my Searcher, which gets incremented at checkout and decremented at checkin. So the logic is approximately:

initialize lastVersion to -1

checkout:
  if (lucene index version != lastVersion) {
    create a new IndexSearcher and update lastVersion
  }
  refcount++;
  return the searcher

And on checkin:
  refcount--;
  if (refcount == 0 and there is a newer lucene index version) {
    close the searcher being checked in
  }

Of course there are some more details: keeping info on the open searchers, making it thread-safe, etc. I also plan to only check for a new index if some minimum time threshold has passed (5 minutes or so).

I'd be interested in hearing others' solutions/patterns for this.

-Chris

On Fri, 18 Feb 2005 11:57:32 -0500, Michael Celona <[EMAIL PROTECTED]> wrote:
> My index is changing in real time constantly... in this case I guess this
> will not work for me any suggestions...
>
> Michael
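The check-out/check-in scheme Chris describes could be sketched roughly as below. This is not his class, which was not posted. To stay self-contained it avoids Lucene entirely: the searcher is any Closeable, and `currentVersion` and `opener` are hypothetical callbacks that would wrap the index version check and the IndexSearcher constructor in a real application. The reference count is kept per searcher (one of the "more details" he mentions), and the 5-minute check threshold is left out.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

/** Shares one searcher per index version and closes stale ones once they are idle. */
public class SearcherPool<S extends Closeable> {

    /** A searcher plus the index version it was opened for and its checkout count. */
    public static final class Lease<S> {
        public final S searcher;
        final long version;
        int refCount;

        Lease(S searcher, long version) {
            this.searcher = searcher;
            this.version = version;
        }
    }

    private final LongSupplier currentVersion; // hypothetical: returns the index version
    private final LongFunction<S> opener;      // hypothetical: opens a searcher for a version
    private Lease<S> current;

    public SearcherPool(LongSupplier currentVersion, LongFunction<S> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
    }

    /** Returns the shared searcher, opening a new one first if the index has changed. */
    public synchronized Lease<S> checkout() {
        long version = currentVersion.getAsLong();
        if (current == null || current.version != version) {
            Lease<S> old = current;
            current = new Lease<>(opener.apply(version), version);
            // Nobody is using the old searcher, so it can be closed right away.
            if (old != null && old.refCount == 0) {
                close(old.searcher);
            }
        }
        current.refCount++;
        return current;
    }

    /** Gives a searcher back; a superseded searcher is closed by its last user. */
    public synchronized void checkin(Lease<S> lease) {
        lease.refCount--;
        if (lease.refCount == 0 && lease != current) {
            close(lease.searcher);
        }
    }

    private static void close(Closeable c) {
        try {
            c.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Callers would wrap each search in checkout() and a finally block that calls checkin(), and finish iterating over their results before checking in. That way a searcher is never closed underneath a thread that is still reading from it.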
RE: Search Performance
I am using the highlighter... does this matter?

-----Original Message-----
From: David Spencer [mailto:[EMAIL PROTECTED]
Sent: Friday, February 18, 2005 2:05 PM
To: Lucene Users List
Subject: Re: Search Performance

Are you using the highlighter or doing anything non-trivial in displaying the results?

Are the pages being compressed (mod_gzip or some servlet equivalent)? This definitely helps, though to see the effect you may have to make sure your simulated users are "remote".

Also consider caching search results if it's reasonable to assume users may search for the same things. I made some measurements on caching on my site:

http://www.searchmorph.com/weblog/index.php?id=41
http://www.searchmorph.com/weblog/index.php?id=40

And I use OSCache:

http://www.searchmorph.com/weblog/index.php?id=38
http://www.opensymphony.com/oscache/
Re: Search Performance
Are you using the highlighter or doing anything non-trivial in displaying the results?

Are the pages being compressed (mod_gzip or some servlet equivalent)? This definitely helps, though to see the effect you may have to make sure your simulated users are "remote".

Also consider caching search results if it's reasonable to assume users may search for the same things. I made some measurements on caching on my site:

http://www.searchmorph.com/weblog/index.php?id=41
http://www.searchmorph.com/weblog/index.php?id=40

And I use OSCache:

http://www.searchmorph.com/weblog/index.php?id=38
http://www.opensymphony.com/oscache/

Michael Celona wrote:
> What is single-handedly the best way to improve search performance? I have
> an index in the 2G range stored on the local file system of the searcher.
> Under a load test of 5 simultaneous users my average search time is ~4700
> ms. Under a load test of 10 simultaneous users my average search time is
> ~1 ms. I have given the JVM 2G of memory and am using dual 3GHz
> Xeons. Any ideas?
>
> Michael
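To make the result-caching suggestion concrete, here is a small sketch of a least-recently-used cache keyed by the query string. It does not use the OSCache API. It is a plain-JDK stand-in built on LinkedHashMap, and the class name, the `search` callback and the size limit are all assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Small LRU cache for search results, keyed by the query string. */
public class ResultCache<R> {
    private final Map<String, R> lru;

    public ResultCache(final int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.lru = new LinkedHashMap<String, R>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, R> eldest) {
                return size() > maxEntries; // evict the least recently used entry
            }
        };
    }

    /** Returns the cached result for the query, running the search only on a miss. */
    public synchronized R get(String query, Function<String, R> search) {
        R result = lru.get(query);
        if (result == null) {
            // The lock is held while searching, which keeps the sketch simple
            // but serializes cache misses.
            result = search.apply(query);
            lru.put(query, result);
        }
        return result;
    }

    public synchronized int size() {
        return lru.size();
    }
}
```

With an index that changes constantly, cached results go stale, so a real setup would also need to expire entries after some time or clear the cache when the index version changes.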
Re: Search Performance
No one has mentioned JVM options yet.

[a] -server
[b] -XX:CompileThreshold=1000
[c] Raise the -Xms value if you haven't done so (-Xms...)

I think by default the VM runs with "-client", but -server makes more sense for web containers (Tomcat etc).

[b] tells the HotSpot compiler to compile methods sooner. You can lower the 1000: say, '2' makes it compile methods after they've executed 2 times. I had trouble once lowering this to 1, for some reason.

Also, even though you're not supposed to need to do this, I've found it helpful to force gc() periodically, e.g. every minute, via this idiom:

public static long gc()
{
    long bef = mem();
    System.gc();
    sleep(100);
    System.runFinalization();
    sleep(100);
    System.gc();
    long aft = mem();
    return aft - bef;
}

Michael Celona wrote:
> What is single-handedly the best way to improve search performance? I have
> an index in the 2G range stored on the local file system of the searcher.
> Under a load test of 5 simultaneous users my average search time is ~4700
> ms. Under a load test of 10 simultaneous users my average search time is
> ~1 ms. I have given the JVM 2G of memory and am using dual 3GHz
> Xeons. Any ideas?
>
> Michael
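The gc() idiom in this message calls mem() and sleep(), which were not posted. A self-contained version might look like the sketch below. The class name is made up, and the bodies of mem() and sleep() are guesses: mem() is assumed to mean "heap bytes currently in use" and sleep() a plain Thread.sleep wrapper. Note that System.gc() is only a request to the VM, and System.runFinalization() is deprecated in recent JDKs.

```java
public class GcHelper {
    /** Heap bytes currently in use (assumed meaning of the original mem()). */
    static long mem() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Sleeps for the given number of milliseconds (assumed meaning of the original sleep()). */
    static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
        }
    }

    /** Requests a collection and returns the change in used heap (negative if memory was freed). */
    public static long gc() {
        long before = mem();
        System.gc();
        sleep(100);
        System.runFinalization();
        sleep(100);
        System.gc();
        long after = mem();
        return after - before;
    }

    public static void main(String[] args) {
        System.out.println("used heap changed by " + gc() + " bytes");
    }
}
```

To run it every minute as suggested, one option would be a java.util.Timer or a scheduled executor that calls GcHelper.gc(). Whether forcing collections helps at all depends on the VM and workload, so it is worth measuring before and after.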
Re[2]: Search Performance
Hello, Michael.

By the way, you can recreate the IndexSearcher every 5|10|30|60|X minutes.

MC> My index is changing in real time constantly... in this case I guess this
MC> will not work for me any suggestions...

MC> Michael

Yura Smolsky.
RE: Search Performance
My index is changing in real time constantly... in this case I guess this will not work for me. Any suggestions...?

Michael

-----Original Message-----
From: David Townsend [mailto:[EMAIL PROTECTED]
Sent: Friday, February 18, 2005 11:50 AM
To: Lucene Users List
Subject: RE: Search Performance

IndexSearchers are thread safe, so you can use the same object on multiple requests. If the index is static and not constantly updating, just keep one IndexSearcher for the life of the app. If the index changes and you need that instantly reflected in the results, you need to check if the index has changed, if it has create a new cached IndexSearcher. To check for changes use you'll need to monitor the version number of the index obtained via

IndexReader.getCurrentVersion(Index Name)

David
RE: Search Performance
IndexSearchers are thread safe, so you can use the same object on multiple requests. If the index is static and not constantly updating, just keep one IndexSearcher for the life of the app. If the index changes and you need that instantly reflected in the results, you need to check whether the index has changed and, if it has, create a new cached IndexSearcher. To check for changes, you'll need to monitor the version number of the index, obtained via

IndexReader.getCurrentVersion(Index Name)

David

-----Original Message-----
From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 16:15
To: Lucene Users List
Subject: Re: Search Performance

Try a singleton pattern or an static field.

Stefan
Re: Search Performance
Try a singleton pattern or a static field.

Stefan

Michael Celona wrote:
> I am creating new IndexSearchers... how do I cache my IndexSearcher...
>
> Michael
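As a rough illustration of the "keep one searcher around" advice in this thread, the sketch below caches a single shared object and replaces it only when a version number changes. It is deliberately Lucene-free so it runs on its own: `currentVersion` and `opener` are hypothetical callbacks, and in a real application they would wrap the index version lookup and the IndexSearcher constructor. Old searchers are simply dropped and left to the garbage collector rather than closed.

```java
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

/** Caches one shared searcher and replaces it when the index version changes. */
public class CachedSearcher<S> {
    private final LongSupplier currentVersion; // hypothetical: returns the index version
    private final LongFunction<S> opener;      // hypothetical: opens a searcher for a version
    private long cachedVersion;
    private S cached;

    public CachedSearcher(LongSupplier currentVersion, LongFunction<S> opener) {
        this.currentVersion = currentVersion;
        this.opener = opener;
    }

    /** Returns the cached searcher, opening a fresh one only if the version has changed. */
    public synchronized S get() {
        long version = currentVersion.getAsLong();
        if (cached == null || version != cachedVersion) {
            cached = opener.apply(version); // the previous searcher is left to the GC
            cachedVersion = version;
        }
        return cached;
    }
}
```

One instance of this class would be held in a static field or singleton and shared by all request threads, so the cost of opening a searcher is paid once per index change instead of once per search.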
RE: Search Performance
I am creating new IndexSearchers... how do I cache my IndexSearcher...

Michael

-----Original Message-----
From: David Townsend [mailto:[EMAIL PROTECTED]
Sent: Friday, February 18, 2005 11:00 AM
To: Lucene Users List
Subject: RE: Search Performance

Are you creating new IndexSearchers or IndexReaders on each search? Caching your IndexSearchers has a dramatic effect on speed.

David Townsend
RE: Search Performance
Are you creating new IndexSearchers or IndexReaders on each search? Caching your IndexSearchers has a dramatic effect on speed.

David Townsend

-----Original Message-----
From: Michael Celona [mailto:[EMAIL PROTECTED]
Sent: 18 February 2005 15:55
To: Lucene Users List
Subject: Search Performance

What is single-handedly the best way to improve search performance? I have an index in the 2G range stored on the local file system of the searcher. Under a load test of 5 simultaneous users my average search time is ~4700 ms. Under a load test of 10 simultaneous users my average search time is ~1 ms. I have given the JVM 2G of memory and am using dual 3GHz Xeons. Any ideas?

Michael
Search Performance
What is single-handedly the best way to improve search performance? I have an index in the 2G range stored on the local file system of the searcher. Under a load test of 5 simultaneous users my average search time is ~4700 ms. Under a load test of 10 simultaneous users my average search time is ~1 ms. I have given the JVM 2G of memory and am using dual 3GHz Xeons. Any ideas?

Michael