CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
Hi! Is there a particular reason why CachingWrapperFilter caches per IndexReader and not per IndexReader.directory()? If there are multiple IndexSearcher/IndexReader instances (and only one Directory), the cache will be built and held in memory redundantly. I don't see any sense in doing so (?).

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Grant Ingersoll
My guess would be b/c best practice is usually to only have one Reader/ Searcher per Directory, but I don't know if that is the real reason. Most discussions/testing I have seen indicate a single Reader/Searcher performs best. -Grant On Jan 1, 2008, at 11:57 AM, Timo Nentwig wrote: Hi!

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Shailendra Sharma
> > Is there a particular reason why CachingWrapperFilter caches per > IndexReader > and not per IndexReader.directory()? If there are multiple > IndexSearcher/IndexReader instances (and only one Directory) cache will be > built and held in memory redundantly. I don't see any sense in doing so >

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
On Tuesday 01 January 2008 19:38:48 Shailendra Sharma wrote: > > Is there a particular reason why CachingWrapperFilter caches per > > IndexReader > > and not per IndexReader.directory()? If there are multiple > > IndexSearcher/IndexReader instances (and only one Directory) cache will > > be built

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
On Tuesday 01 January 2008 19:26:51 Grant Ingersoll wrote: > My guess would be b/c best practice is usually to only have one Reader/ > Searcher per Directory, but I don't know if that is the real reason. > Most discussions/testing I have seen indicate a single Reader/Searcher > performs best. Well

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Mark Miller
The main reason to use a single IndexReader is because it's very time consuming to open an IndexReader. If your index is pretty static, maybe this is not much of a concern. Otherwise it's a major concern. But let's say it's not...then we have to assume you're going to have a huge index (otherwise the

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
On Tuesday 01 January 2008 21:06:06 Mark Miller wrote: > The main reason to use a single IndexReader is because it's very time > consuming to open an IndexReader. If your index is pretty static, maybe Yes, it takes quite some time to build it and it's not changed but rebuilt from scratch. > Perha

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Mark Miller
I believe that, in general, you'll find that ParallelMultiSearcher is much slower than just using a MultiSearcher. ParallelMultiSearcher is of use when you can put the different indexes on separate hard drives or even better, separate systems (using RMI or something). - Mark Timo Nentwig wrote

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Timo Nentwig
On Tuesday 01 January 2008 22:24:53 Mark Miller wrote: > I believe that, in general, you'll find that ParallelMultiSearcher is You believe or you know? And if you know why is there a ParallelMultiSearcher at all? :) And I still wonder why everybody believes and finds out on his own why isn't th

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Mark Miller
The reason there is a ParallelMultiSearcher is because of the reasons given: if you are distributing your index across machines or hard drives, doing things in parallel is faster. I don't think RAID counts. RAID will do the parallelism for you with a single index. I say I believe because it's h

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Erick Erickson
On Jan 1, 2008 4:40 PM, Timo Nentwig <[EMAIL PROTECTED]> wrote: > On Tuesday 01 January 2008 22:24:53 Mark Miller wrote: > > I believe that, in general, you'll find that ParallelMultiSearcher is > > You believe or you know? And if you know why is there a > ParallelMultiSearcher > at all? :) > > An

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Grant Ingersoll
On Jan 1, 2008, at 4:40 PM, Timo Nentwig wrote: On Tuesday 01 January 2008 22:24:53 Mark Miller wrote: I believe that, in general, you'll find that ParallelMultiSearcher is You believe or you know? And if you know why is there a ParallelMultiSearcher at all? :) And I still wonder why eve

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-01 Thread Chris Hostetter
: I suggest to use reader.directory() instead of reader as key for the : WeakHashMap. This way multiple IndexSearcher/IndexReader instances would : share the cache. setting aside discussion of why you should/shouldn't use a single IndexReader, or why the various places in the Lucene code base
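To make the suggestion above concrete, here is a minimal sketch of the two keying strategies. It does not use Lucene: Dir and Reader are hypothetical stand-ins for Directory and IndexReader, and the BitSet is a placeholder for the filter's result. It only shows how the choice of WeakHashMap key decides whether two readers over the same directory share a cache entry.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

public class FilterCacheKeySketch {
    // Hypothetical stand-in for a Lucene Directory (assumption, not the real class).
    static final class Dir {}

    // Hypothetical stand-in for a Lucene IndexReader opened on a Dir.
    static final class Reader {
        private final Dir dir;
        Reader(Dir dir) { this.dir = dir; }
        Dir directory() { return dir; }
    }

    // Entries can be dropped once the key object is no longer strongly referenced.
    private final Map<Object, BitSet> cache = new WeakHashMap<>();
    private int builds = 0;

    // Returns the cached bits for the key, building them on a cache miss.
    synchronized BitSet bits(Object key) {
        BitSet cached = cache.get(key);
        if (cached == null) {
            builds++;               // stands in for the expensive filter computation
            cached = new BitSet();
            cached.set(0);
            cache.put(key, cached);
        }
        return cached;
    }

    synchronized int builds() { return builds; }

    public static void main(String[] args) {
        Dir dir = new Dir();
        Reader r1 = new Reader(dir);
        Reader r2 = new Reader(dir);
        // Hold strong references so no weak entry is cleared during the demo.
        List<Object> keepAlive = Arrays.asList(dir, r1, r2);

        // Keyed by reader: each reader gets its own entry.
        FilterCacheKeySketch perReader = new FilterCacheKeySketch();
        perReader.bits(r1);
        perReader.bits(r2);
        perReader.bits(r1);
        System.out.println("builds when keyed by reader: " + perReader.builds());

        // Keyed by directory: both readers map to the same entry.
        FilterCacheKeySketch perDirectory = new FilterCacheKeySketch();
        perDirectory.bits(r1.directory());
        perDirectory.bits(r2.directory());
        System.out.println("builds when keyed by directory: " + perDirectory.builds());

        System.out.println("objects kept alive: " + keepAlive.size());
    }
}
```

One caveat the sketch ignores: a reader is a point-in-time view of the index, so two readers opened at different times on a changing directory may not see the same documents. Sharing cached bits per directory is only safe if the index does not change while the readers are open, which matches the rebuilt-from-scratch case described earlier in the thread.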

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-06 Thread Timo Nentwig
On Wednesday 02 January 2008 08:03:48 Chris Hostetter wrote: > 1) there is a semi-articulated goal of moving away from "under the > covers" weakref caching to more explicit and controllable caching ... YES! BTW why has caching been removed from QueryFilter at all? Isn't caching the only sens

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-11 Thread Toke Eskildsen
On Tue, 2008-01-01 at 15:06 -0500, Mark Miller wrote: > Perhaps, in some esoteric case, multiple readers is the right idea > (monster, monster, super IO system, static index?? maybe...)...but > unless you have run into this case and have some data to show it, I > would stick with what the commun

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-12 Thread Otis Gospodnetic
On Tue, 2008-01-01 at 15:06 -0500, Mark Miller wrote: > Perhaps, in some esoteric case, multiple readers is the right idea > (monster, monster, super IO system

Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-17 Thread Toke Eskildsen
On Fri, 2008-01-11 at 11:34 +0100, Toke Eskildsen wrote: > As for shared searcher vs. individual searchers, there was just a > slight penalty for using individual searchers. Whoops! Seems like I need better QA for my test-code. I didn't use individual searchers for each thread when I thought I was

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-17 Thread Erick Erickson
There's a section on the Lucene Wiki for real world experiences etc. After you are satisfied with your tests, it'd be great if you could add your measurements to the Wiki! Best Erick On Jan 17, 2008 5:31 AM, Toke Eskildsen <[EMAIL PROTECTED]> wrote: > On Fri, 2008-01-11 at 11:34 +0100, Toke Esk

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-19 Thread Otis Gospodnetic
- Nutch - Original Message From: Toke Eskildsen <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, January 17, 2008 5:31:56 AM Subject: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?) On Fri, 2008-01-11 at 11:34 +0100, Toke Eskildsen wrote

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-20 Thread Michael McCandless
t.com/ -- Lucene - Solr - Nutch - Original Message From: Toke Eskildsen <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thursday, January 17, 2008 5:31:56 AM Subject: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?) On Fri, 2008-01-11 at 11:34

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-20 Thread Mark Miller
On Fri, 2008-01-11 at 11:34 +0100, Toke Eskildsen wrote: As for shared searcher vs. individual searchers, there was just a slight penalty for using individual searchers. Whoops! Seems like I need better QA for my test-code. I didn'

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-21 Thread Toke Eskildsen
On Sun, 2008-01-20 at 05:44 -0500, Michael McCandless wrote: > These results are very interesting. With 3 threads on SSD your > searches run 87% faster if you use 3 IndexSearchers instead of > sharing a single one. That is my observation, yes. Please note that this is with Lucene 2.1. I've tr

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-21 Thread Michael McCandless
Toke Eskildsen wrote: On Sun, 2008-01-20 at 05:44 -0500, Michael McCandless wrote: These results are very interesting. With 3 threads on SSD your searches run 87% faster if you use 3 IndexSearchers instead of sharing a single one. That is my observation, yes. Please note that this is with L

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-21 Thread Toke Eskildsen
On Mon, 2008-01-21 at 08:32 -0500, Michael McCandless wrote: > Well that is not good news!! From your results below, it looks like > 2.3 searching is 13.6% slower with hard disks and 8.9% slower with SSD. As can be seen, it depends on the configuration. But the overall picture is very consisten

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-21 Thread Yonik Seeley
On Jan 21, 2008 10:32 AM, Toke Eskildsen <[EMAIL PROTECTED]> wrote: > If we > only look at the first 50.000 queries, the difference in speed for > Lucene versions using harddisks is negligible. For SSDs it's quite > visible: Hmmm, I have a hard time thinking what could have slowed down searching..

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-21 Thread Michael Busch
Hi Toke, what kind of queries are you using for your tests? (num query terms, boolean clauses, phrases, wildcards?) -Michael Yonik Seeley wrote: > On Jan 21, 2008 10:32 AM, Toke Eskildsen <[EMAIL PROTECTED]> wrote: >> If we >> only look at the first 50.000 queries, the difference in speed for

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-22 Thread Toke Eskildsen
On Mon, 2008-01-21 at 11:40 -0800, Michael Busch wrote: > what kind of queries are you using for your tests? (num query terms, > boolean clauses, phrases, wildcards?) No numbers (at least not parsed as numbers), no ranges, some wildcards, some phrases. The only non-trivial part of the queries is

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-22 Thread Michael Busch
Thanks for your detailed answer, Toke! Is your default operator AND or OR? Toke Eskildsen wrote: > On Mon, 2008-01-21 at 11:40 -0800, Michael Busch wrote: >> what kind of queries are you using for your tests? (num query terms, >> boolean clauses, phrases, wildcards?) > > No numbers (at least not

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-22 Thread Toke Eskildsen
On Tue, 2008-01-22 at 02:22 -0800, Michael Busch wrote: > Is your default operator AND or OR? AND

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-22 Thread Michael Busch
OK, then Yonik might be right about the multi-level skiplists code which is new in 2.2. I'd love to see the performance numbers of the same index built with 2.3, if possible? You could simply migrate it to 2.3 by using IndexWriter.addIndexes(). In my performance tests (LUCENE-866) I measured an av

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-23 Thread Toke Eskildsen
On Tue, 2008-01-22 at 03:08 -0800, Michael Busch wrote: > OK, then Yonik might be right about the multi-level skiplists code which > is new in 2.2. I'd love to see the performance numbers of the same index > built with 2.3, if possible? You could simply migrate it to 2.3 by using > IndexWriter.addI

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-23 Thread Antony Bowesman
Toke Eskildsen wrote:
== Average over the first 50.000 queries ==
metis_flash_RAID0_8GB_i37_t2_l21.log - 279.6 q/sec
metis_flash_RAID0_8GB_i37_t2_l23.log - 202.3 q/sec
metis_flash_RAID0_8GB_i37_v23_t2_l23.log - 195.9 q/sec
== Average over the first 340.000 queries ==
metis_flash_RAID0_8GB_i37
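For readers who want to turn the three 50.000-query averages quoted above into relative slowdowns, the arithmetic can be sketched as follows. The reading of the log names is an assumption taken from the surrounding replies (l21/l23 as the Lucene version used for searching, v23 as an index built with 2.3); the throughput figures are the ones quoted. With the l21 run as the baseline, the two l23 runs come out roughly 28% and 30% slower.

```java
public class SlowdownSketch {
    // Relative slowdown in percent of a measured throughput against a baseline throughput.
    static double slowdownPercent(double baselineQps, double measuredQps) {
        return (1.0 - measuredQps / baselineQps) * 100.0;
    }

    public static void main(String[] args) {
        double lucene21 = 279.6;        // q/sec, quoted for ..._t2_l21.log
        double lucene23 = 202.3;        // q/sec, quoted for ..._t2_l23.log
        double lucene23Index23 = 195.9; // q/sec, quoted for ..._v23_t2_l23.log

        System.out.printf("l23 vs l21: %.1f%% slower%n",
                slowdownPercent(lucene21, lucene23));
        System.out.printf("v23 + l23 vs l21: %.1f%% slower%n",
                slowdownPercent(lucene21, lucene23Index23));
    }
}
```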

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-24 Thread Toke Eskildsen
On Thu, 2008-01-24 at 08:18 +1100, Antony Bowesman wrote: > These are odd. The last case in both of the above shows a slowdown compared > to > 2.1 index and version and in the first 50K queries, the 2.3 index and version > is > even slower than 2.3 with 2.1 index. It catches up in the longer

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-01-30 Thread Otis Gospodnetic
atext -- http://sematext.com/ -- Lucene - Solr - Nutch >> >> - Original Message ---- >> From: Toke Eskildsen <[EMAIL PROTECTED]> >> To: java-user@lucene.apache.org >> Sent: Thursday, January 17, 2008 5:31:56 AM >> Subject: Mu

Re: Multiple searchers (Was: CachingWrapperFilter: why cache per IndexReader?)

2008-02-22 Thread Toke Eskildsen
On Thu, 2008-01-17 at 10:24 -0500, Erick Erickson wrote: > There's a section on the Lucene Wiki for real world > experiences etc. After you are satisfied with your > tests, it'd be great if you could add your measurements > to the Wiki! Could you please point me to the page? I am unable to find it