Re: IndexSearcher and multi-threaded performance

Michael McCandless Tue, 11 Nov 2008 14:41:26 -0800


Nice results, thanks!

The poor disk-based scaling may be fixed by NIOFSDirectory, if you areon Unix. If you are on Windows it won't help (and will likely beworse than FSDirectory), because of an apparently bug in Sun's JVM onWindows whereby NIO positional file reads seem to share a lock underthe hood.

The poor ram-thread result for 4 & 6 threads is odd. Those numbersought to be at least as good as ram-shared. Is it possible thosecolumns are swapped? Because the ram-shared case should have beenhurt by using a non-read-only IndexReader.


Mike

Dmitri Bichko wrote:

Hi,

I'm pretty new to Lucene, so please bear with me if this has been
covered before.

The wiki suggests sharing a single IndexSearcher between threads for
best performance
(http://wiki.apache.org/lucene-java/ImproveSearchingSpeed).  I've
tested running the same set of queries with: multiple threads sharing
the same searcher, with a separate searcher for each thread, both
shared/private with a RAMDirectory in-memory index, and (just for fun)
in multiple JVMs running concurrently (the results are in milliseconds
to complete the whole job):

threads  multi-jvm  shared  per-thread  ram-shared  ram-thread
     1      72997   70883       72573       60308       60012
     2      33147   48762       35973       25498       25734
     4      16229   46828       21267       13127       27164
     6      13088   47240       14028        9858       29917
     8       9775   47020       10983        8948       10440
    10       8721   50132       11334        9587       11355
    12       7290   49002       11798        9832
    16       9365   47099       12338       11296

The shared searcher indeed behaves better with a ram-based index, but
what's going on with the disk-based one?  It's basically not scaling
beyond two threads. Am I just doing something completely wrong here?

The test consists of about 1,500 Boolean OR queries with 1-10
PhraseQueries each, with 1-20 Terms per PhraseQuery.  I'm using a
HitCollector to count the hits, so I'm not retrieving any results.
The index is about 5GB and 20 million documents.

This is running on a 8 x quad-core Opteron machine with plenty ofRAM to spare.


Any idea why I would see this behaviour?

Thanks,
Dmitri

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: IndexSearcher and multi-threaded performance

Reply via email to