Nice results, thanks!

The poor disk-based scaling may be fixed by NIOFSDirectory, if you are on Unix. If you are on Windows it won't help (and will likely be worse than FSDirectory), because of an apparently bug in Sun's JVM on Windows whereby NIO positional file reads seem to share a lock under the hood.

The poor ram-thread result for 4 & 6 threads is odd. Those numbers ought to be at least as good as ram-shared. Is it possible those columns are swapped? Because the ram-shared case should have been hurt by using a non-read-only IndexReader.

Mike

Dmitri Bichko wrote:

Hi,

I'm pretty new to Lucene, so please bear with me if this has been
covered before.

The wiki suggests sharing a single IndexSearcher between threads for
best performance
(http://wiki.apache.org/lucene-java/ImproveSearchingSpeed).  I've
tested running the same set of queries with: multiple threads sharing
the same searcher, with a separate searcher for each thread, both
shared/private with a RAMDirectory in-memory index, and (just for fun)
in multiple JVMs running concurrently (the results are in milliseconds
to complete the whole job):

threads  multi-jvm  shared  per-thread  ram-shared  ram-thread
     1      72997   70883       72573       60308       60012
     2      33147   48762       35973       25498       25734
     4      16229   46828       21267       13127       27164
     6      13088   47240       14028        9858       29917
     8       9775   47020       10983        8948       10440
    10       8721   50132       11334        9587       11355
    12       7290   49002       11798        9832
    16       9365   47099       12338       11296

The shared searcher indeed behaves better with a ram-based index, but
what's going on with the disk-based one?  It's basically not scaling
beyond two threads. Am I just doing something completely wrong here?

The test consists of about 1,500 Boolean OR queries with 1-10
PhraseQueries each, with 1-20 Terms per PhraseQuery.  I'm using a
HitCollector to count the hits, so I'm not retrieving any results.
The index is about 5GB and 20 million documents.

This is running on a 8 x quad-core Opteron machine with plenty of RAM to spare.

Any idea why I would see this behaviour?

Thanks,
Dmitri

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to