Dmitri Bichko wrote:
32 cores, actually :)
Glossed over that - even better! Killer machine to be able to test this on.
I reran the test with readonly turned on (I changed how the time is
measured a little, it should be more consistent):

threads  fs-thread  ram-thread  fs-shared  ram-shared
      1      71877       54739      73986       61595
      2      34949       26735      43719       28935
      3      25581       26885      38412       19624
      4      20511       31742      38712       15059
      5      19235       24345      39685       12509
      6      16775       26896      39592       10841
      7      17147       18296      46678       10183
      8      18327       19043      39886       10048
      9      16885       18721      40342        9483
     10      17832       30757      44706       10975
     11      17251       21199      39947        9704
     12      17267       36284      40208       10996
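
To make the scaling easier to compare, here is a minimal sketch that turns two columns of the readonly results into speedup relative to the one-thread run. The class and method names (SpeedupTable, speedup) are made up for illustration; only the timings come from the table.

```java
import java.util.Locale;

// Speedup relative to the single-thread run for the fs-shared and
// ram-shared columns of the readonly test (times in milliseconds).
final class SpeedupTable {
    static final int[] THREADS = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
    static final long[] FS_SHARED =
        {73986, 43719, 38412, 38712, 39685, 39592, 46678, 39886, 40342, 44706, 39947, 40208};
    static final long[] RAM_SHARED =
        {61595, 28935, 19624, 15059, 12509, 10841, 10183, 10048, 9483, 10975, 9704, 10996};

    // speedup = time with one thread / time at the given row index
    static double speedup(long[] millis, int index) {
        return (double) millis[0] / millis[index];
    }

    public static void main(String[] args) {
        System.out.println("threads  fs-shared  ram-shared");
        for (int i = 0; i < THREADS.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%7d  %8.2fx  %9.2fx",
                THREADS[i], speedup(FS_SHARED, i), speedup(RAM_SHARED, i)));
        }
    }
}
```

By this measure the shared searcher on disk stays below 2x at every thread count, while the shared searcher on a RAM index is a little over 6x at 8 threads.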

I can't seem to get NIOFSDirectory working, though.  Calling
NIOFSDirectory.getDirectory("foo") just returns an FSDirectory.
That's a good point, and it points out a bug in Solr trunk for me. Frankly, I don't see how it's done. There is no code I can see/find to use it rather than FSDirectory. I'm still assuming there must be a way, but I don't see it...

- Mark
Any ideas?

Cheers,
Dmitri

On Tue, Nov 11, 2008 at 5:09 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
Nice! An 8 core machine with a test ready to go!

How about trying the read only mode that was added to 2.4 on your
IndexReader?

And if you are on Unix and could try trunk and use the new
NIOFSDirectory implementation...that would be awesome.

Those two additions are our current hope for what you're seeing...would be
nice to know if we need to try for more (or if we need to petition the smart
people that work on that stuff to try for more ;) ).
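
For context on why NIOFSDirectory might help: as I understand it, it reads through FileChannel's positional read, which takes an explicit file offset and leaves the channel's own position alone, so threads sharing one open file do not need a lock around seek + read. The sketch below shows only that JDK call, not Lucene code; PositionalRead and readAt are hypothetical names.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Illustration of positional reads on a shared FileChannel (plain JDK,
// not Lucene's implementation).
final class PositionalRead {
    // Reads len bytes starting at the given file offset. The channel's
    // own position is never changed, so concurrent callers do not
    // interfere with one another.
    static byte[] readAt(FileChannel channel, long position, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            // buf.position() is the number of bytes read so far
            int n = channel.read(buf, position + buf.position());
            if (n < 0) {
                throw new EOFException("read past end of file");
            }
        }
        return buf.array();
    }
}
```

Whether this removes the bottleneck in the disk-based shared-searcher case is exactly what the test would need to show.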

- Mark

Dmitri Bichko wrote:
Hi,

I'm pretty new to Lucene, so please bear with me if this has been
covered before.

The wiki suggests sharing a single IndexSearcher between threads for
best performance
(http://wiki.apache.org/lucene-java/ImproveSearchingSpeed).  I've
tested running the same set of queries with: multiple threads sharing
the same searcher, with a separate searcher for each thread, both
shared/private with a RAMDirectory in-memory index, and (just for fun)
in multiple JVMs running concurrently (the results are in milliseconds
to complete the whole job):

threads  multi-jvm  shared  per-thread  ram-shared  ram-thread
     1      72997   70883       72573       60308       60012
     2      33147   48762       35973       25498       25734
     4      16229   46828       21267       13127       27164
     6      13088   47240       14028        9858       29917
     8       9775   47020       10983        8948       10440
    10       8721   50132       11334        9587       11355
    12       7290   49002       11798        9832
    16       9365   47099       12338       11296

The shared searcher indeed behaves better with a ram-based index, but
what's going on with the disk-based one?  It's basically not scaling
beyond two threads. Am I just doing something completely wrong here?

The test consists of about 1,500 Boolean OR queries with 1-10
PhraseQueries each, with 1-20 Terms per PhraseQuery.  I'm using a
HitCollector to count the hits, so I'm not retrieving any results.
The index is about 5GB and 20 million documents.
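
For anyone wanting to reproduce the shape of this test, here is a rough sketch of a timing harness that switches between a shared searcher and one per thread. It is not the code used for the numbers in this thread. QueryRunner is a hypothetical stand-in for whatever wraps the searcher and the hit-counting collector, and the way queries are split across threads is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical stand-in for a searcher plus hit counter; not a Lucene type.
interface QueryRunner {
    int countHits(String query);
}

final class SearchBenchmark {
    static final class Result {
        final long millis;  // wall-clock time for the whole job
        final long hits;    // total hits, to check both setups did the same work
        Result(long millis, long hits) { this.millis = millis; this.hits = hits; }
    }

    // Each worker calls runners.get() once. A supplier that always returns
    // the same instance gives the "shared" setup; a supplier that builds a
    // new instance gives the "per-thread" setup.
    static Result run(List<String> queries, int nThreads, Supplier<QueryRunner> runners)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long start = System.nanoTime();
            for (int t = 0; t < nThreads; t++) {
                final int offset = t;
                parts.add(pool.submit(() -> {
                    QueryRunner runner = runners.get();
                    long hits = 0;
                    // worker t takes queries t, t + nThreads, t + 2 * nThreads, ...
                    for (int i = offset; i < queries.size(); i += nThreads) {
                        hits += runner.countHits(queries.get(i));
                    }
                    return hits;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return new Result((System.nanoTime() - start) / 1_000_000, total);
        } finally {
            pool.shutdown();
        }
    }
}
```

Comparing the total hit count between the shared and per-thread runs is a cheap sanity check that both did the same work before the timings are compared.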

This is running on an 8 x quad-core Opteron machine with plenty of RAM to
spare.

Any idea why I would see this behaviour?

Thanks,
Dmitri

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

