Dmitri Bichko wrote:
32 cores, actually :)
Glossed over that - even better! Killer machine to be able to test this on.
I reran the test with readonly turned on (I changed how the time is
measured a little, it should be more consistent):
threads  fs-thread  ram-thread  fs-shared  ram-shared
      1      71877       54739      73986       61595
      2      34949       26735      43719       28935
      3      25581       26885      38412       19624
      4      20511       31742      38712       15059
      5      19235       24345      39685       12509
      6      16775       26896      39592       10841
      7      17147       18296      46678       10183
      8      18327       19043      39886       10048
      9      16885       18721      40342        9483
     10      17832       30757      44706       10975
     11      17251       21199      39947        9704
     12      17267       36284      40208       10996
I can't seem to get NIOFSDirectory working, though. Calling
NIOFSDirectory.getDirectory("foo") just returns an FSDirectory.
That's a good point, and it points out a bug in Solr trunk for me. Frankly, I
don't see how it's done: I can't find any code that uses it rather
than FSDirectory. I still assume there must be a way, but I don't see it...
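One possibility, offered as an assumption to verify rather than a confirmed answer: the 2.4-era NIOFSDirectory javadoc, as best recalled, describes selecting the implementation through a system property that FSDirectory reads when it is first loaded. The static getDirectory method is inherited, so calling it through NIOFSDirectory would still hand back whatever implementation FSDirectory has chosen. The class and method names in this sketch are hypothetical; only the property name and value come from that javadoc, and both should be checked against the Lucene version in use.

```java
// Sketch only: the property name and value are recalled from the 2.4-era
// NIOFSDirectory javadoc and should be verified against your Lucene version.
final class NioDirectorySelector {
    // Property that FSDirectory is assumed to consult when picking its implementation.
    static final String IMPL_PROPERTY = "org.apache.lucene.FSDirectory.class";
    static final String NIO_IMPL = "org.apache.lucene.store.NIOFSDirectory";

    /**
     * Sets the property. This is assumed to take effect only if it runs before
     * any code touches FSDirectory, because the value is read once at class load.
     */
    static void selectNioImplementation() {
        System.setProperty(IMPL_PROPERTY, NIO_IMPL);
    }

    public static void main(String[] args) {
        selectNioImplementation();
        System.out.println(IMPL_PROPERTY + "=" + System.getProperty(IMPL_PROPERTY));
    }
}
```

The same setting can be passed on the command line with -Dorg.apache.lucene.FSDirectory.class=org.apache.lucene.store.NIOFSDirectory, which avoids any ordering concern inside the program.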
- Mark
Any ideas?
Cheers,
Dmitri
On Tue, Nov 11, 2008 at 5:09 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
Nice! An 8 core machine with a test ready to go!
How about trying the read only mode that was added to 2.4 on your
IndexReader?
And if you are on Unix and could try trunk and use the new
NIOFSDirectory implementation...that would be awesome.
Those two additions are our current hope for what you're seeing...it would be
nice to know if we need to try for more (or if we need to petition the smart
people that work on that stuff to try for more ;) ).
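For background on why NIOFSDirectory might help, here is a minimal JDK-only sketch of the two read styles involved. It assumes the contention comes from threads serializing on a shared file handle for seek-then-read, while a positional FileChannel read carries its own offset and needs no shared cursor. The class and method names are hypothetical and this is not Lucene's code.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class ReadStyles {
    /** Seek-then-read on a shared handle: both calls must be atomic, hence the lock. */
    static byte[] lockedRead(RandomAccessFile file, long position, int length) throws IOException {
        byte[] out = new byte[length];
        synchronized (file) {
            file.seek(position);
            file.readFully(out);
        }
        return out;
    }

    /** Positional read: the offset travels with each call, so no shared cursor moves. */
    static byte[] positionalRead(FileChannel channel, long position, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, position + buffer.position());
            if (n < 0) {
                throw new EOFException("read past end of file");
            }
        }
        return buffer.array();
    }

    /** Writes a small temp file and checks that both styles return the same bytes. */
    static boolean readsAgree() {
        try {
            Path path = Files.createTempFile("read-styles", ".bin");
            try {
                byte[] data = new byte[256];
                for (int i = 0; i < data.length; i++) {
                    data[i] = (byte) i;
                }
                Files.write(path, data);
                try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "r");
                     FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
                    byte[] a = lockedRead(file, 100, 16);
                    byte[] b = positionalRead(channel, 100, 16);
                    return Arrays.equals(a, b) && a[0] == 100;
                }
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reads agree: " + readsAgree());
    }
}
```

Both methods return the same data; the difference only shows up under concurrency, where the locked version lets one thread read at a time per file handle.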
- Mark
Dmitri Bichko wrote:
Hi,
I'm pretty new to Lucene, so please bear with me if this has been
covered before.
The wiki suggests sharing a single IndexSearcher between threads for
best performance
(http://wiki.apache.org/lucene-java/ImproveSearchingSpeed). I've
tested running the same set of queries with: multiple threads sharing
the same searcher, with a separate searcher for each thread, both
shared/private with a RAMDirectory in-memory index, and (just for fun)
in multiple JVMs running concurrently (the results are in milliseconds
to complete the whole job):
threads  multi-jvm  shared  per-thread  ram-shared  ram-thread
      1      72997   70883       72573       60308       60012
      2      33147   48762       35973       25498       25734
      4      16229   46828       21267       13127       27164
      6      13088   47240       14028        9858       29917
      8       9775   47020       10983        8948       10440
     10       8721   50132       11334        9587       11355
     12       7290   49002       11798        9832
     16       9365   47099       12338       11296
The shared searcher indeed behaves better with a ram-based index, but
what's going on with the disk-based one? It's basically not scaling
beyond two threads. Am I just doing something completely wrong here?
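To put a number on "not scaling", the timings can be turned into speedups relative to the single-thread run. This sketch uses only the multi-jvm and shared columns from the table above, and it assumes the two short rows (12 and 16 threads) are missing their last column rather than an earlier one. The class name is hypothetical.

```java
final class SpeedupTable {
    static final int[] THREADS = {1, 2, 4, 6, 8, 10, 12, 16};
    // Milliseconds to complete the whole job, copied from the posted table.
    static final long[] MULTI_JVM_MS = {72997, 33147, 16229, 13088, 9775, 8721, 7290, 9365};
    static final long[] SHARED_MS = {70883, 48762, 46828, 47240, 47020, 50132, 49002, 47099};

    /** Speedup of the given row relative to the one-thread row of the same column. */
    static double speedup(long[] timesMs, int row) {
        return (double) timesMs[0] / timesMs[row];
    }

    public static void main(String[] args) {
        for (int i = 0; i < THREADS.length; i++) {
            System.out.printf("%2d threads: multi-jvm %.2fx, shared %.2fx%n",
                    THREADS[i], speedup(MULTI_JVM_MS, i), speedup(SHARED_MS, i));
        }
    }
}
```

On these figures the shared disk-based searcher stays between roughly 1.4x and 1.5x from two threads onward, while the multi-JVM run reaches about 10x at 12 threads.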
The test consists of about 1,500 Boolean OR queries with 1-10
PhraseQueries each, with 1-20 Terms per PhraseQuery. I'm using a
HitCollector to count the hits, so I'm not retrieving any results.
The index is about 5GB and 20 million documents.
This is running on an 8 x quad-core Opteron machine with plenty of RAM to
spare.
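For anyone wanting to reproduce the comparison, the shape of such a test can be sketched as follows. This is not the actual test code: Searcher is a hypothetical stand-in interface rather than a Lucene type, and the only difference between the "shared" and "per-thread" modes is whether the supplier hands every worker the same instance or a fresh one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

final class SearchBench {
    /** Hypothetical stand-in for a searcher that only counts hits. */
    interface Searcher {
        int countHits(String query);
    }

    /** Runs every query once across the workers; returns {elapsedMillis, totalHits}. */
    static long[] run(List<String> queries, int threads, Supplier<Searcher> searcherForWorker) {
        AtomicInteger next = new AtomicInteger();
        AtomicLong totalHits = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pending.add(pool.submit(() -> {
                Searcher searcher = searcherForWorker.get();
                int i;
                // Workers claim queries from a shared counter until none remain.
                while ((i = next.getAndIncrement()) < queries.size()) {
                    totalHits.addAndGet(searcher.countHits(queries.get(i)));
                }
            }));
        }
        try {
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new long[] {elapsedMs, totalHits.get()};
    }

    public static void main(String[] args) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < 1500; i++) {
            queries.add("query-" + i);
        }
        Searcher shared = String::length; // placeholder work, not a real search
        long[] sharedRun = run(queries, 4, () -> shared);
        long[] perThreadRun = run(queries, 4, () -> String::length);
        System.out.println("shared:     " + sharedRun[0] + " ms, hits " + sharedRun[1]);
        System.out.println("per-thread: " + perThreadRun[0] + " ms, hits " + perThreadRun[1]);
    }
}
```

Checking that both modes report the same total hit count is a cheap sanity test that the two runs did the same work before comparing their timings.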
Any idea why I would see this behaviour?
Thanks,
Dmitri
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]