I re-ran the no-readonly ram tests:
threads  thread  shared
      1   64043   53610
      2   26999   25260
      3   27173   17265
      4   22205   13222
      5   20795   11098
      6   17593    9852
      7   17163    8987
      8   17275    9052
      9   19392   10266
     10   27809   10397
     11   25987   10724
     12   26550   10832
The pattern is the same, but the difference at 4 and 6 threads is less
pronounced. The earlier result was probably just a hiccup (I'm not using
terribly sophisticated test methodology here). It's also possible I didn't
give the JVM enough RAM in the earlier run (this run was with 16GB, just
to be on the safe side).
Still, it looks like the extra resource management overhead for
ram-thread outweighs whatever lock contention ram-shared introduces.
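
As a rough sanity check on that reading, here is a minimal sketch that
turns the timings above into speedups relative to the single-threaded run.
It assumes the first data column is ram-thread and the second is
ram-shared; the class and method names are made up for illustration.

```java
import java.util.Locale;

public class RamScaling {
    // Timings in ms, copied from the re-run table above.
    // Assumption: first data column = ram-thread, second = ram-shared.
    static final int[] THREADS = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
    static final long[] RAM_THREAD = {64043, 26999, 27173, 22205, 20795, 17593,
                                      17163, 17275, 19392, 27809, 25987, 26550};
    static final long[] RAM_SHARED = {53610, 25260, 17265, 13222, 11098, 9852,
                                      8987, 9052, 10266, 10397, 10724, 10832};

    /** Speedup of run i relative to the single-threaded run of the same column. */
    static double speedup(long[] timings, int i) {
        return (double) timings[0] / timings[i];
    }

    /** Index of the fastest (smallest) timing in the column. */
    static int fastestIndex(long[] timings) {
        int best = 0;
        for (int i = 1; i < timings.length; i++) {
            if (timings[i] < timings[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("threads  thread-speedup  shared-speedup  thread/shared");
        for (int i = 0; i < THREADS.length; i++) {
            // How much slower ram-thread is than ram-shared at this thread count.
            double ratio = (double) RAM_THREAD[i] / RAM_SHARED[i];
            System.out.printf(Locale.ROOT, "%7d  %14.2f  %14.2f  %13.2f%n",
                    THREADS[i], speedup(RAM_THREAD, i), speedup(RAM_SHARED, i), ratio);
        }
        System.out.println("fastest ram-thread run: "
                + THREADS[fastestIndex(RAM_THREAD)] + " threads");
        System.out.println("fastest ram-shared run: "
                + THREADS[fastestIndex(RAM_SHARED)] + " threads");
    }
}
```

On these numbers both columns bottom out at 7 threads, at roughly 3.7x
for ram-thread and 6.0x for ram-shared.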
I'm rerunning everything with readonly set and with NIO; I'll post the
results once it's done.
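
One way to make re-runs less sensitive to one-off hiccups is to warm up
first and report the median of several runs. The following is only a
sketch of that idea using plain JDK classes, not the harness used for the
numbers in this thread. All names are hypothetical, and the `job`
stands in for one worker's share of the queries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RepeatedTiming {

    /** Runs job once on each of `threads` workers; returns wall-clock ms. */
    static long timeOnce(int threads, Runnable job) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                tasks.add(() -> { job.run(); return null; });
            }
            long start = System.nanoTime();
            // invokeAll blocks until every worker has finished.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get(); // surfaces any failure from a worker
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Median of the samples (mean of the two middle values if even). */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    /** Discards `warmups` runs, then returns the median of `runs` timed runs. */
    static long medianTime(int threads, int warmups, int runs, Runnable job) {
        for (int i = 0; i < warmups; i++) {
            timeOnce(threads, job);
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            samples[i] = timeOnce(threads, job);
        }
        return median(samples);
    }

    public static void main(String[] args) {
        // Placeholder CPU-bound work; a real test would run queries here.
        Runnable placeholderJob = () -> {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) {
                sum += i % 7;
            }
            if (sum < 0) {
                System.out.println(sum); // keeps the loop from being optimized away
            }
        };
        for (int threads : new int[] {1, 2, 4}) {
            System.out.println(threads + " threads: "
                    + medianTime(threads, 2, 5, placeholderJob) + " ms (median of 5)");
        }
    }
}
```

The median is less affected by a single slow run than the mean, which is
why it is used here.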
Cheers,
Dmitri
On Tue, Nov 11, 2008 at 5:40 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> Nice results, thanks!
>
> The poor disk-based scaling may be fixed by NIOFSDirectory, if you are on
> Unix. If you are on Windows it won't help (and will likely be worse than
> FSDirectory), because of an apparent bug in Sun's JVM on Windows whereby
> NIO positional file reads seem to share a lock under the hood.
>
> The poor ram-thread result for 4 & 6 threads is odd. Those numbers ought
> to be at least as good as ram-shared. Is it possible those columns are
> swapped? Because the ram-shared case should have been hurt by using a
> non-read-only IndexReader.
>
> Mike
>
> Dmitri Bichko wrote:
>
>> Hi,
>>
>> I'm pretty new to Lucene, so please bear with me if this has been
>> covered before.
>>
>> The wiki suggests sharing a single IndexSearcher between threads for
>> best performance
>> (http://wiki.apache.org/lucene-java/ImproveSearchingSpeed). I've
>> tested running the same set of queries with: multiple threads sharing
>> the same searcher, with a separate searcher for each thread, both
>> shared/private with a RAMDirectory in-memory index, and (just for fun)
>> in multiple JVMs running concurrently (the results are in milliseconds
>> to complete the whole job):
>>
>> threads  multi-jvm  shared  per-thread  ram-shared  ram-thread
>>       1      72997   70883       72573       60308       60012
>>       2      33147   48762       35973       25498       25734
>>       4      16229   46828       21267       13127       27164
>>       6      13088   47240       14028        9858       29917
>>       8       9775   47020       10983        8948       10440
>>      10       8721   50132       11334        9587       11355
>>      12       7290   49002       11798        9832
>>      16       9365   47099       12338       11296
>>
>> The shared searcher indeed behaves better with a ram-based index, but
>> what's going on with the disk-based one? It's basically not scaling
>> beyond two threads. Am I just doing something completely wrong here?
>>
>> The test consists of about 1,500 Boolean OR queries with 1-10
>> PhraseQueries each, with 1-20 Terms per PhraseQuery. I'm using a
>> HitCollector to count the hits, so I'm not retrieving any results.
>> The index is about 5GB and 20 million documents.
>>
>> This is running on an 8 x quad-core Opteron machine with plenty of RAM to
>> spare.
>>
>> Any idea why I would see this behaviour?
>>
>> Thanks,
>> Dmitri
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]