I re-ran the no-readonly ram tests:

threads  thread  shared
1        64043   53610
2        26999   25260
3        27173   17265
4        22205   13222
5        20795   11098
6        17593    9852
7        17163    8987
8        17275    9052
9        19392   10266
10       27809   10397
11       25987   10724
12       26550   10832

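One rough way to read these numbers is as speedup over the single-threaded run, T(1)/T(n), and as efficiency, speedup divided by thread count. Below is a minimal Java sketch, assuming the two columns above are elapsed times in milliseconds as in the earlier table. The class and method names are made up for illustration and are not part of the actual test code.

```java
import java.util.Locale;

// Hypothetical helper for reading the table above; not part of the real test harness.
class ScalingReport {
    // Thread counts and elapsed times copied from the table above (assumed to be ms).
    static final int[] THREADS = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
    static final long[] THREAD = {64043, 26999, 27173, 22205, 20795, 17593,
                                  17163, 17275, 19392, 27809, 25987, 26550};
    static final long[] SHARED = {53610, 25260, 17265, 13222, 11098, 9852,
                                  8987, 9052, 10266, 10397, 10724, 10832};

    // Speedup relative to the single-threaded run: T(1) / T(n).
    static double speedup(long baselineMs, long elapsedMs) {
        return (double) baselineMs / elapsedMs;
    }

    // Parallel efficiency: speedup divided by thread count (1.0 means perfect scaling).
    static double efficiency(long baselineMs, long elapsedMs, int threads) {
        return speedup(baselineMs, elapsedMs) / threads;
    }

    // Index of the fastest (smallest) time in a column.
    static int fastest(long[] times) {
        int best = 0;
        for (int i = 1; i < times.length; i++) {
            if (times[i] < times[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("threads  thread-speedup  shared-speedup");
        for (int i = 0; i < THREADS.length; i++) {
            System.out.printf(Locale.ROOT, "%-7d  %14.2f  %14.2f%n",
                    THREADS[i],
                    speedup(THREAD[0], THREAD[i]),
                    speedup(SHARED[0], SHARED[i]));
        }
    }
}
```

On the numbers above, both columns are fastest at 7 threads, where "shared" is just under 6x its single-threaded time and "thread" is under 4x. A single run per cell is noisy, so this is only a way of eyeballing the trend.
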
The pattern is the same, but the difference at 4 and 6 is less
pronounced. The earlier gap was probably just a hiccup (I'm not using
terribly sophisticated test methodology here); it's also possible I
didn't give the JVM enough RAM in the earlier run (this run was with
16GB, just to be on the safe side).

Still, it looks like the extra resource management overhead for
ram-thread outweighs whatever lock contention ram-shared introduces.

I'm rerunning everything with readonly set and NIO; I'll post the
results once it's done.

Cheers,
Dmitri

On Tue, Nov 11, 2008 at 5:40 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:
>
> Nice results, thanks!
>
> The poor disk-based scaling may be fixed by NIOFSDirectory, if you are on
> Unix. If you are on Windows it won't help (and will likely be worse than
> FSDirectory), because of an apparent bug in Sun's JVM on Windows whereby
> NIO positional file reads seem to share a lock under the hood.
>
> The poor ram-thread result for 4 & 6 threads is odd. Those numbers ought
> to be at least as good as ram-shared. Is it possible those columns are
> swapped? Because the ram-shared case should have been hurt by using a
> non-read-only IndexReader.
>
> Mike
>
> Dmitri Bichko wrote:
>
>> Hi,
>>
>> I'm pretty new to Lucene, so please bear with me if this has been
>> covered before.
>>
>> The wiki suggests sharing a single IndexSearcher between threads for
>> best performance
>> (http://wiki.apache.org/lucene-java/ImproveSearchingSpeed).
>> I've tested running the same set of queries with: multiple threads
>> sharing the same searcher, with a separate searcher for each thread,
>> both shared/private with a RAMDirectory in-memory index, and (just
>> for fun) in multiple JVMs running concurrently (the results are in
>> milliseconds to complete the whole job):
>>
>> threads  multi-jvm  shared  per-thread  ram-shared  ram-thread
>> 1        72997      70883   72573       60308       60012
>> 2        33147      48762   35973       25498       25734
>> 4        16229      46828   21267       13127       27164
>> 6        13088      47240   14028        9858       29917
>> 8         9775      47020   10983        8948       10440
>> 10        8721      50132   11334        9587       11355
>> 12        7290      49002   11798        9832
>> 16        9365      47099   12338       11296
>>
>> The shared searcher indeed behaves better with a ram-based index, but
>> what's going on with the disk-based one? It's basically not scaling
>> beyond two threads. Am I just doing something completely wrong here?
>>
>> The test consists of about 1,500 Boolean OR queries with 1-10
>> PhraseQueries each, with 1-20 Terms per PhraseQuery. I'm using a
>> HitCollector to count the hits, so I'm not retrieving any results.
>> The index is about 5GB and 20 million documents.
>>
>> This is running on an 8 x quad-core Opteron machine with plenty of RAM
>> to spare.
>>
>> Any idea why I would see this behaviour?
>>
>> Thanks,
>> Dmitri
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]