This is interesting. Very large heaps can sometimes cause expensive GC cycles (see "Can heap be too big?" - http://www.javaperformancetuning.com/news/qotm045.shtml), and I think different memory allocation patterns between 2.0 and 2.2 could play a part too, so it would be interesting to see the numbers with smaller heap sizes.

A few more questions:
- Reader reuse: Are all searches of the same thread sharing searchers/readers? Are different threads sharing searchers/readers?
- What happens with a single thread?
- Is this degradation also visible for single queries, or are some queries faster in 2.0 and some in 2.2?

Thanks,
Doron

Otis Gospodnetic <[EMAIL PROTECTED]> wrote on 06/03/2007 08:16:16:

> Hi,
>
> I'm doing some Lucene search benchmarking (got to love massive query
> logs :)) and have 2 questions:
>
> 1) Has anyone compared Lucene 2.0 and 2.2-dev? My benchmarks found
> 2.2-dev (freshly baked) to be somewhat slower than 2.0, despite all
> those performance improvements (see CHANGES.txt)... Has anyone else
> done the comparison? My queries are a mixture of 2-3 required
> keywords (majority) and phrase queries with 2-3 keywords.
>
> To give you an idea about how much slower 2.2-dev is for me, here
> are some counts for queries I considered slow (> 1s latency) during
> my benchmark with 8 concurrent search threads and then 64 threads:
>
>
> $ grep -c SLOW 5-shard-log-2.0/8.log
> 1183
> $ grep -c SLOW 5-shard-log-2.2-dev/8.log
> 5479
>
> $ grep -c SLOW 5-shard-log-2.0/64.log
> 28657
> $ grep -c SLOW 5-shard-log-2.2-dev/64.log
> 33459
>
> This is of a total of 100K queries.
>
> 2) My benchmark was against 5 optimized compound Lucene indices,
> about 9GB each, on a box with 32GB of RAM and several CPUs. I gave
> the JVM 22GB with Xms and Xmx. However, I am wondering if giving it
> that much is actually smart. While I'm letting JVM use more RAM,
> I'm taking it away from the OS for FS caching. So, I'm now thinking
> about running the same benchmark, but with a smaller max heap. But
> how much should I give it? I'm thinking about adding up sizes of
> all .tii files, adding some padding for the JVM, GC, etc., and using
> that. Is there anything else I should consider here?
>
> So I looked at one of the .cfs files:
>
> _0.f0: 11164467 bytes
> ... other fields, same size, of course
> _0.fdt: 381343723 bytes
> _0.fdx: 89315736 bytes
> _0.fnm: 78 bytes
> _0.frq: 4591955197 bytes
> _0.prx: 4242807266 bytes
> _0.tii: 11498861 bytes
> _0.tis: 829868070 bytes
>
>
> Here, the .tii file is only about 11 MB. That looks awfully small!
> There is no way 5 x 11 MB + padding will be enough. Should I be
> adding the size of some other file(s)? .tis perhaps?
>
> Thanks,
> Otis
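
To put the quoted figures side by side, here is a minimal Python sketch of the arithmetic. The counts and the .tii size are copied from Otis's message; the function names (slow_fraction, slowdown_ratio, total_tii_mib) are hypothetical helpers, and two things are assumptions rather than facts from the thread: that each run issued the full 100K queries, and that all 5 shards have a .tii of about the same size as the one listed.

```python
# Slow-query counts (> 1s latency) copied from the grep output in the
# quoted message, keyed by (Lucene version, number of search threads).
SLOW_COUNTS = {
    ("2.0", 8): 1183,
    ("2.2-dev", 8): 5479,
    ("2.0", 64): 28657,
    ("2.2-dev", 64): 33459,
}

# Assumption: every run issued the full 100K queries.
TOTAL_QUERIES = 100_000

# Size of _0.tii from the .cfs listing.
# Assumption: the other shards have a .tii of similar size.
TII_BYTES = 11_498_861
SHARDS = 5


def slow_fraction(count, total=TOTAL_QUERIES):
    """Share of queries flagged SLOW, as a fraction of all queries."""
    return count / total


def slowdown_ratio(threads):
    """Slow-query count of 2.2-dev divided by that of 2.0 for one thread count."""
    return SLOW_COUNTS[("2.2-dev", threads)] / SLOW_COUNTS[("2.0", threads)]


def total_tii_mib(tii_bytes=TII_BYTES, shards=SHARDS):
    """Combined .tii size across all shards, in MiB."""
    return tii_bytes * shards / (1024 * 1024)


if __name__ == "__main__":
    for threads in (8, 64):
        for version in ("2.0", "2.2-dev"):
            count = SLOW_COUNTS[(version, threads)]
            print(f"{threads:>2} threads, {version:<7}: "
                  f"{slow_fraction(count):.2%} slow")
        print(f"{threads:>2} threads: 2.2-dev/2.0 slow-query ratio = "
              f"{slowdown_ratio(threads):.2f}")
    print(f"Sum of .tii sizes across {SHARDS} shards: "
          f"{total_tii_mib():.1f} MiB")
```

Under those assumptions, 2.2-dev shows roughly 4.6 times as many slow queries as 2.0 at 8 threads but only about 1.17 times as many at 64 threads, where both versions are slow for around 30% of queries. The .tii files add up to only about 55 MiB, which matches Otis's "5 x 11 MB" estimate and says nothing by itself about how much heap the searchers actually need.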