This is interesting.

Very large heaps can sometimes cause expensive GC cycles (see "Can heap be
too big?" - http://www.javaperformancetuning.com/news/qotm045.shtml), and
I think different memory allocation patterns in 2.0 and 2.2 could also play
a part, so it would be interesting to see the numbers with smaller heap
sizes.
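
To make the heap vs. FS cache trade-off concrete, here is a rough
back-of-the-envelope sketch using the figures from your message (32GB of
RAM, a 22GB heap, 5 indices of about 9GB each). The function name and the
smaller heap sizes are made up for illustration, and it ignores JVM
non-heap memory and other processes, so the real numbers would be somewhat
lower:

```python
# Back-of-the-envelope sketch: how much RAM is left for the OS file-system
# cache once the JVM heap is taken out. Ignores JVM non-heap memory and
# other processes (assumption), so the real figure is somewhat lower.

def fs_cache_budget(ram_gb, heap_gb, index_sizes_gb):
    """Return (RAM left for the OS cache in GB, fraction of the index that fits)."""
    left_gb = max(ram_gb - heap_gb, 0)
    total_index_gb = sum(index_sizes_gb)
    fraction = min(left_gb / total_index_gb, 1.0) if total_index_gb else 1.0
    return left_gb, fraction


if __name__ == "__main__":
    indices = [9] * 5  # five indices of about 9 GB each, as in the quoted message
    # 22 GB is the current setting; the smaller heaps are hypothetical candidates.
    for heap in (22, 8, 4, 2):
        left, frac = fs_cache_budget(32, heap, indices)
        print(f"heap={heap:>2} GB  left for FS cache={left:>2} GB  "
              f"index fraction cacheable={frac:.0%}")
```

With the 22GB heap this leaves about 10GB for the OS to cache roughly 45GB
of index files, which is one more reason to try smaller heaps.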

A few more questions:
- Reader reuse: Do all searches in the same thread share
searchers/readers? Do different threads share searchers/readers?
- What happens with a single thread?
- Is this degradation also visible for individual queries, or are some
queries faster in 2.0 and others in 2.2?
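
For the last question, something along these lines could help. It is only
a sketch: it assumes you can extract a per-query latency (in ms) from each
run into a dict keyed by query string, and the function name, the 10%
tolerance and the sample data are all made up for illustration:

```python
# Sketch: classify each query seen in both runs as slower, faster, or about
# the same in the new run. Inputs are hypothetical dicts of query -> latency
# in milliseconds; how they are extracted from the benchmark logs is left open.

def compare_latencies(old_ms, new_ms, tolerance=0.10):
    """Compare per-query latencies; differences within `tolerance` count as 'same'."""
    result = {"slower": [], "faster": [], "same": []}
    for query in sorted(old_ms.keys() & new_ms.keys()):
        old, new = old_ms[query], new_ms[query]
        if new > old * (1 + tolerance):
            result["slower"].append(query)
        elif new < old * (1 - tolerance):
            result["faster"].append(query)
        else:
            result["same"].append(query)
    return result


if __name__ == "__main__":
    # Made-up sample latencies, not real measurements.
    run_20 = {"+foo +bar": 120, '"foo bar baz"': 900, "+a +b +c": 300}
    run_22 = {"+foo +bar": 125, '"foo bar baz"': 1400, "+a +b +c": 200}
    for label, queries in compare_latencies(run_20, run_22).items():
        print(label, len(queries), queries)
```

If nearly all queries land in the "slower" bucket, that points at something
global (GC, caching); if it is a mix, specific query types are more likely
to blame.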

Thanks,
Doron

Otis Gospodnetic <[EMAIL PROTECTED]> wrote on 06/03/2007 08:16:16:

> Hi,
>
> I'm doing some Lucene search benchmarking (got to love massive query
> logs :)) and have 2 questions:
>
> 1) Has anyone compared Lucene 2.0 and 2.2-dev?  My benchmarks found
> 2.2-dev (freshly baked) to be somewhat slower than 2.0, despite all
> those performance improvements (see CHANGES.txt)... Has anyone else
> done the comparison?  My queries are a mixture of 2-3 required
> keywords (majority) and phrase queries with 2-3 keywords.
>
> To give you an idea about how much slower 2.2-dev is for me, here
> are some counts for queries I considered slow (> 1s latency) during
> my benchmark with 8 concurrent search threads and then 64 threads:
>
>
> $ grep -c SLOW 5-shard-log-2.0/8.log
> 1183
> $ grep -c SLOW 5-shard-log-2.2-dev/8.log
> 5479
>
> $ grep -c SLOW 5-shard-log-2.0/64.log
> 28657
> $ grep -c SLOW 5-shard-log-2.2-dev/64.log
> 33459
>
> This is of a total of 100K queries.
>
> 2) My benchmark was against 5 optimized compound Lucene indices,
> about 9GB each, on a box with 32GB of RAM and several CPUs.  I gave
> the JVM 22GB with Xms and Xmx.  However, I am wondering if giving it
> that much is actually smart.  While I'm letting JVM use more RAM,
> I'm taking it away from the OS for FS caching.  So, I'm now thinking
> about running the same benchmark, but with a smaller max heap.  But
> how much should I give it?  I'm thinking about adding up sizes of
> all .tii files, adding some padding for the JVM, GC, etc., and using
> that.  Is there anything else I should consider here?
>
> So I looked at one of the .cfs files:
>
> _0.f0: 11164467 bytes
> ... other fields, same size, of course
> _0.fdt: 381343723 bytes
> _0.fdx: 89315736 bytes
> _0.fnm: 78 bytes
> _0.frq: 4591955197 bytes
> _0.prx: 4242807266 bytes
> _0.tii: 11498861 bytes
> _0.tis: 829868070 bytes
>
>
> Here, the .tii file is only about 11 MB.  That looks awfully small!
> There is no way 5 x 11 MB + padding will be enough.  Should I be
> adding the size of some other file(s)?  .tis perhaps?
>
> Thanks,
> Otis

