Re: [pylucene-dev] Help understanding "performance" issues.

Rune Hansen Thu, 22 Feb 2007 04:32:28 -0800


On 21. feb. 2007, at 19.56, Andi Vajda wrote:

On Wed, 21 Feb 2007, Rune Hansen wrote:
I've set up a MultiSearcher* inside a patched cherrypy 3.0 server (patched with PythonThreads). Using a Queue, I've created searchables (MultiSearcher spanning 10 indexes with approximately 900,000 documents combined) which are available through cherrypy's .thread_data facility for the server's 10 threads.
When timing a search of medium complexity, one searchable returns after ~0.3 seconds. The optimum seems to be to create two searchables. Increasing the number of searchables to three or more does not produce higher throughput; it actually slows all the requests down. If I reduce the number of searchables to one, it produces about half the throughput of two searchables.
For example:
ab -n100 -c8 on one searchable available to 10 threads: Requests per second: 1.66 [#/sec] (mean)
ab -n100 -c8 on two searchables available to 10 threads: Requests per second: 3.05 [#/sec] (mean)
ab -n100 -c8 on three searchables available to 10 threads: Requests per second: 2.98 [#/sec] (mean)
ab -n100 -c8 on four searchables available to 10 threads: Requests per second: 2.95 [#/sec] (mean)
(average of 5 runs on each)
I have a hard time understanding this behavior. Is it because of how Lucene accesses an IndexReader? Is it because of hardware limitations? Can it be programmed "smarter" at my end?
I'm not sure. There have been many threads about this on java-[EMAIL PROTECTED] A bunch of work was done in the area of locks and indexes in Lucene 2.1, so I'd try to upgrade to PyLucene 2.1 as well.
Andi..


Hi Andi,

I compiled PyLucene 2.1 with gcj-3.4.6 -linux (tried 4.1.2 but that didn't work).


I get a slightly higher throughput:
One searchable:     Requests per second:    2.17 [#/sec] (mean)
Two searchables:    Requests per second:    3.57 [#/sec] (mean)
Three searchables:  Requests per second:    3.46 [#/sec] (mean)
...and so on.
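
For reference, the kind of Queue-backed pool of searchables described in the quoted message can be sketched roughly as below. This is a minimal sketch and not the actual server code: SearchablePool, FakeSearcher and run_demo are hypothetical names, and FakeSearcher stands in for a real PyLucene MultiSearcher so that the example runs without PyLucene installed.

```python
import queue
import threading
from contextlib import contextmanager


class SearchablePool:
    """Fixed-size pool of searcher objects shared by many worker threads."""

    def __init__(self, factory, size):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(factory())

    @contextmanager
    def borrow(self, timeout=None):
        # Blocks while every searchable is checked out by another thread.
        searcher = self._q.get(timeout=timeout)
        try:
            yield searcher
        finally:
            # Always hand the searchable back, even if the search raised.
            self._q.put(searcher)

    def available(self):
        return self._q.qsize()


class FakeSearcher:
    """Hypothetical stand-in for a PyLucene MultiSearcher (assumption)."""

    def search(self, query):
        return ["hit for %s" % query]


def run_demo(pool_size=2, n_threads=10):
    """Run n_threads workers against pool_size searchables."""
    pool = SearchablePool(FakeSearcher, pool_size)
    results = []
    lock = threading.Lock()

    def worker(i):
        with pool.borrow() as searcher:
            hits = searcher.search("q%d" % i)
        with lock:
            results.append(hits)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, pool.available()
```

With a pool like this, at most pool_size searches run at once no matter how many server threads exist, which is the knob being varied in the numbers above.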

Experiment one:

Two search servers with a load balancer in front. The search servers create two searchables each from the _same_ index directory. The dispatcher and the two search servers are running on the same machine (2x3.2GHz Xeon/2GB RAM).

Requests per second:    3.98 [#/sec] (mean)
-> A negligible speedup.

Experiment two:

Two search servers with a load balancer in front. The search servers create two searchables each from _different_ index directories. The dispatcher and the two search servers are running on the same machine (2x3.2GHz Xeon/2GB RAM).

Requests per second:    3.98 [#/sec] (mean)
-> Same negligible speedup.

Experiment three:

Two search servers, each on a separate but otherwise identical machine (2x3.2GHz Xeon/2GB RAM), with a load balancer in front.

Requests per second:    7.12 [#/sec] (mean)
-> As expected
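
To put numbers on "negligible" and "as expected", the ratios can be worked out as in the short sketch below. It assumes the right baseline is the single-server result with two searchables under PyLucene 2.1 (3.57 requests per second), since each search server in the experiments also uses two searchables; the variable names are only illustrative.

```python
# Baseline: one server, two searchables, PyLucene 2.1 (requests per second).
BASELINE_RPS = 3.57

# Measured throughput for the three experiments (requests per second).
experiments = {
    "one: same machine, same index": 3.98,
    "two: same machine, different indexes": 3.98,
    "three: two machines": 7.12,
}

# Speedup relative to the baseline, rounded to two decimals.
speedups = {name: round(rps / BASELINE_RPS, 2) for name, rps in experiments.items()}

for name, factor in speedups.items():
    print("Experiment %s: %.2fx" % (name, factor))
```

Under that assumption, the two same-machine setups gain about 11%, while the second machine comes close to doubling throughput.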

What I have learned so far is:
1) Timing with ab works :)
2) Hardware matters, a lot!
3) IndexReader may or may not be able to do simultaneous access to an index directory

The conclusion is that I haven't been able to draw a conclusion from my tests. If I had had access to a quad server with 4GB RAM, I might have been able to say something about Lucene's efficiency.


regards
/rune

