Hi Scott,

The unit tests are also a good performance test. But to compare your directory with another one, be sure to:

- use a defined directory implementation to compare against. The most performant Lucene one is MMapDirectory (-Dtests.directory=MMapDirectory), so compare your results with that one. If you don't define a directory, it uses RAMDirectory in most cases.
- use a defined random seed when comparing results. Lucene tests randomize a lot. Randomization can be prevented by explicitly stating a given random seed (one example is given on startup). Also run "ant test-help" to get more usage help.
- to do more stress testing, which will create larger indexes: -Dtests.nightly=true
- use a single JVM: -Dtests.jvms=1
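
Put together, the options above could be combined into one invocation roughly like the sketch below. The seed property name (tests.seed) is not spelled out above and is given here from memory of the Lucene test framework, so check it against "ant test-help"; the seed value DEADBEEF is only a placeholder for whatever seed is printed at startup of the run you want to repeat. The script only prints the command so it can be reviewed before being run from the Lucene checkout.

```shell
#!/bin/sh
# Placeholder seed (an assumption): replace it with the seed printed at
# startup of the run you want to reproduce.
SEED="DEADBEEF"

# Directory implementation to test against; swap in your own Directory
# subclass for the second run of the comparison.
DIRECTORY="MMapDirectory"

# Fixed directory, fixed seed, larger (nightly) indexes, single JVM.
ANT_ARGS="-Dtests.directory=$DIRECTORY -Dtests.seed=$SEED -Dtests.nightly=true -Dtests.jvms=1"

# Print the full command instead of executing it.
echo "ant test $ANT_ARGS"
```

Running the printed command once per directory implementation, with the same seed each time, keeps the randomized parts of the tests identical between the two runs.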

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Scott Schneider [mailto:scott_schnei...@symantec.com]
> Sent: Friday, January 24, 2014 2:41 AM
> To: java-user@lucene.apache.org
> Subject: RE: Performance testing Lucene
>
> Thanks! I ran this Directory subclass through the Lucene unit tests (and
> found 3 race conditions). Unit tests are wonderful.
>
> Scott
>
> > -----Original Message-----
> > From: Michael McCandless [mailto:luc...@mikemccandless.com]
> > Sent: Wednesday, January 22, 2014 7:05 AM
> > To: Lucene Users
> > Subject: Re: Performance testing Lucene
> >
> > All the source code for the nightly Lucene perf tests I run (
> > http://people.apache.org/~mikemccand/lucenebench/ ) are here:
> > https://code.google.com/a/apache-extras.org/p/luceneutil/
> >
> > These are also the scripts I use for A/B performance tests for a new
> > patch.
> >
> > It's somewhat tricky getting those Python scripts set up to run ...
> > but I think it'd be a good way to smoke test your new Directory.
> >
> > The queries are "synthetic"; it's a real problem, not having a real
> > world, biggish corpus plus real queries, for better performance
> > testing...
> >
> > Mike McCandless
> >
> > http://blog.mikemccandless.com
> >
> > On Mon, Jan 20, 2014 at 11:22 PM, Scott Schneider
> > <scott_schnei...@symantec.com> wrote:
> > > Hello,
> > >
> > > Would you folks mind giving me a few tips on performance testing
> > > Lucene? I want to test the performance impact of a Directory
> > > subclass.
> > >
> > > What is a good testing tool to use? I don't see a great way to get
> > > SolrMeter to run the max # updates/minute and measure throughput
> > > that way. When I set the # updates/minute to a large #, SolrMeter
> > > logs NullPointerExceptions. (I assume these are within SolrMeter,
> > > as I don't see errors in Solr.)
> > > Mike McCandless's nightly Lucene performance tests look good,
> > > though I've only just started looking at it.
> > >
> > > Are there any particularly standard or good test sets? I'd like to
> > > test 3 scenarios: indexing only, querying only, and indexing plus
> > > querying. McCandless's indexing test uses wikipedia, which seems
> > > great, but he has a slew of tests that are each specific to some
> > > querying feature. I'd like a single, general query test. It's not
> > > hard to come up with a decent set of queries, but I'd really like
> > > something representative of real world queries. If there is some
> > > standard set of commonly used queries, that would be ideal.
> > >
> > > Thanks!
> > >
> > > Scott
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org