This may or may not help, but here goes :) When I was running performance tests I took a look at the simple post tool that comes with the Solr examples.
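For reference, the XML files that tool posts use Solr's update format: an <add> element wrapping <doc> elements, each with <field name="..."> children. A minimal Python sketch of producing such files in batches might look like the following. The function names, field names, and file-name pattern are hypothetical and not part of Solr; the fields must match whatever your own schema.xml defines.

```python
import xml.etree.ElementTree as ET


def build_add_xml(records):
    """Return a Solr <add> update message (as a string) for a list of dicts."""
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, <, > when serializing
    return ET.tostring(add, encoding="unicode")


def batches(records, size):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def write_batches(records, size, prefix="batch"):
    """Write one XML file per batch; return the file names (hypothetical naming)."""
    names = []
    for i, chunk in enumerate(batches(records, size)):
        name = f"{prefix}_{i:04d}.xml"
        with open(name, "w", encoding="utf-8") as out:
            out.write(build_add_xml(chunk))
        names.append(name)
    return names
```

With a batch size somewhere in the 60k-80k range, each call to write_batches would produce files of roughly the size mentioned in this message, which can then be handed to the post tool one after another.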
First I changed my schema.xml to fit my needs, then deleted the old index so Solr created a blank one on startup. Then I had a process chew on my data and spit out XML files formatted like the ones the SimplePostTool example uses. Next I used the SimplePostTool to post the XML files to Solr (60k-80k records per XML file). Each file took only a couple of minutes to index this way. Commit and optimize after that took less than 10 minutes, and after about 2.5 hours I had indexed just under 8 million records. This was on a 4-year-old single-core laptop using Resin 3 as my servlet container.

Hope this helps.

On Fri, Sep 25, 2009 at 3:51 AM, Lance Norskog <goks...@gmail.com> wrote:

> In "top", press the '1' key. This will give a list of the CPUs and how
> much load is on each. The display is otherwise a little weird for
> multi-cpu machines. But don't be surprised when Solr is I/O bound. The
> biggest fanciest RAID is often a better investment than CPUs. On one
> project we bought low-end rack servers come with 6-8 disk bays,
> filling them with 10k/15k RPM disks.
>
> On Wed, Sep 23, 2009 at 2:47 PM, Dan A. Dickey <dan.dic...@savvis.net>
> wrote:
>
> > On Friday 11 September 2009 11:06:20 am Dan A. Dickey wrote:
> > ...
> >> Our JBoss expert and I will be looking into why this might be occurring.
> >> Does anyone know of any JBoss related slowness with Solr?
> >> And does anyone have any other sort of suggestions to speed indexing
> >> performance? Thanks for your help all! I'll keep you up to date with
> >> further progress.
> >
> > Ok, further progress... just to keep any interested parties up to date
> > and for the record...
> >
> > I'm finding that using the "example" jetty setup (will be switching very
> > very soon to a "real" jetty installation) is about the fastest. Using
> > several processes to send posts to Solr helps a lot, and we're seeing
> > about 80 posts a second this way.
> >
> > We also stripped down JBoss to the bare bones and the Solr in it
> > is running nearly as fast - about 50 posts a second. It was our previous
> > JBoss configuration that was making it appear "slow" for some reason.
> >
> > We will be running more tests and spreading out the "pre-index" workload
> > across more machines and more processes. In our case we were seeing
> > the bottleneck being one machine running 18 processes.
> > The 2 quad core xeon system is experiencing about a 25% cpu load.
> > And I'm not certain, but I think this may be actually 25% of one of the 8
> > cores.
> > So, there's *lots* of room for Solr to be doing more work there.
> > -Dan
> >
> > --
> > Dan A. Dickey | Senior Software Engineer
> >
> > Savvis
> > 10900 Hampshire Ave. S., Bloomington, MN 55438
> > Office: 952.852.4803 | Fax: 952.852.4951
> > E-mail: dan.dic...@savvis.net
> >
>
> --
> Lance Norskog
> goks...@gmail.com