We didn't copy/paste the Solr3 config to Solr4. We started with the Solr4 config and only updated the new-searcher queries and a few other things.

There is no batching while updating/inserting documents in Solr3, is that correct? Committing 1000 documents in Solr3 takes 19 seconds, while in Solr4 it takes about 3-4 minutes. We noticed in the Solr4 logs that a commit only returns after a new searcher has been created across all nodes. This is possibly because waitSearcher=true by default in Solr4. This was not the case with Solr3, where commit would return without waiting for new searcher creation.

To improve performance with Solr4, we first changed commit=true to commit=false in the update URL and added an autoHardCommit setting in solrconfig.xml. This improved performance from 3-4 minutes to 1-2 minutes, but that is not good enough.

Then we changed the maxBufferedAddsPerServer value in the SolrCmdDistributor class from 10 to 1000, deployed this class in the $JETTY_TEMP_FOLDER/solr-webapp/webapp/WEB-INF/classes folder, and restarted the Solr4 nodes. But we still see the batch size of 10 being used. Did we change the correct variable/class?

Next, we will try using softCommit=true in the update URL and check whether it gives us the desired performance.

Thanks for looking into this. Appreciate your help.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tuesday, August 13, 2013 8:12 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr4 update and query performance question

1> That's hard-coded at present. There's anecdotal evidence that there are throughput improvements with larger batch sizes, but no action yet.

2> Yep, all searchers are also re-opened, caches re-warmed, etc.

3> Odd. I'm assuming your Solr3 was a master/slave setup? Seeing the queries would help diagnose this. Also, did you try to copy/paste the configuration from your Solr3 to Solr4? I'd start with the Solr4 configuration and copy/paste only the parts needed from your Solr3 setup.

Best
Erick

On Mon, Aug 12, 2013 at 11:38 AM, Joshi, Shital <shital.jo...@gs.com> wrote:

> Hi,
>
> We have a SolrCloud (4.4.0) cluster (5 shards and 2 replicas) on 10 boxes
> with about 450 mil documents (~90 mil per shard). We're loading 1000 or
> fewer documents in CSV format every few minutes. In Solr3, with 300 mil
> documents, it used to take 30 seconds to load 1000 documents, while in
> Solr4 it's taking up to 3 minutes to load 1000 documents. We're using
> custom sharding; we include the _shard_=shardid parameter in the update
> command. Upon looking at the Solr4 log files we found that:
>
> 1. Documents are added in a batch of 10 records. How do we increase
> this batch size from 10 to 1000 documents?
>
> 2. We do a hard commit after loading 1000 documents. For every hard
> commit, it refreshes the searcher on all nodes. Are all caches also
> refreshed when a hard commit happens? We're planning to change to soft
> commit and do an auto hard commit every 10-15 minutes.
>
> 3. We're not seeing improved query performance compared to Solr3.
> Queries which took 3-5 seconds in Solr3 (300 mil docs) are taking 20
> seconds with Solr4. We think this could be due to frequent hard commits
> and searcher refresh. Do you think that when we change to soft commit and
> increase the batch size, we will see better query performance?
>
> Thanks!
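
For reference, the update-URL parameters mentioned in this thread (commit, softCommit, waitSearcher and _shard_) can be combined roughly as in the Python sketch below. This is only an illustration under assumptions: the helper name build_update_url, the host, the collection name and the /update/csv handler path are hypothetical and are not taken from the poster's setup.

```python
from urllib.parse import urlencode


def build_update_url(base_url, collection, hard_commit=False,
                     soft_commit=False, wait_searcher=None, shard_id=None):
    """Build a CSV update URL with explicit commit-related parameters.

    base_url, collection and the /update/csv path are assumptions made
    for illustration; adjust them to the actual deployment.
    """
    # Always state commit explicitly, as the thread does with commit=false.
    params = {"commit": "true" if hard_commit else "false"}
    if soft_commit:
        # Request a soft commit instead of a hard commit.
        params["softCommit"] = "true"
    if wait_searcher is not None:
        # Controls whether the commit call blocks until the new searcher opens.
        params["waitSearcher"] = "true" if wait_searcher else "false"
    if shard_id is not None:
        # Custom sharding parameter mentioned in the original message.
        params["_shard_"] = shard_id
    return f"{base_url}/{collection}/update/csv?{urlencode(params)}"


# Hypothetical host, collection and shard names.
print(build_update_url("http://localhost:8983/solr", "collection1",
                       soft_commit=True, shard_id="shard1"))
```

Building the query string in one place makes it easier to compare timings for the different combinations (hard commit, soft commit, no explicit commit) without hand-editing the URL each time.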