1> That's hard-coded at present. There's anecdotal evidence that larger batch sizes improve throughput, but no action has been taken yet.

2> Yep, all searchers are also re-opened, caches re-warmed, etc.

3> Odd. I'm assuming your Solr3 was a master/slave setup? Seeing the queries would help diagnose this. Also, did you try to copy/paste the configuration from your Solr3 to Solr4? I'd start with the Solr4 configuration and copy/paste only the parts needed from your Solr3 setup.
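
On the commit question (2), the plan described in the quoted message, soft commits plus an auto hard commit every 10-15 minutes, could look roughly like the solrconfig.xml fragment below. Treat it as a sketch rather than a tuned configuration: the 10-minute hard commit interval comes from that plan, while the 60-second soft commit interval and the choice of openSearcher=false are assumptions for illustration.

```xml
<!-- Sketch only: the commit-related part of solrconfig.xml.
     Interval values are illustrative assumptions, not recommendations. -->
<updateHandler class="solr.DirectUpdateHandler2">

  <!-- Hard commit: flushes index changes to stable storage.
       With openSearcher=false it does not open a new searcher,
       so it does not trigger cache re-warming. -->
  <autoCommit>
    <maxTime>600000</maxTime>          <!-- 10 minutes, in milliseconds -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes newly added documents visible to searches.
       This interval (assumed here: 60 seconds) then controls how
       often a new searcher is opened. -->
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>

</updateHandler>
```

With something like this in place, the explicit hard commit sent after each 1000-document load would be dropped from the client side, so searchers are re-opened on the soft commit schedule rather than on every load.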

Best,
Erick

On Mon, Aug 12, 2013 at 11:38 AM, Joshi, Shital <shital.jo...@gs.com> wrote:

> Hi,
>
> We have SolrCloud (4.4.0) cluster (5 shards and 2 replicas) on 10 boxes
> with about 450 mil documents (~90 mil per shard). We're loading 1000 or
> less documents in CSV format every few minutes. In Solr3, with 300 mil
> documents, it used to take 30 seconds to load 1000 documents while in
> Solr4, it's taking up to 3 minutes to load 1000 documents. We're using
> custom sharding, we include _shard_=shardid parameter in update command.
> Upon looking at Solr4 log files we found that:
>
> 1. Documents are added in a batch of 10 records. How do we increase
> this batch size from 10 to 1000 documents?
>
> 2. We do hard commit after loading 1000 documents. For every hard
> commit, it refreshes searcher on all nodes. Are all caches also refreshed
> when hard commit happens? We're planning to change to soft commit and do
> auto hard commit every 10-15 minutes.
>
> 3. We're not seeing improved query performance compared to Solr3.
> Queries which took 3-5 seconds in Solr3 (300 mil docs) are taking 20
> seconds with Solr4. We think this could be due to frequent hard commits and
> searcher refresh. Do you think when we change to soft commit and increase
> the batch size, we will see better query performance?
>
> Thanks!