Hi,

We have a SolrCloud (4.4.0) cluster (5 shards and 2 replicas) on 10 boxes with about 450 million documents (~90 million per shard). We load 1000 or fewer documents in CSV format every few minutes. In Solr3, with 300 million documents, it took about 30 seconds to load 1000 documents, while in Solr4 it's taking up to 3 minutes to load the same number. We're using custom sharding: we include the _shard_=shardid parameter in the update command. Looking at the Solr4 log files, we found the following:
1. Documents are added in batches of 10 records. How do we increase this batch size from 10 to 1000 documents?

2. We do a hard commit after loading each set of 1000 documents. Every hard commit refreshes the searcher on all nodes. Are all caches also refreshed when a hard commit happens? We're planning to switch to soft commits and do an automatic hard commit every 10-15 minutes.

3. We're not seeing improved query performance compared to Solr3. Queries that took 3-5 seconds in Solr3 (300 million docs) are taking 20 seconds in Solr4. We think this could be due to the frequent hard commits and searcher refreshes. Do you think we will see better query performance once we change to soft commits and increase the batch size?

Thanks!
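
P.S. For reference, the commit change described in point 2 might look roughly like the following in solrconfig.xml. This is only a sketch of what we have in mind, not something we've tested: the 10-minute value is just the low end of the range above, and openSearcher=false is our assumption about how to keep the hard commit from opening a new searcher.

```xml
<!-- Sketch only: goes inside the existing <updateHandler> section of solrconfig.xml -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- Hard commit every 10 minutes (600000 ms); assumed value from our 10-15 minute plan -->
    <maxTime>600000</maxTime>
    <!-- Assumption: flush to disk without opening a new searcher -->
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```

With this in place, the load would send softCommit=true on the update request instead of commit=true, so new documents become visible without a hard commit on every batch.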