If you are issuing writes to replicas that are not the shard leader, each write pays a large overhead for the eventual redirect to the leader. I noticed a 3-5x performance increase after making my write client leader-aware.
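A rough sketch of what "leader-aware" could mean on the client side: read the cluster state, find the leader replica of each shard, and group documents by the leader that owns their hash range, so that each batch goes straight to the right node. Everything concrete here is an assumption for illustration. The hosts, core names, and ranges are made up; the dictionary only imitates the per-collection layout of clusterstate.json; and `stand_in_hash` is not Solr's hash function (the default router uses MurmurHash3 as far as I know), so a real client would have to use the same hash as the server.

```python
import zlib

# Hypothetical cluster state for one collection with two shards.
# Shard ranges are hex strings of signed 32-bit integers.
CLUSTER_STATE = {
    "shards": {
        "shard1": {
            "range": "80000000-ffffffff",
            "replicas": {
                "core_node1": {"base_url": "http://host1:8983/solr",
                               "core": "c1_shard1_replica1", "leader": "true"},
                "core_node2": {"base_url": "http://host2:8983/solr",
                               "core": "c1_shard1_replica2"},
            },
        },
        "shard2": {
            "range": "0-7fffffff",
            "replicas": {
                "core_node3": {"base_url": "http://host3:8983/solr",
                               "core": "c1_shard2_replica1", "leader": "true"},
                "core_node4": {"base_url": "http://host1:8983/solr",
                               "core": "c1_shard2_replica2"},
            },
        },
    }
}


def to_signed32(hex_str):
    """Interpret a hex string as a signed 32-bit integer."""
    value = int(hex_str, 16)
    return value - (1 << 32) if value >= (1 << 31) else value


def leader_table(state):
    """Return a list of (low, high, leader_url) tuples, one per shard leader."""
    table = []
    for shard in state["shards"].values():
        low, high = (to_signed32(part) for part in shard["range"].split("-"))
        for replica in shard["replicas"].values():
            if replica.get("leader") == "true":
                url = replica["base_url"] + "/" + replica["core"]
                table.append((low, high, url))
    return table


def leader_for(hash32, table):
    """Find the leader whose hash range contains hash32."""
    for low, high, url in table:
        if low <= hash32 <= high:
            return url
    raise LookupError("no shard range covers hash %d" % hash32)


def stand_in_hash(doc_id):
    """Placeholder hash (CRC32 as signed 32-bit). NOT what Solr uses."""
    value = zlib.crc32(doc_id.encode("utf-8")) & 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def group_by_leader(docs, table, hash_fn=stand_in_hash):
    """Group documents into one batch per leader URL."""
    batches = {}
    for doc in docs:
        url = leader_for(hash_fn(doc["id"]), table)
        batches.setdefault(url, []).append(doc)
    return batches
```

Each batch from `group_by_leader` would then be posted to its leader's update handler directly, which avoids the forward from a non-leader. For Java clients, SolrJ's CloudSolrServer is meant to do this kind of routing itself, so a hand-rolled version like this mainly matters for non-Java clients.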
On Oct 30, 2014, at 2:56 PM, Ian Rose <ianr...@fullstory.com> wrote:

>> If you want to increase QPS, you should not be increasing numShards.
>> You need to increase replicationFactor. When your numShards matches the
>> number of servers, every single server will be doing part of the work
>> for every query.
>
> I think this is true only for actual queries, right? I am not issuing any
> queries, only writes (document inserts). In the case of writes, increasing
> the number of shards should increase my throughput (in ops/sec) more or
> less linearly, right?
>
> On Thu, Oct 30, 2014 at 4:50 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 10/30/2014 2:23 PM, Ian Rose wrote:
>>> My methodology is as follows.
>>> 1. Start up a K solr servers.
>>> 2. Remove all existing collections.
>>> 3. Create N collections, with numShards=K for each.
>>> 4. Start load testing. Every minute, print the number of successful
>>> updates and the number of failed updates.
>>> 5. Keep increasing the offered load (via simulated users) until the qps
>>> flatlines.
>>
>> If you want to increase QPS, you should not be increasing numShards.
>> You need to increase replicationFactor. When your numShards matches the
>> number of servers, every single server will be doing part of the work
>> for every query. If you increase replicationFactor instead, then each
>> server can be doing a different query in parallel.
>>
>> Sharding the index is what you need to do when you need to scale the
>> size of the index, so each server does not get overwhelmed by dealing
>> with every document for every query.
>>
>> Getting a high QPS with a big index requires increasing both numShards
>> *AND* replicationFactor.
>>
>> Thanks,
>> Shawn