Re: Where to use replicationFactor and maxShardsPerNode at SolrCloud?

Erick Erickson Mon, 22 Apr 2013 04:05:09 -0700

1) Imagine you have lots and lots of different Solr indexes and a 50
node cluster. Further imagine that one of those indexes has 2 shards,
and a leader + 1 replica per shard is adequate to handle the load. You
need some way to limit the number of nodes your index gets distributed
to; that's what replicationFactor is for. So in this case
replicationFactor=2 will stop assigning nodes to that particular
collection once each shard has a leader + 1 replica.
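
To make that concrete, here is a rough sketch in Python that only builds
the Collections API CREATE request for such a collection. The host, port
and collection name are made up for illustration, and I'm assuming the
standard /admin/collections endpoint; nothing is actually sent to Solr.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards,
                          replication_factor, max_shards_per_node):
    """Build (but do not send) a Collections API CREATE URL."""
    params = {
        "action": "CREATE",
        "name": name,  # hypothetical collection name
        "numShards": num_shards,
        # Total copies of each shard, leader included.
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# 2 shards x 2 copies = 4 cores, so only 4 of the 50 nodes get used.
url = create_collection_url("http://localhost:8983/solr",
                            "mycollection", 2, 2, 1)
print(url)
```

With these numbers the collection occupies four nodes and leaves the
other 46 free for your other indexes.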

2) In the system you described, there won't be more than one
shard/node. But one strategy for growth is to "overshard". That is, in
the early days you put (numbers from thin air) 10 shards on one node
and they are all quite small. As your index grows, you move to two
nodes with 5 shards each. And later to 5 nodes with 2 shards each, and
so on. There are
cases where you want some way to make the most of your hardware yet
plan for expansion.
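
The arithmetic behind that growth path can be sketched in a few lines of
Python. This is only back-of-the-envelope math that assumes shards are
spread evenly across nodes; it does not model how Solr actually places
cores, and the function name is made up for illustration.

```python
import math


def shards_per_node(num_shards, replication_factor, num_nodes):
    """Smallest per-node core count that fits every copy of every shard,
    assuming an even spread (an assumption, not Solr's placement logic)."""
    total_cores = num_shards * replication_factor
    return math.ceil(total_cores / num_nodes)


# 10 shards, no extra replicas, as the cluster grows from 1 to 5 nodes.
for nodes in (1, 2, 5):
    print(nodes, "node(s):", shards_per_node(10, 1, nodes), "shards each")
```

The value this returns is the lowest maxShardsPerNode that would still
let all the shards fit on that many nodes.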

Best
Erick

On Sun, Apr 21, 2013 at 3:51 PM, Furkan KAMACI <furkankam...@gmail.com> wrote:
> I know that when using SolrCloud we define the number of shards in the
> system. When we start up new Solr instances, each one becomes the leader
> of a shard, and if I continue to start up new Solr instances (beyond
> the number of shards), each one becomes a replica of a leader, assigned
> round robin.
>
> However, when I read the wiki there are two parameters: replicationFactor
> and maxShardsPerNode.
>
> 1) Can you give details about what they are? If all newly added Solr
> instances become replicas, what is the replication factor for?
> 2) If what I wrote about the round robin process is true, what is
> maxShardsPerNode for? How can there be more than one shard per node in
> the system I described?
