On 1/19/2021 4:19 PM, ufuk yılmaz wrote:
> Let's say I had only 1 replica for each collection but I split it into 6 shards, 1
> for every node.
> Or I had 2 shards (1 shard is too big for a single node, I think) but I had 3
> replicas, 3x2=6, 1 on every node.
> How would it affect the performance?
It all depends on how many queries you're expecting to occur at the same
time -- your query rate.
More replicas will generally make your system capable of handling a
higher query load than fewer replicas, as long as the replicas are
running on different physical hardware.
With a low query load, more shards CAN make things faster because it
throws more system capacity at the problem -- assuming the different
shards are on different physical hardware. But as the number of queries
increases, the systems get busier, and that advantage disappears.
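
To put rough numbers on the two layouts from the question, here is a minimal counting sketch. It assumes a 6-node cluster, that every query fans out to exactly one replica of each shard, and that each core sits on its own node. The names (`Layout`, `nodes_per_query`, `independent_queries`) are made up for illustration and are not part of any Solr API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """One collection's layout. Hypothetical helper, not a Solr API."""
    shards: int
    replicas_per_shard: int

    @property
    def cores(self) -> int:
        # Total cores (shard replicas) the collection places on the cluster.
        return self.shards * self.replicas_per_shard

    @property
    def nodes_per_query(self) -> int:
        # Assumption: a distributed query hits one replica of every shard,
        # and each of those replicas sits on a different node.
        return self.shards

    def independent_queries(self, nodes: int) -> int:
        # How many queries can run at once without two of them sharing a
        # node, assuming one core per node.
        return min(nodes, self.cores) // self.nodes_per_query


NODES = 6
many_shards = Layout(shards=6, replicas_per_shard=1)
many_replicas = Layout(shards=2, replicas_per_shard=3)

for name, layout in [("6 shards x 1 replica", many_shards),
                     ("2 shards x 3 replicas", many_replicas)]:
    print(name, layout.cores, layout.nodes_per_query,
          layout.independent_queries(NODES))
```

Both layouts put 6 cores on the cluster. Under these assumptions the first ties up every node for each query, while the second leaves room for three queries that do not compete for the same hardware. This is only a counting exercise; it says nothing about actual latency, which has to be measured on the real index.
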
Don't assign your heap size as a ratio of total memory size. Your heap
should be as big as it needs to be, and no bigger, leaving as much
memory as possible for disk caching. I can't say for sure, but with 20
indexes the size you're talking about, 50 GB of memory per node is
probably nowhere near enough.
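
The arithmetic behind that can be sketched in a few lines. The 50 GB figure is the one mentioned above; the heap size and the on-disk index size are placeholders chosen only so the code runs, and the function names are hypothetical. Substitute the real numbers for your nodes.

```python
def disk_cache_gb(total_ram_gb: float, heap_gb: float) -> float:
    """Memory left over for the OS disk cache once the Java heap is taken.

    Ignores the OS itself and any other processes, so the real figure
    will be somewhat lower.
    """
    return max(total_ram_gb - heap_gb, 0.0)


def cached_fraction(index_on_node_gb: float, cache_gb: float) -> float:
    """Fraction of the node's index data that could fit in the disk cache."""
    if index_on_node_gb <= 0:
        return 1.0
    return min(cache_gb / index_on_node_gb, 1.0)


TOTAL_RAM_GB = 50.0        # per-node memory mentioned in this thread
HEAP_GB = 8.0              # placeholder, not a recommendation
INDEX_ON_NODE_GB = 400.0   # placeholder: total index data on one node

cache = disk_cache_gb(TOTAL_RAM_GB, HEAP_GB)
print(f"disk cache: {cache:.0f} GB, "
      f"covers {cached_fraction(INDEX_ON_NODE_GB, cache):.1%} of the index")
```

The point of the sketch is the direction of the trade-off: every gigabyte added to the heap is a gigabyte taken away from the disk cache, so a heap sized as a fixed share of RAM can starve the cache without helping Solr at all.
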
Thanks,
Shawn