On 8/26/2018 12:00 AM, Wei wrote:
> I have a question about the deployment configuration in SolrCloud. When
> we need to increase the number of shards in SolrCloud, there are two
> options:
>
> 1. Run multiple Solr instances per host, each with a different port and
> hosting a single core for one shard.
>
> 2. Run one Solr instance per host, and have multiple cores (shards) in the
> same Solr instance.
>
> Which would be better performance-wise? For the first option I think the JVM
> size for each Solr instance can be smaller, but is deployment more
> complicated? Are there any differences in CPU utilization?
My general advice is to only have one Solr instance per machine. One
Solr instance can handle many indexes, and usually will do so with less
overhead than two or more instances.

I can think of *ONE* exception to this -- when a single Solr instance
would require a heap that's extremely large. Splitting that into two or
more instances MIGHT greatly reduce garbage collection pauses. But
there's a caveat to the caveat -- in my strong opinion, if your Solr
instance is so big that it requires a huge heap and you're considering
splitting into multiple Solr instances on one machine, you very likely
need to run each of those instances on *separate* machines, so that each
one can have access to all the resources of the machine it's running on.

For SolrCloud, when you're running multiple instances per machine, Solr
will consider those to be completely separate instances, and you may end
up with all of the replicas for a shard on a single machine, which is a
problem for high availability.
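
To make that last point concrete, here is a minimal Python sketch of the kind of check you could run against a cluster layout. It does not call any Solr API. The `layout` dictionary, the host names, and both function names are made up for illustration, and it assumes node names look like `host:port_solr`.

```python
def host_of(node_name):
    # Assumed node name format: "host1:8983_solr" -> "host1"
    return node_name.rsplit(":", 1)[0]


def shards_on_single_host(layout):
    """Return the shards whose replicas all sit on one physical host."""
    at_risk = []
    for shard, nodes in sorted(layout.items()):
        hosts = {host_of(n) for n in nodes}
        # More than one replica, but only one machine behind them.
        if len(nodes) > 1 and len(hosts) == 1:
            at_risk.append(shard)
    return at_risk


# Hypothetical layout: two Solr instances on host1, one on host2.
layout = {
    "shard1": ["host1:8983_solr", "host1:8984_solr"],
    "shard2": ["host1:8983_solr", "host2:8983_solr"],
}

print(shards_on_single_host(layout))  # ['shard1']
```

In this made-up layout, shard1 has two replicas, but both live on host1, so losing that one machine takes the whole shard offline.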
Thanks,
Shawn