Hmmm, that page is quite a bit out of date. I think Shawn is talking
about the "old style" Solr (4.x) that put all the state information
for all the collections in a single znode, "clusterstate.json".
Newer-style Solr puts each collection's state in
/collections/my_collection/state.json, which has significantly
reduced this issue.
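
If you're curious, the difference is visible directly in ZooKeeper.
Here's a quick sketch using the ZooKeeper Java client; the connect
string is the embedded-ZK default (localhost:9983) and the collection
name is just a placeholder, so treat it as illustrative rather than
something to copy verbatim (on 5.x, collections created with the old
format may still live only in the shared /clusterstate.json):

    import org.apache.zookeeper.ZooKeeper;

    public class StateLayoutPeek {
        public static void main(String[] args) throws Exception {
            // No-op watcher; we only want to read a few znodes.
            ZooKeeper zk = new ZooKeeper("localhost:9983", 30000, event -> { });

            // Old style (4.x): one shared znode holding every collection's state.
            byte[] shared = zk.getData("/clusterstate.json", false, null);
            System.out.println("/clusterstate.json: " + shared.length + " bytes");

            // New style: each collection keeps its own state.json.
            for (String coll : zk.getChildren("/collections", false)) {
                byte[] state = zk.getData("/collections/" + coll + "/state.json", false, null);
                System.out.println(coll + ": " + state.length + " bytes");
            }
            zk.close();
        }
    }

The practical upshot is that a state change in one collection only
touches that collection's znode, so nodes no longer have to re-read
state for collections they don't host.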

There are still some issues in the 5.x code line where, at massive
scale, the "Overseer" can end up with a ton of messages to process...

However, I know of installations with several 100s of K (yes hundreds
of thousands) of replicas out there, split up amongst a _lot_ of
collections. That takes quite a bit of care and feeding, mind you.

So your setup shouldn't be a problem, although I'd bring up my Solr
instances one at a time.

Whether ZK is embedded or not isn't really a problem, but I would very
seriously consider moving it to an external ensemble. It's not so much
a functional issue as an administrative one: with embedded ZK you have
to be careful about how you bring your Solr nodes up and down, or you
lose quorum.
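
For what it's worth, from a client's point of view an external
ensemble is just a different connect string. Here's a minimal SolrJ
sketch, assuming a three-node ensemble on hosts zk1/zk2/zk3 and the
5.x-era CloudSolrClient constructor (host names and collection name
are placeholders):

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class ExternalZkExample {
        public static void main(String[] args) throws Exception {
            // Three ZK servers, none of them running inside a Solr JVM,
            // so restarting Solr nodes never touches the ensemble.
            String zkHost = "zk1:2181,zk2:2181,zk3:2181";

            try (CloudSolrClient client = new CloudSolrClient(zkHost)) {
                client.setDefaultCollection("my_collection");

                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "1");
                client.add(doc);
                client.commit();
            }
        }
    }

The quorum point is just majority math: with embedded ZK the servers
live inside Solr JVMs, so stopping Solr nodes can leave fewer than a
majority of ZK servers running, at which point SolrCloud can't accept
updates or cluster changes until quorum is restored. An external
ensemble keeps running no matter what you do to the Solr nodes.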

Best,
Erick

On Tue, Oct 10, 2017 at 7:37 AM, Jon Drews <j...@jondrews.com> wrote:
> In the Solr Wiki, Shawn Heisey writes the following:
>
> "Regardless of the number of nodes or available resources, SolrCloud begins
> to have stability problems when the number of collections reaches the low
> hundreds. With thousands of collections, any little problem or change to
> the cluster can cause a stability death spiral that may not recover for
> tens of minutes. Try to keep the number of collections as low as possible.
> These problems are due to how SolrCloud updates cluster state in zookeeper
> in response to cluster changes. Work is underway to try and improve this
> situation."
> https://wiki.apache.org/solr/SolrPerformanceProblems?action=diff&rev1=45&rev2=46
>
> I'd like to know if this would apply to a standalone Solr system (embedded
> Zk) with one collection and the low hundreds of shards (e.g. 1 node, 1
> collection and 200 shards).
>
> If there are any JIRA tickets we should track regarding the work that's
> underway to resolve this situation, please provide them. Thanks!
>
> We are currently using 5.3.1
