On 8/28/2019 9:27 PM, Hongxu Ma wrote:
I have a SolrCloud cluster, but it's unstable when the number of collections is large: 
1000 replicas/cores per Solr node.

To solve this issue, I have read the performance guide:
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems

I noticed this sentence in the SolrCloud section:
"Recent Solr versions perform well with thousands of replicas."

The SolrPerformanceProblems wiki page is my work. I only wrote that sentence because other devs working in SolrCloud code told me that was the case. Based on things said by people (including your comments on this thread), I think newer versions probably aren't any better, and that sentence needs to be removed from the wiki page.

See this issue that I created a few years ago:

https://issues.apache.org/jira/browse/SOLR-7191

This issue was closed with a 6.3 fix version ... but nothing was committed with a tag for the issue, so I have no idea why it was closed. I think the problems described there are still there in recent Solr versions, and MIGHT be even worse than they were in 4.x and 5.x.

I want to know: does it mean a single Solr node can handle thousands of 
replicas, or that a Solr cluster can (if so, what's the size of the cluster)?

A single standalone Solr instance can handle lots of indexes, but Solr startup is probably going to be slow.

No matter how many nodes there are, SolrCloud has problems with thousands of collections or replicas due to issues with the overseer queue getting enormous. When I created SOLR-7191, I found that restarting a node in a cloud with thousands of replicas (cores) can result in a performance death spiral.
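
One rough way to watch for the queue growth described above is to poll the Collections API OVERSEERSTATUS action and look at the queue sizes it reports. The following is only a minimal sketch: the base URL, the helper names, and the threshold of 1000 are assumptions made for illustration, and the response field names are the ones I recall from the Solr Reference Guide, so check them against your Solr version.

```python
import json
from urllib.request import urlopen

# Queue-size fields expected in an OVERSEERSTATUS response
# (names as recalled from the Solr Reference Guide; verify for your version).
QUEUE_KEYS = (
    "overseer_queue_size",
    "overseer_work_queue_size",
    "overseer_collection_queue_size",
)


def queue_sizes(status):
    """Extract the queue-size fields from a parsed OVERSEERSTATUS response.

    Missing fields are reported as 0.
    """
    return {key: status.get(key, 0) for key in QUEUE_KEYS}


def is_backed_up(status, threshold=1000):
    """Return True if any overseer queue exceeds the threshold.

    The default threshold is an arbitrary, hypothetical value, not a
    recommendation from the Solr project.
    """
    return any(size > threshold for size in queue_sizes(status).values())


def fetch_overseer_status(base_url="http://localhost:8983/solr"):
    """Fetch OVERSEERSTATUS as a dict. The base URL is an assumed default."""
    url = base_url + "/admin/collections?action=OVERSEERSTATUS&wt=json"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Calling `is_backed_up(fetch_overseer_status())` before and after restarting a node would show whether the queues are draining or continuing to grow.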

I have never administered a production setup with thousands of indexes; I've only done some single-machine testing for the issue I created. I need to repeat it with 8.x and see what happens, but I have very little free time these days.

Thanks,
Shawn
