Re: Question: Solr perform well with thousands of replicas?
Hi Erick,

Thanks for your help. Before I visited the wiki/mailing list, I already knew that Solr is unstable with 1000+ collections and should be safe with 10~100 collections. But in a specific environment, what is the exact number at which Solr begins to become unstable? I don't know. So I am deploying a test cluster to find that number, and trying to push it higher (to save cost).

That's my purpose: quantitative analysis. How many replicas can be supported in my environment? After getting the number, I will adjust my application: when it is near the maximum, prevent the creation of too many indexes, or give the user a warning message.

From: Erick Erickson
Sent: Monday, September 2, 2019 21:20
To: solr-user@lucene.apache.org
Subject: Re: Question: Solr perform well with thousands of replicas?
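The "deploy a test cluster and find the number" approach described above can be sketched as a probe script. This is a hypothetical sketch, not a tool from the thread: it assumes a SolrCloud node at `http://localhost:8983`, uses the standard Collections API CREATE action, and treats the first failed request as the rough ceiling; the collection names, limit, and failure criterion are all illustrative.

```python
import urllib.parse
import urllib.request

def create_collection_url(base: str, name: str,
                          num_shards: int = 1, replication_factor: int = 1) -> str:
    """Build a Collections API CREATE request URL."""
    params = urllib.parse.urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    })
    return f"{base}/solr/admin/collections?{params}"

def probe_max_collections(base: str, limit: int = 2000) -> int:
    """Create collections one at a time until a request fails.

    Returns how many were created before the first failure, a crude
    upper bound for this particular environment.
    """
    created = 0
    for i in range(limit):
        url = create_collection_url(base, f"probe_{i}")
        try:
            with urllib.request.urlopen(url, timeout=120):
                created += 1
        except Exception:
            break  # first failure: the rough ceiling for this env
    return created

# Example (requires a running SolrCloud):
#   print(probe_max_collections("http://localhost:8983"))
```

A real test should also measure recovery behavior (restart a node at each step), since the thread notes that instability often shows up during restarts rather than steady-state indexing.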
Re: Question: Solr perform well with thousands of replicas?
> why so many collection/replica: it's our customer needs, for example: each
> database table maps to a collection.

I always cringe when I see statements like this. What this means is that your customer doesn't understand search and needs guidance in the proper use of any search technology, Solr included. Solr is _not_ an RDBMS. Simply mapping the DB tables onto collections will almost certainly result in a poor experience. Next the customer will want to ask Solr to do the same thing a DB does, i.e. run a join across 10 tables etc., which will be abysmal. Solr isn't designed for that. Some brilliant RDBMS people have spent many years making DBs do what they do, and do it well. That said, RDBMSs have poor search capabilities; they aren't built to solve the search problem.

I suspect the time you spend making Solr load a thousand cores will be wasted. Once you do get them loaded, performance will be horrible. IMO you'd be far better off helping the customer define their problem so they properly model their search problem. This may mean that the result will be a hybrid where Solr is used for the free-text search and the RDBMS uses the results of the search to do something. Or vice versa.

FWIW,
Erick

> On Sep 2, 2019, at 5:55 AM, Hongxu Ma wrote:
Re: Question: Solr perform well with thousands of replicas?
Thanks @Jörn and @Erick.

I enlarged my JVM memory; so far it's stable (but it uses a lot of memory). And I will check for lower-level errors, per your suggestion, if an error happens.

About my scenario:

* Why so many collections/replicas: it's our customer's need; for example, each database table maps to a collection.
* This env is just a test cluster: I want to verify the maximum collection number Solr can support stably.

From: Erick Erickson
Sent: Friday, August 30, 2019 20:05
To: solr-user@lucene.apache.org
Subject: Re: Question: Solr perform well with thousands of replicas?
Re: Question: Solr perform well with thousands of replicas?
“No registered leader” is usually the effect of some problem, not the root cause. In this case, for instance, you could be running out of file handles and see other errors like “too many open files”. That’s just one example.

One common problem is that Solr needs a lot of file handles and the system defaults are too low. We usually recommend you start with 65K file handles (ulimit) and bump up the number of processes to 65K too.

So to throw some numbers out: with 1,000 replicas, let’s say you have 50 segments in the index in each replica. Each segment consists of multiple files (I’m skipping “compound files” here as an advanced topic), so each segment has, let’s say, 10 files. 1,000 * 50 * 10 would require 500,000 file handles on your system.

Bottom line: look for other, lower-level errors in the log to try to understand what limit you’re running into.

All that said, there’ll be a number of “gotchas” when running that many replicas on a particular node; I second Jörn’s question...

Best,
Erick

> On Aug 30, 2019, at 3:18 AM, Jörn Franke wrote:
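Erick's back-of-the-envelope file-handle estimate can be sketched as a quick check against the process's ulimit. The replica, segment, and files-per-segment counts below are the illustrative numbers from the discussion, not measurements from a real index:

```python
import resource

def estimated_file_handles(replicas: int, segments_per_replica: int,
                           files_per_segment: int) -> int:
    """Rough worst-case count of index files a Solr node may hold open."""
    return replicas * segments_per_replica * files_per_segment

# Illustrative numbers from the discussion above (assumptions, not measurements).
estimate = estimated_file_handles(replicas=1000,
                                  segments_per_replica=50,
                                  files_per_segment=10)
print(estimate)  # 500000

# Compare against this process's soft limit on open file descriptors.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
if estimate > soft:
    print(f"estimate {estimate} exceeds soft ulimit {soft}; raise ulimit -n")
```

The same arithmetic explains why dropping from 1000+ to 720 replicas per host (as done later in the thread) helps: the handle requirement scales linearly with replica count.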
Re: Question: Solr perform well with thousands of replicas?
What is the reason for this number of replicas? Solr should work fine, but maybe it is worth consolidating some collections, also to avoid administrative overhead.

> Am 29.08.2019 um 05:27 schrieb Hongxu Ma:
Re: Question: Solr perform well with thousands of replicas?
Hi guys,

Thanks for your help! More details about my env:

Cluster: a 4-host GCP (Google Cloud) cluster; each host has a 16-core CPU, 60 GB memory, and a 2 TB HDD. I set up 2 Solr nodes on each host and there are 1000+ replicas on each Solr node. (Sorry for forgetting this before: 2 Solr nodes per host, so there are 2000+ replicas per host...) ZooKeeper has 3 instances, reusing the Solr hosts (on a separate disk).

Workload: just indexing tens of millions of records (total size near 100 GB) into dozens (near 100) of indexes with a concurrency of 30; no search operations at the same time (I will do a search test later).

Error: "unstable" means there are many Solr errors in the log and Solr requests fail, e.g. "No registered leader was found after waiting for 4000ms, collection ..."

@Hendrik: after seeing your reply, I noted my replica number was too big, so I adjusted it to 720 replicas per host (reduced shard number), and then all my index requests succeeded. (Happy!) But I saw the JVM peak memory usage is 24 GB (via the Solr web UI), which is big enough to be risky in the future (my JVM Xmx is 32 GB). So would you give me some guidance on reducing the memory usage? (Like the "tuned a few caches down to a minimum" you mentioned.)

@Erick: I gave details above, please check.

@Shawn: thanks for the info; it's bad news... I hope SolrCloud can handle more collections in the future.

From: Shawn Heisey
Sent: Thursday, August 29, 2019 21:58
To: solr-user@lucene.apache.org
Subject: Re: Question: Solr perform well with thousands of replicas?
Re: Question: Solr perform well with thousands of replicas?
On 8/28/2019 9:27 PM, Hongxu Ma wrote:
> I have a solr-cloud cluster, but it's unstable when collection number is big:
> 1000 replica/core per solr node.
>
> To solve this issue, I have read the performance guide:
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
>
> I noted there is a sentence on solr-cloud section:
> "Recent Solr versions perform well with thousands of replicas."

The SolrPerformanceProblems wiki page is my work. I only wrote that sentence because other devs working on SolrCloud code told me that was the case. Based on things said by people (including your comments on this thread), I think newer versions probably aren't any better, and that sentence needs to be removed from the wiki page.

See this issue that I created a few years ago:

https://issues.apache.org/jira/browse/SOLR-7191

This issue was closed with a 6.3 fix version ... but nothing was committed with a tag for the issue, so I have no idea why it was closed. I think the problems described there are still present in recent Solr versions, and MIGHT be even worse than they were in 4.x and 5.x.

> I want to know does it mean a single solr node can handle thousands of
> replicas? or a solr cluster can (if so, what's the size of the cluster?)

A single standalone Solr instance can handle lots of indexes, but Solr startup is probably going to be slow. No matter how many nodes there are, SolrCloud has problems with thousands of collections or replicas, due to the overseer queue getting enormous. When I created SOLR-7191, I found that restarting a node in a cloud with thousands of replicas (cores) can result in a performance death spiral.

I haven't ever administered a production setup with thousands of indexes; I've only done some single-machine testing for the issue I created. I need to repeat it with 8.x and see what happens, but I have very little free time these days.

Thanks,
Shawn
Re: Question: Solr perform well with thousands of replicas?
There are two factors:

1> the raw number of replicas on a Solr node.
2> the total resources Solr needs.

You say "...it's unstable...". _How_ is it unstable? What symptoms are you seeing? You might want to review:

https://cwiki.apache.org/confluence/display/solr/UsingMailingLists

And note, as you add more cores, you put more pressure on memory, I/O, etc. So whether it's the raw number of cores, or you're just exhausting memory, overloading your CPU, etc., is hard to say without more information.

Best,
Erick

> On Aug 29, 2019, at 1:31 AM, Hendrik Haddorp wrote:
Re: Question: Solr perform well with thousands of replicas?
Hi,

We are usually running Solr Clouds with 5 nodes and up to 2000 collections with a replication factor of 2, so we have close to 1000 cores per node. That is on Solr 7.6, but I believe 7.3 worked as well. We tuned a few caches down to a minimum, as otherwise the memory usage goes up a lot. The Solr UI has some problems with a high number of collections, like lots of timeouts when loading the status.

Older Solr versions had problems with the overseer queue in ZooKeeper. If you restarted too many nodes at once, the queue got too long and Solr died, requiring some help and cleanup before it would start at all again.

Regards,
Hendrik
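Hendrik's "tuned a few caches down to a minimum" refers to the per-core query caches configured in solrconfig.xml. A minimal sketch is below; the sizes are illustrative assumptions, not values from the thread, and the right numbers depend entirely on your query patterns (these are the stock 7.x cache classes):

```xml
<!-- solrconfig.xml: inside the <query> ... </query> section -->
<filterCache class="solr.FastLRUCache" size="16" initialSize="16" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="16" initialSize="16" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="16" initialSize="16" autowarmCount="0"/>
```

With a thousand cores per node, even a modest per-core cache multiplies into significant heap, so keeping sizes small and disabling autowarming (autowarmCount="0") bounds the per-core overhead and also speeds up commits and restarts.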
Question: Solr perform well with thousands of replicas?
Hi,

I have a SolrCloud cluster, but it's unstable when the collection number is big: 1000 replicas/cores per Solr node.

To solve this issue, I have read the performance guide:
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems

I noted there is a sentence in the SolrCloud section: "Recent Solr versions perform well with thousands of replicas."

I want to know: does it mean a single Solr node can handle thousands of replicas? Or that a Solr cluster can (if so, what's the size of the cluster)?

My Solr versions are 7.3.1 and 6.6.2 (they look the same in performance).

Thanks for your help.