Re: Limitation on Collections Number

2015-06-15 Thread Arnon Yogev
) at org.apache.solr.cloud.OverseerCollectionProcessor.run(OverseerCollectionProcessor.java:198)
at java.lang.Thread.run(Thread.java:809)

From: Erick Erickson erickerick...@gmail.com
To: solr-user@lucene.apache.org
Date: 14/06/2015 08:47 PM
Subject: Re: Limitation on Collections Number

Re: Limitation on Collections Number

2015-06-14 Thread Jack Krupansky
As a general rule, there are only two ways that Solr scales to large numbers: a large number of documents and a moderate number of nodes (shards and replicas). All other parameters should be kept relatively small, like dozens or low hundreds. Even shards and replicas should probably be kept down to that

Limitation on Collections Number

2015-06-14 Thread Arnon Yogev
We're running some tests on Solr and would like to have a deeper understanding of its limitations. Specifically, we have tens of millions of documents (say 50M) and are comparing several #collections X #docs_per_collection configurations. For example, we could have a single collection with 50M

Re: Limitation on Collections Number

2015-06-14 Thread Shai Erera
> My answer remains the same - a large number of collections (cores) in a single Solr instance is not one of the ways in which Solr is designed to scale. To repeat, there are only two ways to scale Solr, number of documents and number of nodes.

Jack, I understand that, but I still feel you're

Re: Limitation on Collections Number

2015-06-14 Thread Erick Erickson
To my knowledge there's nothing built in to Solr to limit the number of collections. There's nothing explicitly in place to handle many hundreds of collections either so you're really in uncharted, certainly untested waters. Anecdotally we've heard of the problem you're describing. You say you

Re: Limitation on Collections Number

2015-06-14 Thread Erick Erickson
re: hybrid approach. Hmmm, _assuming_ that no single user has a really huge number of documents you might be able to use a single collection (or much smaller group of collections), by using custom routing. That allows you to send all the docs for a particular user to a particular shard. There are
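
The custom-routing idea in the message above can be sketched as follows. This is a minimal illustration, assuming SolrCloud's compositeId router, where a document ID of the form `shardKey!docId` sends all documents sharing the prefix to the same shard, and the `_route_` query parameter limits a query to that shard. The helper names and sample values are hypothetical, and nothing here contacts a Solr server.

```python
from urllib.parse import urlencode


def routed_id(user: str, doc_id: str) -> str:
    """Build a compositeId-style document ID: '<user>!<doc_id>'.

    The '!' separates the routing prefix from the rest of the ID, so a
    user name containing '!' would be ambiguous and is rejected here.
    """
    if "!" in user:
        raise ValueError("user must not contain '!'")
    return f"{user}!{doc_id}"


def route_params(user: str, q: str = "*:*") -> dict:
    """Query parameters that restrict a search to the user's shard."""
    return {"q": q, "_route_": f"{user}!"}


# Hypothetical sample values, for illustration only.
print(routed_id("alice", "report.pdf"))
print(urlencode(route_params("alice")))
```

Routing narrows which shard is searched; it does not by itself hide other users' documents that happen to live on the same shard, so a per-user filter would presumably still be needed.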

Re: Limitation on Collections Number

2015-06-14 Thread Shai Erera
Thanks Jack for your response. But I think Arnon's question was different. If you need to index 10,000 different collections of documents in Solr (say a collection denotes someone's Dropbox files), then you have two options: index all collections in one Solr collection, and add a field like
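
One way to read the first option, sketched under assumptions: every document carries a field naming its owner, and every search adds a filter query (`fq`) on that field. The field name `owner_id` is hypothetical (the message is cut off before it names one), as are the helper names and sample values.

```python
def tag_document(doc: dict, owner: str) -> dict:
    """Return a copy of the document with the hypothetical owner field set."""
    return {**doc, "owner_id": owner}


def owner_filter(owner: str, q: str = "*:*") -> dict:
    """Query parameters that limit results to a single owner.

    The owner value is quoted as a phrase, with backslashes and double
    quotes escaped, so it is matched literally.
    """
    escaped = owner.replace("\\", "\\\\").replace('"', '\\"')
    return {"q": q, "fq": f'owner_id:"{escaped}"'}


# Hypothetical sample values, for illustration only.
doc = tag_document({"id": "1", "name": "notes.txt"}, "alice")
print(doc)
print(owner_filter("alice"))
```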

Re: Limitation on Collections Number

2015-06-14 Thread Jack Krupansky
My answer remains the same - a large number of collections (cores) in a single Solr instance is not one of the ways in which Solr is designed to scale. To repeat, there are only two ways to scale Solr, number of documents and number of nodes. -- Jack Krupansky On Sun, Jun 14, 2015 at 11:00 AM,

Re: Limitation on Collections Number

2015-06-14 Thread Shalin Shekhar Mangar
Yes, there are some known problems while scaling to a large number of collections, say 1000 or above. See https://issues.apache.org/jira/browse/SOLR-7191 On Sun, Jun 14, 2015 at 8:30 PM, Shai Erera ser...@gmail.com wrote: Thanks Jack for your response. But I think Arnon's question was different.