Thank you for the replies. The shard-per-user approach is interesting. We will look into it as well.
The errors we're getting with ~1500 collections vary depending on the
action (restarting the server, creating a new collection, etc.). The most
frequent ones are:

1. Connection refused when starting Solr (happens when Solr fails to
start):

java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:806)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
453171 [main-SendThread(localhost.localdomain:2181)] WARN
org.apache.zookeeper.ClientCnxn - Session 0x14df5cd0f900008 for server
null, unexpected error, closing socket connection and attempting reconnect

2. "Error getting leader" when starting Solr (happens when Solr does
start):

org.apache.solr.common.SolrException: Error getting leader from zk for shard shard1
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:871)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:783)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:731)
    at org.apache.solr.core.ZkContainer$2.run(ZkContainer.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
    at java.lang.Thread.run(Thread.java:809)
Caused by: org.apache.solr.common.SolrException: No registered leader was
found after waiting for 1560000ms, collection: owner_234409 slice: shard1
    at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:531)
    at org.apache.solr.common.cloud.ZkStateReader.getLeaderUrl(ZkStateReader.java:505)
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:850)
    ... 6 more
3. "Collection already exists" (though it does not) when trying to create
a collection:

15/06/2015, 11:35:41 WARN OverseerCollectionProcessor
OverseerCollectionProcessor.processMessage : createcollection
15/06/2015, 11:35:41 ERROR OverseerCollectionProcessor Collection
createcollection of createcollection
failed:org.apache.solr.common.SolrException: collection already exists:
owner_484011
    at org.apache.solr.cloud.OverseerCollectionProcessor.createCollection(OverseerCollectionProcessor.java:1545)
    at org.apache.solr.cloud.OverseerCollectionProcessor.processMessage(OverseerCollectionProcessor.java:385)
    at org.apache.solr.cloud.OverseerCollectionProcessor.run(OverseerCollectionProcessor.java:198)
    at java.lang.Thread.run(Thread.java:809)


From: Erick Erickson <erickerick...@gmail.com>
To: solr-user@lucene.apache.org
Date: 14/06/2015 08:47 PM
Subject: Re: Limitation on Collections Number

re: hybrid approach. Hmmm, _assuming_ that no single user has a really
huge number of documents, you might be able to use a single collection
(or a much smaller group of collections) by using custom routing. That
allows you to send all the docs for a particular user to a particular
shard. There are some obvious issues here with long-tail users: most of
your users have +/- X docs on average, and three of them have 100,000X
docs. There are probably some not-so-obvious gotchas too...

True, for user X you'd send sub-requests to all shards, but all but one
of them wouldn't find anything, so they would _probably_ be close to
no-ops. Conceptually, each shard then becomes N of your current
collections. Maybe there's a sweet spot performance-wise where you're
hosting some number of users per shard (or aggregate N docs per shard,
or...).
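To make the custom-routing idea concrete, here is a minimal sketch (the
user and document ids are made up for illustration) of building document
ids for SolrCloud's default compositeId router, which hashes the part of
the id before "!" so that all documents sharing a prefix land on the same
shard:

```python
# Sketch of compositeId routing keys: an id of the form "user!doc" is
# routed by hashing only the "user" prefix, co-locating one user's
# documents on a single shard. Ids here are illustrative, not real data.

def route_key(user_id: str, doc_id: str) -> str:
    """Build a compositeId-routed document id of the form 'user!doc'."""
    return f"{user_id}!{doc_id}"

docs = [
    {"id": route_key("owner_234409", "doc-1"), "text": "first doc"},
    {"id": route_key("owner_234409", "doc-2"), "text": "second doc"},
    {"id": route_key("owner_484011", "doc-1"), "text": "another user"},
]

# Documents with the same prefix hash to the same shard.
prefixes = sorted({d["id"].split("!", 1)[0] for d in docs})
print(prefixes)  # ['owner_234409', 'owner_484011']
```

At query time, passing a _route_=owner_234409! parameter should let Solr
send the request only to the shard holding that prefix, rather than
fanning out to every shard.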
Of course there's more maintenance here; in particular you have to manage
the size of the shards yourself, since the possibility of them getting
lopsided is higher, etc.

FWIW,
Erick

On Sun, Jun 14, 2015 at 9:48 AM, Shai Erera <ser...@gmail.com> wrote:

>> My answer remains the same - a large number of collections (cores) in a
>> single Solr instance is not one of the ways in which Solr is designed
>> to scale. To repeat, there are only two ways to scale Solr: number of
>> documents and number of nodes.
>
> Jack, I understand that, but I still feel you're missing the point. We
> didn't ask about scaling Solr at all - it's a question about indexing
> strategy when you need to index multiple disparate collections of
> documents -- one collection with a collectionID field, or a Solr
> collection per set of documents.
>
>> If you are _not_ in SolrCloud, then there's the "Lots of cores"
>> solution, see: http://wiki.apache.org/solr/LotsOfCores. Pay attention
>> to the warning at the top: NOT FOR SOLRCLOUD!
>
> Thanks Erick. We did read this a while ago. We are in SolrCloud mode
> because we want to keep a replica per collection, and SolrCloud makes
> that easy for us. However, we aren't in a typical SolrCloud setup, where
> we just need to index 1B documents and sharding + replication comes to
> our aid.
>
> If we were not in SolrCloud mode, I imagine we'd need to manage the
> replicas ourselves and also index each document to both replicas
> manually? That is, there is no way in _non_ SolrCloud mode to tell two
> cores that they are replicas of one another - correct?
>
>> A user may sign on and search her documents just a few times a day,
>> for a few minutes at a time.
>
> This is almost true -- you may visit your Dropbox once an hour (or it
> may be open in the background on your computer), but the server still
> receives documents (e.g. shares) from other users frequently, and needs
> to index them into your collection.
> Not saying this isn't a good fit, just mentioning that it's not only the
> user who can update his/her collection, and therefore one's collection
> may be constantly active. Eventually this needs to be benchmarked.
>
> Our benchmarks show that with 1000 such collections, we achieve
> significantly better response times from the multi-collection setup (one
> Solr collection per user) than from the single-collection setup (one
> Solr collection for *all* users, with a collectionID field added to all
> documents). Our next step is to try a hybrid mode where we store groups
> of users in the same Solr collection, but not all of them in the same
> Solr collection. So if Solr works well with 1000 collections, maybe we
> will index 10 users in one such collection ... we'll give it a try.
>
> I think SOLR-7191 may solve the general use case, though I haven't yet
> read through it thoroughly.
>
> Shai
>
> On Sun, Jun 14, 2015 at 6:50 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
>
>> Yes, there are some known problems when scaling to a large number of
>> collections, say 1000 or above. See
>> https://issues.apache.org/jira/browse/SOLR-7191
>>
>> On Sun, Jun 14, 2015 at 8:30 PM, Shai Erera <ser...@gmail.com> wrote:
>>
>> > Thanks Jack for your response. But I think Arnon's question was
>> > different.
>> >
>> > If you need to index 10,000 different collections of documents in
>> > Solr (say a collection denotes someone's Dropbox files), then you
>> > have two options: index all collections in one Solr collection and
>> > add a field like collectionID to each document and query, or index
>> > each user's private collection in a different Solr collection.
>> >
>> > The pros of the latter is that you don't need to add a collectionID
>> > filter to each query. Also, from a security/privacy standpoint (and
>> > search quality) - a user can only ever search what he has access to
>> > -- e.g.
>> > he cannot get a spelling correction for words he never saw in his
>> > documents, nor document suggestions (even though the 'context' in
>> > some of the Lucene suggesters allows one to do that too). From a
>> > quality standpoint, you don't mix different term statistics etc.
>> >
>> > So from a single node's point of view, you can either index 100M
>> > documents in one index (collection, shard, replica -- whatever -- a
>> > single Solr core), or in 10,000 such cores. From a node-capacity
>> > perspective the two are the same -- the same number of documents will
>> > be indexed overall, the same query workload served, etc.
>> >
>> > So the question is purely about Solr and its collections management
>> > -- is there anything in that process that can prevent one from
>> > managing thousands of collections on a single node, or within a
>> > single SolrCloud instance? If so, what is it -- is it the ZK
>> > watchers? Is there a thread per collection at work? Others?
>> >
>> > Shai
>> >
>> > On Sun, Jun 14, 2015 at 5:21 PM, Jack Krupansky <
>> > jack.krupan...@gmail.com> wrote:
>> >
>> > > As a general rule, there are only two ways that Solr scales to
>> > > large numbers: a large number of documents and a moderate number of
>> > > nodes (shards and replicas). All other parameters should be kept
>> > > relatively small, like dozens or low hundreds. Even shards and
>> > > replicas should probably be kept down to that same guidance of
>> > > dozens or low hundreds.
>> > >
>> > > Tens of millions of documents should be no problem. I recommend 100
>> > > million as the rough limit of documents per node. Of course it all
>> > > depends on your particular data model, data, hardware, and network,
>> > > so that number could be smaller or larger.
>> > >
>> > > The main guidance has always been to simply do a proof-of-concept
>> > > implementation to test for your particular data model and data
>> > > values.
>> > >
>> > > -- Jack Krupansky
>> > >
>> > > On Sun, Jun 14, 2015 at 7:31 AM, Arnon Yogev <arn...@il.ibm.com>
>> > > wrote:
>> > >
>> > > > We're running some tests on Solr and would like to have a deeper
>> > > > understanding of its limitations.
>> > > >
>> > > > Specifically, we have tens of millions of documents (say 50M) and
>> > > > are comparing several "#collections X #docs_per_collection"
>> > > > configurations. For example, we could have a single collection
>> > > > with 50M docs, or 5000 collections with 10K docs each.
>> > > > When trying to create the 5000 collections, we start getting
>> > > > frequent errors after 1000-1500 collections have been created.
>> > > > It feels like some limit has been reached.
>> > > > These tests are done on a single node, plus an additional node
>> > > > for replicas.
>> > > >
>> > > > Can someone elaborate on what could limit Solr to a high number
>> > > > of collections (if at all)?
>> > > > I.e., if we wanted to have 5K or 10K (or 100K) collections, is
>> > > > there anything in Solr that would prevent it? Where would it
>> > > > break?
>> > > >
>> > > > Thanks,
>> > > > Arnon
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
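As a footnote to the hybrid approach discussed above (groups of users per
Solr collection rather than one collection each), the bucketing could
look roughly like the following sketch. The collection-naming scheme and
the ownerID field are assumptions for illustration, not code from our
system:

```python
# Minimal sketch of the hybrid approach: hash each user into one of a
# fixed number of group collections, and confine queries to that user's
# documents with an ownerID filter query. Names are illustrative only.
import zlib

NUM_GROUPS = 100  # e.g. 1000 users -> ~10 users per group collection

def collection_for(user_id: str) -> str:
    """Deterministically pick the group collection holding this user."""
    bucket = zlib.crc32(user_id.encode("utf-8")) % NUM_GROUPS
    return f"owners_group_{bucket:03d}"

def query_params(user_id: str, q: str) -> dict:
    """Query parameters confining a search to one user's documents."""
    return {
        "collection": collection_for(user_id),
        "q": q,
        "fq": f"ownerID:{user_id}",  # filter to this user's docs only
    }

# The mapping is stable, so a user's documents are always indexed into,
# and searched in, the same group collection.
params = query_params("owner_234409", "text:hello")
print(params["collection"], params["fq"])
```

Long-tail users with very large document counts (Erick's point above)
would likely need to be special-cased into dedicated collections rather
than hashed into a shared group.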