First off, thanks everyone for the very useful replies thus far.

Shawn - thanks for the list of items to check. #1 and #2 should be fine for us, and I'll check our ulimit for #3.
To add a bit of clarification, we are indeed using SolrCloud. Our current setup is to create a new collection for each customer. For now we allow SolrCloud to decide for itself where to locate the initial shard(s), but in time we expect to refine this so that our system automatically chooses the least loaded nodes according to some metric(s).

> Having more than one business entity controlling the configuration of a
> single (Solr) server is a recipe for disaster. Solr works well if there is
> an architect for the system.

Jack, can you explain a bit what you mean here? It looks like Toke caught your meaning but I'm afraid it missed me. What do you mean by "business entity"? Is your concern that with automatic creation of collections they will be distributed willy-nilly across the cluster, leading to uneven load across nodes?

If it is relevant, the schema and solrconfig are controlled entirely by me and are the same for all collections. Thus theoretically we could just use one single collection for all of our customers (adding a 'customer:<whatever>' type fq to all queries), but since we never need to query across customers it seemed more performant (as well as safer - less chance of accidentally leaking data across customers) to use separate collections.

> Better to give each tenant a separate Solr instance that you spin up and
> spin down based on demand.

Regarding this, if by tenant you mean "customer", this is not viable for us from a cost perspective. As I mentioned initially, many of our customers are very small, so dedicating an entire machine to each of them would not be economical (or efficient). Or perhaps I am not understanding what your definition of "tenant" is?

Cheers,
Ian

On Tue, Mar 24, 2015 at 4:51 PM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> Jack Krupansky [jack.krupan...@gmail.com] wrote:
> > I'm sure that I am quite unqualified to describe his hypothetical setup. I
> > mean, he's the one using the term multi-tenancy, so it's for him to be
> > clear.
>
> It was my understanding that Ian used them interchangeably, but of course
> Ian is the only one that knows.
>
> > For me, it's a question of who has control over the config and schema and
> > collection creation. Having more than one business entity controlling the
> > configuration of a single (Solr) server is a recipe for disaster.
>
> Thank you. Now your post makes a lot more sense. I will not argue against
> that.
>
> - Toke Eskildsen
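
For reference, here is a minimal sketch of the single-collection alternative mentioned in the reply above, where every query carries a per-customer fq. The field name `customer` comes from that example; the function names and the phrase-quoting of the customer value are illustrative assumptions only, not something taken from a real deployment.

```python
from urllib.parse import urlencode


def customer_filter(customer_id: str) -> str:
    """Build the fq clause that restricts results to one customer.

    The value is quoted as a phrase, with backslashes and double quotes
    escaped, so a customer id cannot break out of the filter clause.
    """
    escaped = customer_id.replace("\\", "\\\\").replace('"', '\\"')
    return f'customer:"{escaped}"'


def build_select_params(user_query: str, customer_id: str) -> str:
    """Return a URL-encoded query string for a Solr /select request.

    q carries the user's query; fq adds the mandatory per-customer filter.
    """
    params = [
        ("q", user_query),
        ("fq", customer_filter(customer_id)),
        ("wt", "json"),
    ]
    return urlencode(params)
```

The resulting string would be appended to the shared collection's `/select` URL. The risk noted above still applies: any code path that forgets to add the fq returns documents from every customer, which separate collections rule out by construction.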