In my use case, I have indexed a union catalog for several hundred libraries. Each library can have its own search service and can also add catalog data of its own that it does not want to share.
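
A minimal sketch of such a per-library view, using the filtered index aliases that the rest of this message describes. The index name `catalog`, the field `library_id`, the library identifiers, and the `library-...` alias naming are assumptions made up for illustration, not the actual setup. The code only builds the request body for the `_aliases` endpoint; it does not contact a cluster.

```python
import json


def filtered_alias_actions(index, field, library_ids):
    """Build a body for POST /_aliases: one term-filtered alias per library.

    All names passed in are hypothetical; adapt them to the real mapping.
    """
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": f"library-{lib}",
                    # Searches through this alias only see documents
                    # whose `field` matches this library's identifier.
                    "filter": {"term": {field: lib}},
                }
            }
            for lib in library_ids
        ]
    }


body = filtered_alias_actions("catalog", "library_id", ["lib001", "lib002"])
print(json.dumps(body, indent=2))
```

Each library would then search against its own alias name instead of the shared index, so no per-library index copy is needed.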

Elasticsearch offers far more flexibility and performance than Solr: the cluster is extended automatically when nodes are added (without a configuration change), shards are rebalanced automatically, and there are index aliases and shard over-allocation. An explanation is here:
http://elasticsearch-users.115913.n3.nabble.com/Over-allocation-of-shards-td3673978.html

With index aliases, I do not have to perform evil things like shard splitting. No index copy is required, and no full re-index.

That is, I can distribute a library catalog index over the machines and give each library an "index view" by assigning several index aliases (e.g. collection names or library identifiers), with term filters, to the catalog segments that library is interested in.

Index updates come from a single point, a primary database, plus data packages the libraries can upload. If the volume of input data exceeds the capacity, I can simply start a new node without touching the configuration.

Also, releasing new index versions is a snap with Elasticsearch. The index names carry timestamp information (e.g. ddMMyyHH), and it is easy to organize index versions like rolling windows, with the latest index being the current one to search. Old indices are dropped when they are no longer needed.

Jörg

On Mon, Oct 13, 2014 at 8:12 PM, Ian Rose <ianr...@fullstory.com> wrote:

> Hi -
>
> My team has used Solr in its single-node configuration (without
> SolrCloud) for a few years now. In our current product we are now looking
> at transitioning to SolrCloud, but before we made that leap I wanted to
> also take a good look at whether ElasticSearch would be a better fit for
> our needs. Although ES has some nice advantages (such as automatic shard
> rebalancing) I'm trying to figure out how to live in a world without shard
> splitting. In brief, our situation is as follows:
>
> - We use one index ("collection" in Solr) per customer.
> - The indexes are going to vary quite a bit in size, following something
> like a power-law distribution with many small indexes (let's guess < 250k
> documents), some medium sized indexes (up to a few million documents) and a
> few large indexes (hundreds of millions of documents).
> - So the number of shards required per index will vary greatly, and will
> be hard to predict accurately at creation time.
>
> How do people generally approach this kind of problem? Do you just make a
> best guess at the appropriate number of shards for each new index and then
> do a full re-index (with more shards) if the number of documents grows
> bigger than expected?
>
> Thanks!
> - Ian
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/ded96e32-e1f1-4d09-8356-7367c86b1166%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/ded96e32-e1f1-4d09-8356-7367c86b1166%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
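
Returning to the timestamped index versions in the reply above: a minimal sketch of how such a rolling window could be planned. The prefix `catalog`, the `keep` parameter, and both function names are hypothetical. One detail worth noting is that ddMMyyHH names do not sort chronologically as plain strings, so the sketch parses the timestamp before ordering. It assumes every name that starts with the prefix follows the pattern.

```python
from datetime import datetime

STAMP = "%d%m%y%H"  # Python equivalent of the ddMMyyHH pattern


def index_name(prefix, when):
    """Name a new index version after its creation time."""
    return f"{prefix}{when.strftime(STAMP)}"


def plan_rollover(prefix, names, keep=2):
    """Return (current, to_drop) for the index versions under `prefix`.

    `current` is the newest index; `to_drop` lists every index older
    than the `keep` most recent ones, oldest first.
    """
    stamped = sorted(
        (datetime.strptime(n[len(prefix):], STAMP), n)
        for n in names
        if n.startswith(prefix)
    )
    ordered = [n for _, n in stamped]
    if not ordered:
        return None, []
    return ordered[-1], ordered[: max(len(ordered) - keep, 0)]


names = ["catalog30091408", "catalog01101408", "catalog13101420"]
current, to_drop = plan_rollover("catalog", names, keep=2)
print(current, to_drop)
```

The index returned as `current` would be the one to search, and the names in `to_drop` are candidates for deletion once they are no longer needed.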