This is a super timely blog post from the Found crew - https://found.no/foundation/multi-tenancy/
On 17 March 2015 at 14:11, Mark Walkom <markwal...@gmail.com> wrote:

> There are practical limits, based on your dataset, node sizing, version, etc.
>
> You'd be better off segregating indices by a higher-level definition (e.g. customer number: 1-999, 1000-1999, etc.), using routing and then aliases on top. This way you conceptually get the same layout as a single index per customer, but it gives you the option to split larger customers out to their own index, without wasting resources on small-use customers.
>
> On 16 March 2015 at 19:11, Richard Blaylock <rich...@stormpath.com> wrote:
>
>> Hi all,
>>
>> We have a multi-tenant product and are leaning towards dynamically creating (and deleting) the indexes relevant to a tenant at runtime: as a tenant is created, so are that tenant's indexes; when a tenant is deleted, so are that tenant's indexes. Each index is specific to that tenant and could vary in size, but we do not expect any given index to ever be larger than a single disk (e.g. 80 GB).
>>
>> Due to index shard issues (shard counts are static, too many shards per index means a hit on performance (more map/reduce-style work to do), etc.), and due to the nature of our application, we are currently opting for a single-shard-per-index model: each index will have one and only one shard. We will have replicas for fault tolerance.
>>
>> On the surface, this appears to be an ideal design choice for multi-tenant applications: for any given index, one and only one shard will be 'hit' - no need to search across multiple shards, ever. It also reduces contention because indexes are always tenant-specific: as an index becomes larger, any slowness due to the large index *only* impacts the corresponding tenant (customer), whereas with the alternative - one index shared across tenants - one tenant's use/load could negatively impact other tenants' query performance.
>>
>> So for multi-tenancy, this single-shard-per-index model sounds ideal for our use case - the *only* issue is that the number of indexes increases dramatically as the number of tenants (customers) increases. Consider a system with 20,000 tenants, each having (potentially) hundreds, thousands, or even tens of thousands of indexes, resulting in millions of indexes overall. This is manageable from our product's perspective, but what impact would it have on Elasticsearch, if any?
>>
>> Are there practical limits? IIUC, there is a Lucene index (file set) per shard, so if there are hundreds of thousands or millions of Lucene indexes - other than disk space and file descriptor count per ES node - are there any other limits? Does performance degrade as the number of single-shard indexes increases? Or is there no problem at all?
>>
>> Thanks,
>> Richard
>>
>> --
>> You received this message because you are subscribed to the Google Groups "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/607f62c1-5854-43e0-9d25-3f748aca44a4%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.
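Mark's suggested layout - a shared index per customer block, with a filtered alias plus a fixed routing value standing in for each customer - can be sketched with the Elasticsearch REST API roughly as follows. The index, alias, and field names (`customers_0001`, `customer_42`, `customer_id`) and the shard count are hypothetical, not from the thread:

```
# One shared index for customers 1-999 (shard count is illustrative).
PUT /customers_0001
{
  "settings": { "number_of_shards": 5, "number_of_replicas": 1 }
}

# The alias "customer_42" then behaves like a dedicated index for that
# customer: requests through it are routed to a single shard, and
# searches are filtered to that customer's documents.
POST /_aliases
{
  "actions": [
    { "add": {
        "index": "customers_0001",
        "alias": "customer_42",
        "routing": "42",
        "filter": { "term": { "customer_id": 42 } }
    } }
  ]
}
```

If customer 42 later outgrows the shared index, the same alias can be repointed at a dedicated index after reindexing, with no change to the application's queries - which is the flexibility Mark describes.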
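For comparison, Richard's single-shard-per-index model corresponds to index settings like the following (the index name is hypothetical). With one primary shard, every query against the index hits exactly one shard, but each such index still carries full per-Lucene-index overhead (file descriptors, memory, cluster-state entries) - which is exactly the scaling concern he raises:

```
PUT /tenant_42_contacts
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}
```

Note that in the Elasticsearch versions current at the time of this thread (1.x), `number_of_shards` is fixed at index creation, which is part of why the shared-index alternative relies on routing and aliases rather than resharding.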