It sounds like we're going to need to test the upper bound on the number of indexes we can support (with no data in them) to see how many the cluster can carry. We may need to re-evaluate our plan of one index per application. We might be better off creating a statically sized set of indexes and then consistently hashing our applications onto those indexes. Thank you for your help!
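Roughly what I have in mind, as a Python sketch. The bucket count and index naming scheme are placeholders, nothing we've settled on:

    import hashlib

    NUM_INDEXES = 64  # placeholder: fixed when the cluster is built

    def index_for_app(app_id):
        """Map an application ID onto one of a fixed set of indexes."""
        # md5 gives a stable hash; Python's built-in hash() can vary
        # across processes and versions, so it's unsafe for routing.
        digest = hashlib.md5(app_id.encode("utf-8")).hexdigest()
        return "apps-%03d" % (int(digest, 16) % NUM_INDEXES)

    print(index_for_app("app-1234"))  # always the same index for this app

With a truly static index set, plain modulo hashing like this is enough; a real consistent-hash ring only earns its keep if we ever need to grow the set without remapping every application.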
On Fri, Sep 26, 2014 at 5:44 PM, joergpra...@gmail.com <joergpra...@gmail.com> wrote:

> If you consider tens of thousands of indices on tens of thousands of nodes, and the master node is the only node that can write to the cluster state, it will have a lot of work to do to keep up with all cluster state updates.
>
> When the rate of changes to the cluster state increases, the master node will be challenged to propagate the state changes to all other nodes reliably and fast enough. There is no "distributed tree" yet; for example, there are no special "forwarder nodes" that communicate the cluster state to a partitioned set of nodes.
>
> See https://github.com/elasticsearch/elasticsearch/issues/6186
>
> The cluster state is a big compressed JSON structure which must also fit into the heap memory of master-eligible nodes. ES also uses privileged network channels for cluster state communication, so it takes precedence over ordinary index/search messaging. But all these precautions may not be enough at some point. You can observe that point by retrieving the growing cluster state through the cluster state API and measuring the size and duration of the request.
>
> On the other hand, you can have a calm cluster state and many thousands of indices when the type mappings are constant, no field updates occur, and no nodes connect/disconnect. It all depends on how you have to operate the cluster. One important thing is to allocate enough resources to master-eligible, data-less nodes so they are not hindered by extra search/index load.
>
> N.B. 20 nodes is not a big cluster. There are ES clusters of hundreds and thousands of nodes. From my understanding, communication between the master and 20 nodes is not a serious problem. This becomes an issue at ~500-1000 nodes.
>
> Jörg
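The measurement Jörg suggests is easy to script. A minimal Python 3 sketch, assuming a node reachable at localhost:9200 (the URL is a placeholder):

    import json
    import time
    import urllib.request

    # Placeholder: point this at any node in the cluster.
    URL = "http://localhost:9200/_cluster/state"

    start = time.time()
    with urllib.request.urlopen(URL) as resp:
        body = resp.read()
    elapsed = time.time() - start

    state = json.loads(body)
    indices = state.get("metadata", {}).get("indices", {})
    # Both size and fetch time grow with the cluster state; watch the trend.
    print("%.1f KB, %.2f s, %d indices"
          % (len(body) / 1024.0, elapsed, len(indices)))

Running this periodically while adding empty indexes should show when the cluster state starts getting unwieldy.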
On Sat, Sep 27, 2014 at 1:12 AM, Todd Nine <tn...@apigee.com> wrote:

>> Hi Jörg,
>> We're storing each application in its own index so we can manage it independently of the others. There's no set load or usage pattern for our applications: some will be very small, a few hundred documents; others will be quite large, in the billions. We have no way of knowing what the usage profile will be. Rather, we're initially thinking that expansion will occur through a combination of additional indexes and aliases referencing those indexes. This lets us automate the management of the aliases and indexes, and in turn scale them to the needs of the application without over-allocating unused capacity. For instance, for write-heavy applications we can allocate more shards (via alias); for read-heavy applications we can allocate more replicas.
>>
>> We're not running our cluster on a single node. Our cluster is small to begin with: 6 nodes in our current POC. Ultimately I expect each cluster we stand up to grow to 20 or so nodes. We'll expand as necessary to support the number of shards and replicas and keep our performance up. I'm not particularly worried about our ability to scale horizontally with our hardware.
>>
>> Rather, I'm concerned with how far we can scale the number of indexes, and how that relates to the number of machines. When we keep adding hardware, does that raise the upper bound on the number of indexes we can have? Not the physical shards and replicas, but the routing information for each shard's primary and the locations of its replicas.
>>
>> I've done distributed data storage for many years, and none of the ES documentation makes it clear whether this becomes an issue operationally. I'm leery of just assuming it will "just work". When implementing something like this, you either build a distributed tree for your metadata to get the partitioning you need to scale indefinitely, or every node must store every shard's placement information. How does it work in ES?
>>
>> Thanks,
>> Todd
>>
>> On Friday, September 26, 2014 4:29:53 PM UTC-6, Jörg Prante wrote:
>>>
>>> Why do you want to create a huge number of indexes on just a single node?
>>>
>>> There are smarter methods to scale. Use over-allocation of shards. This is explained by kimchy in this thread:
>>> http://elasticsearch-users.115913.n3.nabble.com/Over-allocation-of-shards-td3673978.html
>>>
>>> TL;DR: you can create many thousands of aliases on a single index (or a few indices) with just a few shards. There is no limit defined by ES; when your configuration or hardware capacity is exceeded, you will see the node getting sluggish.
>>>
>>> Jörg
>>>
>>> On Fri, Sep 26, 2014 at 11:23 PM, Todd Nine <tn...@apigee.com> wrote:
>>>
>>>> Hey guys. We're building a multi-tenant application where users create applications within our single server. For our current ES scheme, we're building an index per application. Are there any stress tests or documentation on the upper bound of the number of indexes a cluster can handle? From my current understanding of metadata and routing, every node caches the metadata of all the indexes and shards for routing. At some point this will obviously overwhelm the node. Is my current understanding correct, or is this information partitioned across the cluster as well?
>>>>
>>>> Thanks,
>>>> Todd
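For reference, the over-allocation approach Jörg points at above comes down to filtered, routed aliases on a shared index. A minimal Python 3 sketch against the standard _aliases endpoint; the node address, shared index name, and app_id tenant field are placeholders:

    import json
    import urllib.request

    ES = "http://localhost:9200"  # placeholder node address

    def add_tenant_alias(shared_index, app_id):
        """Give one tenant a filtered, routed alias on a shared index."""
        actions = {"actions": [{"add": {
            "index": shared_index,
            "alias": app_id,
            # Searches through the alias see only this tenant's documents.
            "filter": {"term": {"app_id": app_id}},
            # All of this tenant's documents are routed to a single shard.
            "routing": app_id,
        }}]}
        req = urllib.request.Request(
            ES + "/_aliases",
            data=json.dumps(actions).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print(resp.read().decode("utf-8"))

    add_tenant_alias("apps-000", "app-1234")  # placeholder names

Clients then index and search against the alias as if it were a dedicated index; the one catch is that every document has to carry the app_id field the filter relies on.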