Re: Shard Initialization slow down

2014-05-13 Thread Paul
Thanks Jörg. We've heard of others pre-creating indices; we had been treating 
it as a workaround rather than a regular step, but what you say makes it 
sound like something we should adopt.


On Tuesday, May 13, 2014 12:13:10 PM UTC+1, Jörg Prante wrote:
>
> You should create indexes before bulk indexing. First, bulk indexing works 
> much better if all indices and their mappings are already present: the 
> operations run faster and without conflicts, and cluster state updates are 
> less frequent, which reduces noise and hiccups. Second, setting the index 
> refresh interval to -1 and the replica level to 0 while in bulk indexing 
> mode helps performance a lot.
>
> If you create 1000+ shards per node, you seem to exceed the limit of your 
> system. Do not expect admin operations like index creation to work in O(1) 
> time; they are O(n/c), with n = the number of affected shards and c = the 
> threadpool size for the operation (the total node count also matters, but I 
> neglect it here). So yes, it is expected that index creation operations 
> take longer as they reach the limit of your nodes, though there can be 
> plenty of reasons for it (an increasing shard count is just one of them). 
> And yes, it is expected that you see the 30s cluster action timeout in 
> these cases.
>
> There is no strictly predictable resource limit for a node; all this 
> depends heavily on factors outside of Elasticsearch (JVM, CPU, memory, 
> disk I/O, your indexing/searching workload), so it is up to you to 
> calibrate your node capacity. After adding nodes, you will observe that ES 
> scales well and can handle more shards.
>
> Jörg
>
>
> On Tue, May 13, 2014 at 11:59 AM, Paul wrote:
>
>> We are seeing a slow down in shard initialization speed as the number of 
>> shards/indices grows in our cluster.
>>
>> With 0-100's of indices/shards existing in the cluster a new bulk 
>> creation of indices up to the 100's at a time is fine, we see them pass 
>> through the states and get a green cluster in a reasonable amount of time.
>>
>> As the total cluster size grows to 1000+ indices (3000+ shards) we begin 
>> to notice that the first rounds of initialization take longer to process, 
>> it seems to speed up after the first few batches, but this slow down leads 
>> to "failed to process cluster event (create-index [index_1112], cause 
>> [auto(bulk api)]) within 30s" type messages in the Master logs - the 
>> indices are eventually created.
>>
>>
>> Has anyone else experienced this? (did you find the cause / way to fix?)
>>
>> Is this somewhat expected behaviour? - are we approaching something 
>> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>>  
>
>



Re: Shard Initialization slow down

2014-05-13 Thread joergpra...@gmail.com
You should create indexes before bulk indexing. First, bulk indexing works
much better if all indices and their mappings are already present: the
operations run faster and without conflicts, and cluster state updates are
less frequent, which reduces noise and hiccups. Second, setting the index
refresh interval to -1 and the replica level to 0 while in bulk indexing
mode helps performance a lot.
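
For reference, a minimal sketch of that preparation against the plain REST
API (the index name, mapping, and use of Python's requests library are
assumptions for illustration, not something from this thread):

    import json
    import requests

    ES = "http://localhost:9200"   # assumed address of a node
    INDEX = "index_0001"           # hypothetical index name

    # Pre-create the index with its mapping, zero replicas and refreshing
    # disabled, instead of letting the bulk API auto-create it mid-indexing.
    body = {
        "settings": {"number_of_replicas": 0, "refresh_interval": "-1"},
        "mappings": {"doc": {"properties": {"message": {"type": "string"}}}},
    }
    requests.put("%s/%s" % (ES, INDEX), data=json.dumps(body)).raise_for_status()

    # ... run the bulk indexing against the pre-created index ...

    # Restore normal refresh and replica settings once the bulk load is done.
    restore = {"index": {"refresh_interval": "1s", "number_of_replicas": 1}}
    requests.put("%s/%s/_settings" % (ES, INDEX),
                 data=json.dumps(restore)).raise_for_status()

Creating the indices up front also means the create-index cluster events
happen at a pace you control, rather than being triggered by auto-create in
the middle of a bulk request (the "cause [auto(bulk api)]" in the timeout
message quoted below).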

If you create 1000+ shards per node, you seem to exceed the limit of your
system. Do not expect admin operations like index creation to work in O(1)
time; they are O(n/c), with n = the number of affected shards and c = the
threadpool size for the operation (the total node count also matters, but I
neglect it here). So yes, it is expected that index creation operations
take longer as they reach the limit of your nodes, though there can be
plenty of reasons for it (an increasing shard count is just one of them).
And yes, it is expected that you see the 30s cluster action timeout in
these cases.
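
As a rough worked illustration (the shard counts and threadpool size are
assumed for the sake of the arithmetic, not taken from this cluster):
creating 100 indices with 6 shards each (3 primaries plus their replicas)
touches n = 100 x 6 = 600 shards, and with c = 4 threads that is
n/c = 150 sequential units of shard-creation work for one batch. Batches
that complete quickly on an empty cluster can therefore plausibly drift
past the 30s cluster-event timeout as the totals grow.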

There is no strictly predictable resource limit for a node; all this
depends heavily on factors outside of Elasticsearch (JVM, CPU, memory,
disk I/O, your indexing/searching workload), so it is up to you to
calibrate your node capacity. After adding nodes, you will observe that ES
scales well and can handle more shards.

Jörg


On Tue, May 13, 2014 at 11:59 AM, Paul  wrote:

> We are seeing a slow down in shard initialization speed as the number of
> shards/indices grows in our cluster.
>
> With 0-100's of indices/shards existing in the cluster a new bulk creation
> of indices up to the 100's at a time is fine, we see them pass through the
> states and get a green cluster in a reasonable amount of time.
>
> As the total cluster size grows to 1000+ indices (3000+ shards) we begin
> to notice that the first rounds of initialization take longer to process,
> it seems to speed up after the first few batches, but this slow down leads
> to "failed to process cluster event (create-index [index_1112], cause
> [auto(bulk api)]) within 30s" type messages in the Master logs - the
> indices are eventually created.
>
>
> Has anyone else experienced this? (did you find the cause / way to fix?)
>
> Is this somewhat expected behaviour? - are we approaching something
> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>



Re: Shard Initialization slow down

2014-05-13 Thread Paul
This looks very interesting, thanks.


On Tuesday, May 13, 2014 11:38:27 AM UTC+1, Mark Harwood wrote:
>
> This API should give an indication on any backlog in processing the 
> cluster state: 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-pending.html
>
>
>
> On Tuesday, May 13, 2014 11:29:20 AM UTC+1, Paul wrote:
>>
>> Ok, do you know if there are clear indicators when limits are being 
>> reached?
>>
>> We don't see errors in the logs (apart from the 30s timeout) but if there 
>> are system or ES provided metrics that we can track to know when we need to 
>> scale it would be really useful.
>>
>>
>> Thanks,
>>
>> Paul.  
>>
>>
>>
>> On Tuesday, May 13, 2014 11:24:06 AM UTC+1, Mark Walkom wrote:
>>>
>>> Empty or not, there is still metadata that ES needs to maintain in the 
>>> cluster state. So the more indexes you have open the bigger that is and the 
>>> more resources required to track it.
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: ma...@campaignmonitor.com
>>> web: www.campaignmonitor.com
>>>
>>>
>>> On 13 May 2014 20:16, Paul  wrote:
>>>
 In testing and replicating the issue, this slow down has been seen 
 occurring with empty indices. 

 The running cluster is at present ~100 GB across 2,200 Indices with a 
 total of 13,500 shards and ~430,000,000 documents.

 We have 7GB RAM and 5GB heap on the data nodes - haven't looked overly 
 carefully but don't think the heap is maxing out on any of the nodes when 
 this occurs.


 Thanks,

 Paul.


 On Tuesday, May 13, 2014 11:02:32 AM UTC+1, Mark Walkom wrote:
>
> Sounds like the inevitable "add more nodes" situation.
>
> How much RAM on each node, how big is your data set?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>  
>
> On 13 May 2014 19:59, Paul  wrote:
>
>> We are seeing a slow down in shard initialization speed as the number 
>> of shards/indices grows in our cluster.
>>
>> With 0-100's of indices/shards existing in the cluster a new bulk 
>> creation of indices up to the 100's at a time is fine, we see them pass 
>> through the states and get a green cluster in a reasonable amount of 
>> time.
>>
>> As the total cluster size grows to 1000+ indices (3000+ shards) we 
>> begin to notice that the first rounds of initialization take longer to 
>> process, it seems to speed up after the first few batches, but this slow 
>> down leads to "failed to process cluster event (create-index 
>> [index_1112], cause [auto(bulk api)]) within 30s" type messages in the 
>> Master logs - the indices are eventually created.
>>
>>
>> Has anyone else experienced this? (did you find the cause / way to 
>> fix?)
>>
>> Is this somewhat expected behaviour? - are we approaching something 
>> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>>  

>>>
>>>



Re: Shard Initialization slow down

2014-05-13 Thread Paul
Thanks Mark, we'll have a look at the available metrics.
 


On Tuesday, May 13, 2014 11:34:51 AM UTC+1, Mark Walkom wrote:
>
> You will want to obtain Marvel (
> http://www.elasticsearch.org/guide/en/marvel/current/) and then wait till 
> you have a history and start digging.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 13 May 2014 20:29, Paul wrote:
>
>> Ok, do you know if there are clear indicators when limits are being 
>> reached?
>>
>> We don't see errors in the logs (apart from the 30s timeout) but if there 
>> are system or ES provided metrics that we can track to know when we need to 
>> scale it would be really useful.
>>
>>
>> Thanks,
>>
>> Paul.  
>>
>>
>>
>> On Tuesday, May 13, 2014 11:24:06 AM UTC+1, Mark Walkom wrote:
>>>
>>> Empty or not, there is still metadata that ES needs to maintain in the 
>>> cluster state. So the more indexes you have open the bigger that is and the 
>>> more resources required to track it.
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: ma...@campaignmonitor.com
>>> web: www.campaignmonitor.com
>>>
>>>
>>> On 13 May 2014 20:16, Paul  wrote:
>>>
 In testing and replicating the issue, this slow down has been seen 
 occurring with empty indices. 

 The running cluster is at present ~100 GB across 2,200 Indices with a 
 total of 13,500 shards and ~430,000,000 documents.

 We have 7GB RAM and 5GB heap on the data nodes - haven't looked overly 
 carefully but don't think the heap is maxing out on any of the nodes when 
 this occurs.


 Thanks,

 Paul.


 On Tuesday, May 13, 2014 11:02:32 AM UTC+1, Mark Walkom wrote:
>
> Sounds like the inevitable "add more nodes" situation.
>
> How much RAM on each node, how big is your data set?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>  
>
> On 13 May 2014 19:59, Paul  wrote:
>
>> We are seeing a slow down in shard initialization speed as the number 
>> of shards/indices grows in our cluster.
>>
>> With 0-100's of indices/shards existing in the cluster a new bulk 
>> creation of indices up to the 100's at a time is fine, we see them pass 
>> through the states and get a green cluster in a reasonable amount of 
>> time.
>>
>> As the total cluster size grows to 1000+ indices (3000+ shards) we 
>> begin to notice that the first rounds of initialization take longer to 
>> process, it seems to speed up after the first few batches, but this slow 
>> down leads to "failed to process cluster event (create-index 
>> [index_1112], cause [auto(bulk api)]) within 30s" type messages in the 
>> Master logs - the indices are eventually created.
>>
>>
>> Has anyone else experienced this? (did you find the cause / way to 
>> fix?)
>>
>> Is this somewhat expected behaviour? - are we approaching something 
>> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>>  

Re: Shard Initialization slow down

2014-05-13 Thread Mark Harwood
This API should give an indication on any backlog in processing the cluster 
state: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-pending.html
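
For example, a small sketch that polls that endpoint while indices are being
created (the host, interval, and use of Python's requests library are
assumptions; the fields are those documented for the pending-tasks API):

    import time
    import requests

    ES = "http://localhost:9200"   # assumed address of a node

    # Poll the cluster pending-tasks queue; a queue that stays deep, or tasks
    # that sit in it for tens of seconds, is the backlog that precedes the
    # "failed to process cluster event ... within 30s" log messages.
    for _ in range(60):
        tasks = requests.get(ES + "/_cluster/pending_tasks").json().get("tasks", [])
        if tasks:
            oldest_ms = max(t.get("time_in_queue_millis", 0) for t in tasks)
            print("%d pending tasks, oldest queued %d ms" % (len(tasks), oldest_ms))
        else:
            print("no pending cluster tasks")
        time.sleep(5)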



On Tuesday, May 13, 2014 11:29:20 AM UTC+1, Paul wrote:
>
> Ok, do you know if there are clear indicators when limits are being 
> reached?
>
> We don't see errors in the logs (apart from the 30s timeout) but if there 
> are system or ES provided metrics that we can track to know when we need to 
> scale it would be really useful.
>
>
> Thanks,
>
> Paul.  
>
>
>
> On Tuesday, May 13, 2014 11:24:06 AM UTC+1, Mark Walkom wrote:
>>
>> Empty or not, there is still metadata that ES needs to maintain in the 
>> cluster state. So the more indexes you have open the bigger that is and the 
>> more resources required to track it.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 13 May 2014 20:16, Paul  wrote:
>>
>>> In testing and replicating the issue, this slow down has been seen 
>>> occurring with empty indices. 
>>>
>>> The running cluster is at present ~100 GB across 2,200 Indices with a 
>>> total of 13,500 shards and ~430,000,000 documents.
>>>
>>> We have 7GB RAM and 5GB heap on the data nodes - haven't looked overly 
>>> carefully but don't think the heap is maxing out on any of the nodes when 
>>> this occurs.
>>>
>>>
>>> Thanks,
>>>
>>> Paul.
>>>
>>>
>>> On Tuesday, May 13, 2014 11:02:32 AM UTC+1, Mark Walkom wrote:

 Sounds like the inevitable "add more nodes" situation.

 How much RAM on each node, how big is your data set?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com
  

 On 13 May 2014 19:59, Paul  wrote:

> We are seeing a slow down in shard initialization speed as the number 
> of shards/indices grows in our cluster.
>
> With 0-100's of indices/shards existing in the cluster a new bulk 
> creation of indices up to the 100's at a time is fine, we see them pass 
> through the states and get a green cluster in a reasonable amount of time.
>
> As the total cluster size grows to 1000+ indices (3000+ shards) we 
> begin to notice that the first rounds of initialization take longer to 
> process, it seems to speed up after the first few batches, but this slow 
> down leads to "failed to process cluster event (create-index 
> [index_1112], cause [auto(bulk api)]) within 30s" type messages in the 
> Master logs - the indices are eventually created.
>
>
> Has anyone else experienced this? (did you find the cause / way to 
> fix?)
>
> Is this somewhat expected behaviour? - are we approaching something 
> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>  
>>>
>>
>>



Re: Shard Initialization slow down

2014-05-13 Thread Mark Walkom
You will want to obtain Marvel (
http://www.elasticsearch.org/guide/en/marvel/current/) and then wait till
you have a history and start digging.
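
If you want raw numbers before Marvel has accumulated any history, a rough
sketch along these lines (standard cluster health fields; the host, polling
interval, and use of Python's requests library are assumptions) shows how
quickly new shards leave the initializing/unassigned states:

    import time
    import requests

    ES = "http://localhost:9200"   # assumed address of a node

    # Watch shard states while a batch of indices is being created; a backlog
    # of initializing/unassigned shards that drains slowly mirrors the
    # slow-down described in this thread.
    while True:
        health = requests.get(ES + "/_cluster/health").json()
        print("status=%s initializing=%d unassigned=%d relocating=%d active=%d" % (
            health["status"],
            health["initializing_shards"],
            health["unassigned_shards"],
            health["relocating_shards"],
            health["active_shards"],
        ))
        if health["status"] == "green" and health["initializing_shards"] == 0:
            break
        time.sleep(10)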

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 13 May 2014 20:29, Paul  wrote:

> Ok, do you know if there are clear indicators when limits are being
> reached?
>
> We don't see errors in the logs (apart from the 30s timeout) but if there
> are system or ES provided metrics that we can track to know when we need to
> scale it would be really useful.
>
>
> Thanks,
>
> Paul.
>
>
>
> On Tuesday, May 13, 2014 11:24:06 AM UTC+1, Mark Walkom wrote:
>>
>> Empty or not, there is still metadata that ES needs to maintain in the
>> cluster state. So the more indexes you have open the bigger that is and the
>> more resources required to track it.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 13 May 2014 20:16, Paul  wrote:
>>
>>> In testing and replicating the issue, this slow down has been seen
>>> occurring with empty indices.
>>>
>>> The running cluster is at present ~100 GB across 2,200 Indices with a
>>> total of 13,500 shards and ~430,000,000 documents.
>>>
>>> We have 7GB RAM and 5GB heap on the data nodes - haven't looked overly
>>> carefully but don't think the heap is maxing out on any of the nodes when
>>> this occurs.
>>>
>>>
>>> Thanks,
>>>
>>> Paul.
>>>
>>>
>>> On Tuesday, May 13, 2014 11:02:32 AM UTC+1, Mark Walkom wrote:

 Sounds like the inevitable "add more nodes" situation.

 How much RAM on each node, how big is your data set?

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com
 web: www.campaignmonitor.com


 On 13 May 2014 19:59, Paul  wrote:

> We are seeing a slow down in shard initialization speed as the number
> of shards/indices grows in our cluster.
>
> With 0-100's of indices/shards existing in the cluster a new bulk
> creation of indices up to the 100's at a time is fine, we see them pass
> through the states and get a green cluster in a reasonable amount of time.
>
> As the total cluster size grows to 1000+ indices (3000+ shards) we
> begin to notice that the first rounds of initialization take longer to
> process, it seems to speed up after the first few batches, but this slow
> down leads to "failed to process cluster event (create-index
> [index_1112], cause [auto(bulk api)]) within 30s" type messages in the
> Master logs - the indices are eventually created.
>
>
> Has anyone else experienced this? (did you find the cause / way to
> fix?)
>
> Is this somewhat expected behaviour? - are we approaching something
> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>

Re: Shard Initialization slow down

2014-05-13 Thread Paul
Ok, do you know if there are clear indicators when limits are being reached?

We don't see errors in the logs (apart from the 30s timeout) but if there 
are system or ES provided metrics that we can track to know when we need to 
scale it would be really useful.


Thanks,

Paul.  



On Tuesday, May 13, 2014 11:24:06 AM UTC+1, Mark Walkom wrote:
>
> Empty or not, there is still metadata that ES needs to maintain in the 
> cluster state. So the more indexes you have open the bigger that is and the 
> more resources required to track it.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 13 May 2014 20:16, Paul wrote:
>
>> In testing and replicating the issue, this slow down has been seen 
>> occurring with empty indices. 
>>
>> The running cluster is at present ~100 GB across 2,200 Indices with a 
>> total of 13,500 shards and ~430,000,000 documents.
>>
>> We have 7GB RAM and 5GB heap on the data nodes - haven't looked overly 
>> carefully but don't think the heap is maxing out on any of the nodes when 
>> this occurs.
>>
>>
>> Thanks,
>>
>> Paul.
>>
>>
>> On Tuesday, May 13, 2014 11:02:32 AM UTC+1, Mark Walkom wrote:
>>>
>>> Sounds like the inevitable "add more nodes" situation.
>>>
>>> How much RAM on each node, how big is your data set?
>>>
>>> Regards,
>>> Mark Walkom
>>>
>>> Infrastructure Engineer
>>> Campaign Monitor
>>> email: ma...@campaignmonitor.com
>>> web: www.campaignmonitor.com
>>>  
>>>
>>> On 13 May 2014 19:59, Paul  wrote:
>>>
 We are seeing a slow down in shard initialization speed as the number 
 of shards/indices grows in our cluster.

 With 0-100's of indices/shards existing in the cluster a new bulk 
 creation of indices up to the 100's at a time is fine, we see them pass 
 through the states and get a green cluster in a reasonable amount of time.

 As the total cluster size grows to 1000+ indices (3000+ shards) we 
 begin to notice that the first rounds of initialization take longer to 
 process, it seems to speed up after the first few batches, but this slow 
 down leads to "failed to process cluster event (create-index 
 [index_1112], cause [auto(bulk api)]) within 30s" type messages in the 
 Master logs - the indices are eventually created.


 Has anyone else experienced this? (did you find the cause / way to fix?)

 Is this somewhat expected behaviour? - are we approaching something 
 incorrectly? (there are 3 data nodes involved, with 3 shards per index)
  
>>
>
>



Re: Shard Initialization slow down

2014-05-13 Thread Mark Walkom
Empty or not, there is still metadata that ES needs to maintain in the
cluster state. So the more indexes you have open the bigger that is and the
more resources required to track it.
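
One way to get a feel for that overhead is to look at the cluster state
itself; a rough sketch (the host and use of Python's requests library are
assumptions, and the byte count is just the size of the serialized JSON
response):

    import requests

    ES = "http://localhost:9200"   # assumed address of a node

    # Fetch the full cluster state and report how many indices it tracks and
    # roughly how large the serialized state is; both grow with every open
    # index and its mappings.
    resp = requests.get(ES + "/_cluster/state")
    indices = resp.json().get("metadata", {}).get("indices", {})
    print("cluster state tracks %d indices, ~%d KB serialized" % (
        len(indices), len(resp.content) // 1024))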

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 13 May 2014 20:16, Paul  wrote:

> In testing and replicating the issue, this slow down has been seen
> occurring with empty indices.
>
> The running cluster is at present ~100 GB across 2,200 Indices with a
> total of 13,500 shards and ~430,000,000 documents.
>
> We have 7GB RAM and 5GB heap on the data nodes - haven't looked overly
> carefully but don't think the heap is maxing out on any of the nodes when
> this occurs.
>
>
> Thanks,
>
> Paul.
>
>
> On Tuesday, May 13, 2014 11:02:32 AM UTC+1, Mark Walkom wrote:
>>
>> Sounds like the inevitable "add more nodes" situation.
>>
>> How much RAM on each node, how big is your data set?
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 13 May 2014 19:59, Paul  wrote:
>>
>>> We are seeing a slow down in shard initialization speed as the number of
>>> shards/indices grows in our cluster.
>>>
>>> With 0-100's of indices/shards existing in the cluster a new bulk
>>> creation of indices up to the 100's at a time is fine, we see them pass
>>> through the states and get a green cluster in a reasonable amount of time.
>>>
>>> As the total cluster size grows to 1000+ indices (3000+ shards) we begin
>>> to notice that the first rounds of initialization take longer to process,
>>> it seems to speed up after the first few batches, but this slow down leads
>>> to "failed to process cluster event (create-index [index_1112], cause
>>> [auto(bulk api)]) within 30s" type messages in the Master logs - the
>>> indices are eventually created.
>>>
>>>
>>> Has anyone else experienced this? (did you find the cause / way to fix?)
>>>
>>> Is this somewhat expected behaviour? - are we approaching something
>>> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>>>
>



Re: Shard Initialization slow down

2014-05-13 Thread Paul
In testing and replicating the issue, this slow down has been seen 
occurring with empty indices. 

The running cluster is at present ~100 GB across 2,200 Indices with a total 
of 13,500 shards and ~430,000,000 documents.

We have 7GB RAM and 5GB heap on the data nodes - haven't looked overly 
carefully but don't think the heap is maxing out on any of the nodes when 
this occurs.
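
A quick way to check that across all nodes (the host and use of Python's
requests library are assumptions; the fields come from the standard
nodes-stats API) is something like:

    import requests

    ES = "http://localhost:9200"   # assumed address of a node

    # Report JVM heap usage per node; values that sit near the maximum while
    # the slow-down occurs would point at heap pressure rather than
    # cluster-state processing alone.
    nodes = requests.get(ES + "/_nodes/stats/jvm").json()["nodes"]
    for node_id, stats in nodes.items():
        mem = stats["jvm"]["mem"]
        used = mem["heap_used_in_bytes"]
        limit = mem["heap_max_in_bytes"]
        print("%s: heap %.0f%% used (%d / %d MB)" % (
            stats.get("name", node_id),
            100.0 * used / limit,
            used // (1024 * 1024),
            limit // (1024 * 1024),
        ))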


Thanks,

Paul.


On Tuesday, May 13, 2014 11:02:32 AM UTC+1, Mark Walkom wrote:
>
> Sounds like the inevitable "add more nodes" situation.
>
> How much RAM on each node, how big is your data set?
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>  
>
> On 13 May 2014 19:59, Paul wrote:
>
>> We are seeing a slow down in shard initialization speed as the number of 
>> shards/indices grows in our cluster.
>>
>> With 0-100's of indices/shards existing in the cluster a new bulk 
>> creation of indices up to the 100's at a time is fine, we see them pass 
>> through the states and get a green cluster in a reasonable amount of time.
>>
>> As the total cluster size grows to 1000+ indices (3000+ shards) we begin 
>> to notice that the first rounds of initialization take longer to process, 
>> it seems to speed up after the first few batches, but this slow down leads 
>> to "failed to process cluster event (create-index [index_1112], cause 
>> [auto(bulk api)]) within 30s" type messages in the Master logs - the 
>> indices are eventually created.
>>
>>
>> Has anyone else experienced this? (did you find the cause / way to fix?)
>>
>> Is this somewhat expected behaviour? - are we approaching something 
>> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>>  
>>
>
>



Re: Shard Initialization slow down

2014-05-13 Thread Mark Walkom
Sounds like the inevitable "add more nodes" situation.

How much RAM on each node, how big is your data set?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 13 May 2014 19:59, Paul  wrote:

> We are seeing a slow down in shard initialization speed as the number of
> shards/indices grows in our cluster.
>
> With 0-100's of indices/shards existing in the cluster a new bulk creation
> of indices up to the 100's at a time is fine, we see them pass 
> states and get a green cluster in a reasonable amount of time.
>
> As the total cluster size grows to 1000+ indices (3000+ shards) we begin
> to notice that the first rounds of initialization take longer to process,
> it seems to speed up after the first few batches, but this slow down leads
> to "failed to process cluster event (create-index [index_1112], cause
> [auto(bulk api)]) within 30s" type messages in the Master logs - the
> indices are eventually created.
>
>
> Has anyone else experienced this? (did you find the cause / way to fix?)
>
> Is this somewhat expected behaviour? - are we approaching something
> incorrectly? (there are 3 data nodes involved, with 3 shards per index)
>
>
