Shard Initialization slow down

2014-05-13 Thread Paul
We are seeing a slowdown in shard initialization speed as the number of shards/indices in our cluster grows. With anywhere from zero to a few hundred indices/shards already in the cluster, bulk creation of new indices, up to hundreds at a time, is fine: we see them pass through the states and get a green cluster in a

Re: Shard Initialization slow down

2014-05-13 Thread Mark Walkom
Sounds like the inevitable "add more nodes" situation. How much RAM is on each node, and how big is your data set? Regards, Mark Walkom, Infrastructure Engineer, Campaign Monitor (email: ma...@campaignmonitor.com, web: www.campaignmonitor.com)

Re: Shard Initialization slow down

2014-05-13 Thread Paul
In testing and replicating the issue, we have seen the slowdown occur even with empty indices. The running cluster is at present ~100 GB across 2,200 indices, with a total of 13,500 shards and ~430,000,000 documents. We have 7 GB RAM and a 5 GB heap on the data nodes - haven't looked overly
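For scale, the figures quoted above can be turned into rough per-shard averages. This is only a back-of-the-envelope sketch: the message does not say whether the shard count or the ~100 GB includes replicas, and the 1 GB = 1024 MB conversion is an assumption.

```python
# Figures taken from the message above; everything derived is an average only.
TOTAL_DATA_GB = 100
INDICES = 2_200
SHARDS = 13_500
DOCUMENTS = 430_000_000

# Average number of shards behind each index.
shards_per_index = SHARDS / INDICES

# Average number of documents held by one shard.
docs_per_shard = DOCUMENTS / SHARDS

# Average on-disk size of one shard, assuming 1 GB = 1024 MB.
mb_per_shard = TOTAL_DATA_GB * 1024 / SHARDS

print(f"shards per index: {shards_per_index:.1f}")
print(f"documents per shard: {docs_per_shard:,.0f}")
print(f"MB per shard: {mb_per_shard:.1f}")
```

On these averages each shard holds only a few megabytes, which fits the observation that the slowdown shows up even with empty indices: the cost tracks the number of shards rather than the volume of data.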

Re: Shard Initialization slow down

2014-05-13 Thread Mark Walkom
Empty or not, there is still metadata that ES needs to maintain in the cluster state. The more indexes you have open, the bigger the cluster state becomes and the more resources are required to track it.

Re: Shard Initialization slow down

2014-05-13 Thread Paul
OK, do you know if there are clear indicators when limits are being reached? We don't see errors in the logs (apart from the 30s timeout), but if there are system or ES-provided metrics that we can track to know when we need to scale, that would be really useful. Thanks, Paul.

Re: Shard Initialization slow down

2014-05-13 Thread Mark Walkom
You will want to obtain Marvel (http://www.elasticsearch.org/guide/en/marvel/current/), wait until you have some history, and then start digging.

Re: Shard Initialization slow down

2014-05-13 Thread Mark Harwood
This API should give an indication of any backlog in processing the cluster state: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-pending.html
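A minimal sketch of how that pending-tasks endpoint might be polled and summarized. The base URL, the helper names, and the sample payload are all assumptions for illustration; the field names (`tasks`, `source`, `time_in_queue_millis`) follow the reference documentation for `GET /_cluster/pending_tasks` and should be checked against the version you run.

```python
import json
from urllib.request import urlopen


def fetch_pending_tasks(base_url="http://localhost:9200"):
    """Fetch the pending cluster tasks. base_url is an assumed local node."""
    with urlopen(f"{base_url}/_cluster/pending_tasks") as resp:
        return json.load(resp)


def summarize_pending(payload):
    """Reduce a pending-tasks response to a queue length and the longest wait."""
    tasks = payload.get("tasks", [])
    if not tasks:
        return {"count": 0, "max_wait_ms": 0, "oldest_source": None}
    # The task that has waited longest is the clearest sign of a backlog.
    oldest = max(tasks, key=lambda t: t.get("time_in_queue_millis", 0))
    return {
        "count": len(tasks),
        "max_wait_ms": oldest.get("time_in_queue_millis", 0),
        "oldest_source": oldest.get("source"),
    }


# Hypothetical payload, shaped like the documented response, not real output.
SAMPLE_PAYLOAD = {
    "tasks": [
        {"insert_order": 101, "priority": "URGENT",
         "source": "create-index [example-1]", "time_in_queue_millis": 86},
        {"insert_order": 102, "priority": "HIGH",
         "source": "shard-started", "time_in_queue_millis": 12},
    ]
}

summary = summarize_pending(SAMPLE_PAYLOAD)
print(summary)
```

Against a live cluster, `summarize_pending(fetch_pending_tasks())` could be logged at intervals while indices are being created; a queue length or maximum wait that keeps climbing would suggest the master is falling behind on cluster state updates.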

Re: Shard Initialization slow down

2014-05-13 Thread Paul
Thanks Mark, we'll have a look at the available metrics.

Re: Shard Initialization slow down

2014-05-13 Thread Paul
This looks very interesting, thanks.

Re: Shard Initialization slow down

2014-05-13 Thread joergpra...@gmail.com
You should create indexes before bulk indexing. First, bulk indexing works much better if all indices and their mappings are already present: the operations run faster and without conflicts, and cluster state updates are less frequent, which reduces noise and hiccups. Second, setting

Re: Shard Initialization slow down

2014-05-13 Thread Paul
Thanks Jörg, we've heard of others pre-creating indices. We had seen it as a workaround rather than a regular practice, but what you say makes it seem like something we should work with.
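Pre-creating indices ahead of a bulk load could be sketched roughly as below, using the create-index call (`PUT /<index>` with a settings body). The index names, shard and replica counts, base URL, and helper names are all hypothetical; mappings would go in the same request body under a `mappings` key.

```python
import json
from urllib.request import Request, urlopen


def build_create_requests(index_names, shards, replicas,
                          base_url="http://localhost:9200"):
    """Build one (url, body) pair per index to create. Nothing is sent here."""
    body = json.dumps({
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        }
    }).encode("utf-8")
    return [(f"{base_url}/{name}", body) for name in index_names]


def create_indices(requests):
    """Send each create-index request in turn and return the HTTP status codes."""
    statuses = []
    for url, body in requests:
        req = Request(url, data=body, method="PUT",
                      headers={"Content-Type": "application/json"})
        with urlopen(req) as resp:
            statuses.append(resp.status)
    return statuses


# Hypothetical index names; build the requests without contacting a cluster.
pending = build_create_requests(["example-2014-05-13", "example-2014-05-14"],
                                shards=1, replicas=0)
for url, _ in pending:
    print("would PUT", url)
```

Calling `create_indices(pending)` before the bulk load starts means the bulk requests no longer trigger index creation themselves, so the cluster state changes happen up front rather than in the middle of indexing.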