Re: Stress Free Guide To Expanding a Cluster
Try setting "indices.recovery.max_bytes_per_sec" much higher for faster recovery. The default is 20mb/s, and there's a bug in versions prior to 1.2 that rate-limits to even lower than that. You didn't specify how big your indices are, but with that parameter you can fairly accurately predict how long it'll take for the cluster to go green.

mike
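P.S. That setting can be changed on a live cluster through the cluster settings API. A minimal sketch, assuming a 1.x cluster reachable on localhost:9200 (the 100mb value is only an example; pick whatever your disks and network can actually sustain):

    # Raise the recovery throttle cluster-wide; no restart needed.
    # "transient" settings reset on a full cluster restart -- use
    # "persistent" instead if you want them to stick.
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "transient": {
        "indices.recovery.max_bytes_per_sec": "100mb"
      }
    }'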
Re: Stress Free Guide To Expanding a Cluster
On Wed, Jun 25, 2014 at 8:05 AM, James Carr wrote:

> I launched two new EC2 instances to join the cluster and watched. Some
> shards began relocating, no big deal. Six hours later I checked in and
> some shards were still relocating, one shard was recovering. Weird but
> whatever... the cluster health is still green and searches are working
> fine.

I add new nodes every once in a while and it can take a few hours for everything to balance out, but six hours is a bit long. It's possible, though. Do you have graphs of the count of relocating shards? Something like this can really help you figure out whether everything balanced out at some point and then became unbalanced again. Example:
http://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Elasticsearch%20cluster%20eqiad&h=elastic1001.eqiad.wmnet&r=hour&z=default&jr=&js=&st=1403698335&v=0&m=es_relocating_shards&vl=shards&ti=es_relocating_shards&z=large

> Then I got an alert at 2:30am that the cluster state is now yellow and
> found that we have 3 shards marked as recovering and 2 shards
> unassigned. The cluster still technically works, but 24 hours after the
> new nodes were added I feel like my only choice to get a green cluster
> again will be to simply launch 5 fresh nodes and replay all the data
> from backups into it. Ugh.

This sounds like one of the nodes bounced. It can take a long time to recover from that; it's something that is being worked on. Check the logs and see if you see anything about it.

One thing to make sure of is that you set the minimum number of master nodes (discovery.zen.minimum_master_nodes) correctly on all nodes. If you have five master-eligible nodes, set it to 3. If the two new nodes aren't master eligible (so you have three master-eligible nodes), set it to 2.

> SERIOUSLY! What can I do to prevent this? I feel like I am missing
> something, because I always heard the strength of elasticsearch is its
> ease of scaling out, but it feels like every time I try it falls to the
> floor. :-(

It's always been pretty painless for me. I did have trouble when I added nodes that were broken: one time I added nodes without SSDs to a cluster with SSDs. Another time I didn't set the heap size on the new nodes, and they worked until some shards moved to them. Then they fell over.

Nik
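P.S. For concreteness, here's roughly what that looks like -- a sketch assuming a 1.x cluster with zen discovery, with localhost:9200 standing in for a real host:

    # In elasticsearch.yml on every node, set it to a quorum of
    # master-eligible nodes, i.e. (master_eligible / 2) + 1:
    #
    #   discovery.zen.minimum_master_nodes: 3
    #
    # It's also a dynamic setting, so it can be changed on the live
    # cluster without a restart:
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "persistent": {
        "discovery.zen.minimum_master_nodes": 3
      }
    }'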
Stress Free Guide To Expanding a Cluster
Earlier this week we discovered that our three-node elasticsearch cluster needed to be expanded, as it was getting dangerously close to maximum capacity. I was nervous about this and read up as best I could on best practices for doing it. The only information I seemed to be able to find was to ensure that the new nodes cannot be elected as masters when they join, to avoid a split-brain scenario. Fair enough.

I launched two new EC2 instances to join the cluster and watched. Some shards began relocating, no big deal. Six hours later I checked in and some shards were still relocating, one shard was recovering. Weird but whatever... the cluster health is still green and searches are working fine.

Then I got an alert at 2:30am that the cluster state is now yellow and found that we have 3 shards marked as recovering and 2 shards unassigned. The cluster still technically works, but 24 hours after the new nodes were added I feel like my only choice to get a green cluster again will be to simply launch 5 fresh nodes and replay all the data from backups into it. Ugh.

SERIOUSLY! What can I do to prevent this? I feel like I am missing something, because I always heard the strength of elasticsearch is its ease of scaling out, but it feels like every time I try it falls to the floor. :-(

Thanks!
James
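P.S. In case it helps, these are the kinds of checks I've been using to watch the shard state (localhost:9200 stands in for our real hosts, and the _cat API assumes a 1.x cluster):

    # Overall status plus counts of relocating / initializing /
    # unassigned shards:
    curl 'http://localhost:9200/_cluster/health?pretty'

    # Per-shard view: which shards are RELOCATING, INITIALIZING, or
    # UNASSIGNED, and on which nodes:
    curl 'http://localhost:9200/_cat/shards?v' | grep -v STARTED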