Inline below ...

On Sun, Oct 12, 2014 at 5:28 AM, Zoran Jeremic <zoran.jere...@gmail.com>
wrote:

> Hi Norberto,
>
> Thank you for your advice. This is really helpful, since I have never
> used Elasticsearch in a cluster before and have never gone live with a
> real number of users. My previous experience was with ES on a single node
> and a very small number of users, so I'm still concerned about how this
> will work. The main problem is that I don't know how many users to expect,
> so I should be ready to expand the cluster if necessary.
>

Sure - that's one of the nice things about ES and AWS - you can keep
tuning as you go...


>
> So far, I have created a cluster of 3 m3.large instances with 3 indexes (5
> shards and 2 replicas).
> I couldn't manage to connect them with ec2 autodiscovery. The only option
> that worked for me is having one node that the other nodes refer to as a
> unicast host. I think it might work if I have one node that will always
> be on.
>

Build for failure - don't design around any single always-on node.
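To make that concrete, here is a sketch of the ec2-discovery side of the config, using the values that appear later in this thread (note the space after every ":", which turns out to matter further down):

```yaml
# elasticsearch.yml - ec2 discovery sketch (security group / region from this thread)
cluster.name: elasticsearch
plugin.mandatory: cloud-aws
discovery.type: ec2
discovery.ec2.groups: ES2
discovery.ec2.availability_zones: us-east-1
discovery.zen.ping.multicast.enabled: false
# no unicast host list - nodes find each other through the EC2 API,
# so there is no single always-on node to depend on
```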


>
> You were right about having keys in the config. I didn't need them. Can I
> also remove them from my Java application? I guess they could be removed
> if the launch configuration contains an IAM instance profile.
>

I don't know why your app needs AWS credentials, so I cannot really answer
that - but, in general, if the AWS library you use supports IAM profiles
then you should be able to remove hardcoded creds. YMMV.


> I also decreased zen discovery timeout to 3s.
>
>  - your master config shows master false... You want the master with
> master = true and data = false... Obviously you want more than one master
> (if you don't have too much load, start with all nodes available as data
> and master, then separate functionality as needed). Don't forget to set the
> minimum expected # of master nodes to n_masters/2 + 1 to prevent
> split-brain scenarios.
> I've set all 3 nodes as master and data, but I'm not sure that I understand
> the advantage of having nodes that are not master nodes. I know these nodes
> will not be elected as master, but what is the idea behind that, and what
> would I get if I set the master not to have data on it? Would it increase
> performance?
>

TL;DR - scalability and performance: there are certain operations which
need to be performed by the master node in a timely manner. If your node is
already too busy handling searches, 'master operations' will suffer (and
your whole cluster will slow down).

It is much cheaper to run separate, smaller master (and load balancer)
nodes alongside your data nodes than to scale your data nodes up and out to
handle all the operations.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html
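In config terms, the split is just two settings per node. A sketch for the 1.x settings this thread uses (with 3 master-eligible nodes the quorum comes out to 3/2 + 1 = 2):

```yaml
# dedicated master: eligible for election, holds no data
node.master: true
node.data: false

# a dedicated data node is the inverse:
#   node.master: false
#   node.data: true

# quorum of master-eligible nodes, to prevent split brain
discovery.zen.minimum_master_nodes: 2
```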


>
>
>  It should work pretty well with ec2 auto discovery - unicast is a good
> starting point but unless you are statically assigning them via cloud
> formation (or manually?), it may not be worth the trouble (and it stops you
> from dynamically scaling your cluster)
> How will ES nodes behave under Amazon auto-scaling, and could that be used
> the way I'm using auto scaling now, to meet high load? If I have already
> set 5 shards and 2 replicas on the previous 3 nodes, will these shards and
> replicas be moved to the new nodes, and how long might that take? If that
> is what happens, I guess it's not a good idea to auto-scale a new ES node
> during a period of high-intensity ES use and then turn it off later.
>

yeah, that's definitely not something that will always work with
autoscaling.
- You can use autoscaling to ensure the minimum # of nodes is maintained
(i.e., automatic rebuild of a killed node).
- If you know you have, say, 8 hours with 50% more traffic, you can
increase the number of nodes some time before the peak and increase the #
of replicas... after the peak, reduce the replica # and remove the nodes.
Not autoscaling per se, but building from the get-go without hardcoded
hostnames will help you do things like this.
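The before/after-the-peak replica change is just an index settings update. A dry-run sketch (the host and index name here are made up - substitute your own; the commands are echoed so you can review them before running against a live cluster):

```shell
HOST='http://localhost:9200'   # assumed endpoint
INDEX='myindex'                # assumed index name

# before the peak: add nodes, then raise the replica count so the new nodes take copies
echo "curl -XPUT '$HOST/$INDEX/_settings' -d '{\"index\":{\"number_of_replicas\":2}}'"

# after the peak: drop the replica count back down before removing nodes
echo "curl -XPUT '$HOST/$INDEX/_settings' -d '{\"index\":{\"number_of_replicas\":1}}'"
```

number_of_replicas is a dynamic index setting, so this takes effect without a restart; the cluster moves the extra copies onto the new nodes on its own.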

btw, you also want to play with routing awareness, so your replicas are
distributed across different AZs.

AND beware of the cost of inter-AZ traffic :) (yes, it conflicts with the
'AZ routing awareness')
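The awareness part is two knobs (a sketch; the aws_availability_zone attribute name is what the cloud-aws plugin publishes when auto attributes are on):

```yaml
# publish each node's AZ as a node attribute...
cloud.node.auto_attributes: true
# ...then tell shard allocation to spread copies across those values
cluster.routing.allocation.awareness.attributes: aws_availability_zone
```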


> Sorry if these questions are too naive.
>

:) not at all!

good luck


>
> Thanks,
> Zoran
>
>
>
> On Friday, 10 October 2014 20:43:02 UTC-7, Norberto Meijome wrote:
>>
>> Zoran, good to hear it is working now.
>>
>> It should work pretty well with ec2 auto discovery - unicast is a good
>> starting point but unless you are statically assigning them via cloud
>> formation (or manually?), it may not be worth the trouble (and it stops you
>> from dynamically scaling your cluster)
>>
>> - make sure you have the ec2 plugin installed.
>> - if you use IAM profiles, you don't need a key specified in the config (a
>> key there will override the one from the profile). Also make sure you
>> manually test that your profile is applied properly (the AWS CLI is a good
>> agnostic tool for this).
>> - reduce the zen discovery timeout - it seems that it will always start
>> with zen, then fail over to ec2, and it can take 30 secs or so to time
>> out... (maybe it was my bad config; I still had zen from when I was moving
>> from unicast to ec2 disco... I don't remember finding an option to disable
>> zen disco).
>>
>> - the default logs should show you enough info to debug any of this.
>>
>> - your master config shows master false... You want the master with
>> master = true and data = false... Obviously you want more than one master
>> (if you don't have too much load, start with all nodes available as data
>> and master, then separate functionality as needed). Don't forget to set
>> the minimum expected # of master nodes to n_masters/2 + 1 to prevent
>> split-brain scenarios.
>> On 11/10/2014 1:38 pm, "Zoran Jeremic" <zoran....@gmail.com> wrote:
>>
>>> Hi David,
>>>
>>> Thank you for your advice. It really helped me solve the issue and make
>>> it work.
>>> At the end I had to leave these two:
>>>  discovery.zen.ping.multicast.enabled: false
>>>  discovery.zen.ping.unicast.hosts: ["10.185.210.54[9300-9400]","
>>> 10.101.176.236[9300-9400]"]
>>>
>>> and to remove:
>>> network.publish_host: 255.255.255.255
>>>
>>> And it finally works. What turned out to be the biggest problem is what
>>> you mentioned at the beginning: missing spaces after ":", missing spaces
>>> at the beginning of lines, and some extra spaces after "#". I thought
>>> that ":" was a delimiter and didn't have to be followed by a space. The
>>> strange thing is that when there are such problems in elasticsearch.yml,
>>> there are no logs indicating a problem - it either logs nothing and can't
>>> start elasticsearch, or just ignores the wrong properties.
>>>
>>> Thanks,
>>> Zoran
>>>
>>> On Friday, 10 October 2014 14:11:00 UTC-7, David Pilato wrote:
>>>>
>>>> Not sure, but it may be related to public/private IPs.
>>>> Maybe debug logs will give you more insights?
>>>>
>>>> --
>>>> David ;-)
>>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>>
>>>> Le 10 oct. 2014 à 22:40, Zoran Jeremic <zoran....@gmail.com> a écrit :
>>>>
>>>> Hi David,
>>>>
>>>> Thank you for your quick response. That was a great guess about the
>>>> space after ":". It was really causing a problem, so I'm now a step
>>>> forward. It seems that it's trying to establish the connection, but
>>>> there are plenty of exceptions stating that the network is unreachable.
>>>> Why this exception if I can telnet between the nodes on 9300?
>>>>
>>>> [2014-10-10 20:22:12,184][WARN ][transport.netty          ] [Joey Bailey] exception caught on transport layer [[id: 0x5541474b]], closing connection
>>>> java.net.SocketException: Network is unreachable
>>>>     at sun.nio.ch.Net.connect0(Native Method)
>>>>     at sun.nio.ch.Net.connect(Net.java:465)
>>>>     at sun.nio.ch.Net.connect(Net.java:457)
>>>>     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)
>>>>     at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink.connect(NioClientSocketPipelineSink.java:108)
>>>>     at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:70)
>>>>     at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:574)
>>>>     at org.elasticsearch.common.netty.channel.Channels.connect(Channels.java:634)
>>>>     at org.elasticsearch.common.netty.channel.AbstractChannel.connect(AbstractChannel.java:207)
>>>>     at org.elasticsearch.common.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:229)
>>>>     at org.elasticsearch.common.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:182)
>>>>     at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:705)
>>>>     at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:647)
>>>>     at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:615)
>>>>     at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:129)
>>>>     at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:404)
>>>>     at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:134)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:744)
>>>> [2014-10-10 20:22:12,185][WARN ][transport.netty          ] [Joey Bailey] exception caught on transport layer [[id: 0x9e80cd79]], closing connection
>>>>
>>>> On Friday, 10 October 2014 12:21:18 UTC-7, David Pilato wrote:
>>>>>
>>>>> I might be wrong, but maybe you should add a space after each ":" char
>>>>> in the yml file.
>>>>>
>>>>> It sounds like multicast is not disabled and that ec2 discovery is not
>>>>> used.
>>>>>
>>>>> Some lines should not be needed:
>>>>>
>>>>> Multicast disable
>>>>> Unicast list of nodes
>>>>>
>>>>> HTH
>>>>>
>>>>> --
>>>>> David ;-)
>>>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>>>
>>>>> Le 10 oct. 2014 à 19:57, Zoran Jeremic <zoran....@gmail.com> a écrit :
>>>>>
>>>>>
>>>>> Hi guys,
>>>>>
>>>>> I need urgent help setting up an Elasticsearch cluster on Amazon EC2
>>>>> instances, as I have to launch an application within a week. I've been
>>>>> trying this for the last three days without success. I tried to follow
>>>>> many instructions, created the instances all over again, and still
>>>>> nothing. I can telnet between instances on 9300. I added a security
>>>>> group ES2 with a port range of 0-65535, and also added individual
>>>>> instances by private IP address with range 9200-9400. The nodes can't
>>>>> discover each other, and it seems that both nodes come up on their own,
>>>>> despite the fact that the cluster node info indicates the right
>>>>> elasticsearch.yml is used. For example, the cluster name is the one I
>>>>> set in elasticsearch.yml, but the node name is a generic one.
>>>>> I hope somebody will have some idea if I missed something here.
>>>>>
>>>>> Here are other details:
>>>>>
>>>>> My IAM policy is:
>>>>> ###########################
>>>>>
>>>>> {
>>>>>   "Version": "2012-10-17",
>>>>>   "Statement": [
>>>>>     {
>>>>>       "Sid": "Stmt1412960658000",
>>>>>       "Effect": "Allow",
>>>>>       "Action": [
>>>>>         "ec2:DescribeInstances"
>>>>>       ],
>>>>>       "Resource": [
>>>>>         "*"
>>>>>       ]
>>>>>     }
>>>>>   ]
>>>>> }
>>>>>
>>>>>
>>>>> Cluster configurations are as follows:
>>>>> ###################################################
>>>>> ######################Master node configuration
>>>>>
>>>>> cluster.name: elasticsearch
>>>>> node.name: "Slave_node"
>>>>> node.master: false
>>>>>
>>>>> discovery.ec2.availability_zones: us-east-1
>>>>> discovery.ec2.ping_timeout: 30s
>>>>> cloud.aws.protocol:http
>>>>> plugin.mandatory:cloud-aws
>>>>> discovery.zen.ping.multicast.enabled:false
>>>>> discovery.ec2.groups:ES2
>>>>> #discovery.ec2.tag.type:ElasticsearchCluster
>>>>> network.publish_host:255.255.255.255
>>>>> discovery.type:ec2
>>>>> cloud.aws.access_key:<myaccesskey>
>>>>> cloud.aws.secret_key:<mysecretkey>
>>>>> discovery.zen.ping.unicast.hosts:["10.185.210.54[9300-9400]",
>>>>> "10.101.176.236[9300-9400]"]
>>>>> cloud.node.auto_attributes:true
>>>>>
>>>>>
>>>>> ###############################Slave node configuration
>>>>>
>>>>> cluster.name: elasticsearch
>>>>> node.name: "Slave_node"
>>>>> node.master: false
>>>>>
>>>>> discovery.ec2.availability_zones: us-east-1
>>>>> discovery.ec2.ping_timeout: 30s
>>>>> cloud.aws.protocol:http
>>>>> plugin.mandatory:cloud-aws
>>>>> discovery.zen.ping.multicast.enabled:false
>>>>> discovery.ec2.groups:ES2
>>>>> #discovery.ec2.tag.type:ElasticsearchCluster
>>>>> network.publish_host:255.255.255.255
>>>>> discovery.type:ec2
>>>>> cloud.aws.access_key:<myaccesskey>
>>>>> cloud.aws.secret_key:<mysecretkey>
>>>>> discovery.zen.ping.unicast.hosts:["10.185.210.54[9300-9400]",
>>>>> "10.101.176.236[9300-9400]"]
>>>>> cloud.node.auto_attributes:true
>>>>>
>>>>>
>>>>>
>>>>> #############################################################
>>>>> ##############TRACE LOG FROM SLAVE NODE
>>>>> [2014-10-10 17:21:30,554][INFO ][node                     ] [Gabriel the Air-Walker] started
>>>>> [2014-10-10 17:21:30,554][DEBUG][cluster.service          ] [Gabriel the Air-Walker] processing [updating local node id]: done applying updated cluster_state (version: 3)
>>>>> [2014-10-10 17:21:40,504][DEBUG][cluster.service          ] [Gabriel the Air-Walker] processing [routing-table-updater]: execute
>>>>> [2014-10-10 17:21:40,505][DEBUG][cluster.service          ] [Gabriel the Air-Walker] processing [routing-table-updater]: no change in cluster_state
>>>>> [2014-10-10 17:21:44,122][DEBUG][plugins                  ] [Gabriel the Air-Walker] [/usr/share/elasticsearch/plugins/cloud-aws/_site] directory does not exist.
>>>>> [2014-10-10 17:21:44,123][DEBUG][plugins                  ] [Gabriel the Air-Walker] [/usr/share/elasticsearch/plugins/mapper-attachments/_site] directory does not exist.
>>>>> [2014-10-10 17:22:04,288][INFO ][node                     ] [Gabriel the Air-Walker] stopping ...
>>>>> ...
>>>>



-- 
Norberto 'Beto' Meijome

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CACj2-4%2B2vqa6R7%3DB3NPbEFv007QEYG5xJFaVYe9XChB4DEP%3DJQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
