>>*I launch another basic EC2 CentOS machine and do the exact same commands except that I don't install zookeeper and I only run "bin/storm nimbus &" at the end*<<
Why would you run nimbus on two machines? Yes, by default Hortonworks provides
one supervisor node per machine, so I am trying to add a new EC2 machine with
the supervisor role. It has worked to some extent but is still not fully
functional. I am still testing its linear scalability.

On Tue, Sep 9, 2014 at 5:15 PM, Stephen Hartzell <hartzell.step...@gmail.com> wrote:

> Vikas,
>
> I've tried to use the HortonWorks distribution, but that only provides
> one supervisor and nimbus on one virtual machine. I'm excited to hear that
> you have storm working on AWS EC2 machines, because that is exactly what I
> am trying to do! Right now we're still in the development stage, so all we
> are trying to do is have one worker machine connect to one nimbus machine.
> So far we haven't gotten this to work.
>
> Although it might be lengthy, let me go ahead and post the commands I'm
> using to set up the nimbus machine.
>
> *I launch a basic EC2 CentOS machine with ports 6627 and 8080 open to TCP
> connections (its public IP is 54.68.149.181)*
>
> sudo yum update
> sudo yum install libuuid-devel gcc gcc-c++ kernel-devel
>
> *# Install zeromq*
> wget http://download.zeromq.org/zeromq-2.1.7.tar.gz
> tar -zxvf zeromq-2.1.7.tar.gz
> cd zeromq-2.1.7
> ./configure
> make
> sudo make install
> cd ~
>
> *# Install jzmq*
> sudo yum install git java-devel libtool
> export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.x.x86_64 *# JDK required for jzmq in the configuration stage*
> cd $JAVA_HOME
> sudo ln -s include include *# JDK include directory has headers required for jzmq*
> cd ~
> git clone https://github.com/nathanmarz/jzmq.git
> cd jzmq/src/
> CLASSPATH=.:./.:$CLASSPATH
> touch classdist_noinst.stamp
> javac -d . org/zeromq/ZMQ.java org/zeromq/ZMQException.java org/zeromq/ZMQQueue.java org/zeromq/ZMQForwarder.java org/zeromq/ZMQStreamer.java
> cd ~/jzmq
> ./autogen.sh
> ./configure
> make
> sudo make install
> cd ~
>
> *# Download zookeeper*
> wget http://mirror.cc.columbia.edu/pub/software/apache/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
> tar -zxvf zookeeper-3.4.6.tar.gz
> cd zookeeper-3.4.6
> vi conf/zoo.cfg
>
> tickTime=2000
> dataDir=/tmp/zookeeper
> clientPort=2181
>
> mkdir /tmp/zookeeper
> bin/zkServer.sh start
> bin/zkCli.sh -server 54.68.149.181:2181
>
> *# Download storm and modify configuration file (changes over default.yaml shown in bold)*
> https://www.dropbox.com/s/fl4kr7w0oc8ihdw/storm-0.8.2.zip
> unzip storm-0.8.2.zip
> cd storm-0.8.2
> vi conf/storm.yaml
>
> # Licensed to the Apache Software Foundation (ASF) under one
> # or more contributor license agreements.  See the NOTICE file
> # distributed with this work for additional information
> # regarding copyright ownership.  The ASF licenses this file
> # to you under the Apache License, Version 2.0 (the
> # "License"); you may not use this file except in compliance
> # with the License.  You may obtain a copy of the License at
> #
> # http://www.apache.org/licenses/LICENSE-2.0
> #
> # Unless required by applicable law or agreed to in writing, software
> # distributed under the License is distributed on an "AS IS" BASIS,
> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> # See the License for the specific language governing permissions and
> # limitations under the License.
>
> ########### These all have default values as shown
> ########### Additional configuration goes into storm.yaml
>
> java.library.path: "/usr/local/lib:/opt/local/lib:/usr/lib:*/home/ec2-user*"
>
> ### storm.* configs are general configurations
> # the local dir is where jars are kept
> storm.local.dir: "storm-local"
> storm.zookeeper.servers:
>     - "*54.68.149.181*"
> storm.zookeeper.port: 2181
> storm.zookeeper.root: "/storm"
> storm.zookeeper.session.timeout: 20000
> storm.zookeeper.connection.timeout: 15000
> storm.zookeeper.retry.times: 5
> storm.zookeeper.retry.interval: 1000
> storm.zookeeper.retry.intervalceiling.millis: 30000
> storm.cluster.mode: "distributed" # can be distributed or local
> storm.local.mode.zmq: false
> storm.thrift.transport: "backtype.storm.security.auth.SimpleTransportPlugin"
> storm.messaging.transport: "backtype.storm.messaging.zmq"
>
> ### nimbus.* configs are for the master
> nimbus.host: "*54.68.149.181*"
> nimbus.thrift.port: 6627
> nimbus.childopts: "-Xmx1024m"
> nimbus.task.timeout.secs: 30
> nimbus.supervisor.timeout.secs: 60
> nimbus.monitor.freq.secs: 10
> nimbus.cleanup.inbox.freq.secs: 600
> nimbus.inbox.jar.expiration.secs: 3600
> nimbus.task.launch.secs: 120
> nimbus.reassign: true
> nimbus.file.copy.expiration.secs: 600
> nimbus.topology.validator: "backtype.storm.nimbus.DefaultTopologyValidator"
>
> ### ui.* configs are for the master
> ui.port: 8080
> ui.childopts: "-Xmx768m"
>
> logviewer.port: 8000
> logviewer.childopts: "-Xmx128m"
> logviewer.appender.name: "A1"
>
> drpc.port: 3772
> drpc.worker.threads: 64
> drpc.queue.size: 128
> drpc.invocations.port: 3773
> drpc.request.timeout.secs: 600
> drpc.childopts: "-Xmx768m"
>
> transactional.zookeeper.root: "/transactional"
> transactional.zookeeper.servers: null
> transactional.zookeeper.port: null
>
> ### supervisor.* configs are for node supervisors
> # Define the amount of workers that can be run on this machine.
> # Each worker is assigned a port to use for communication
> supervisor.slots.ports:
>     - 6700
>     - 6701
>     - 6702
>     - 6703
> supervisor.childopts: "-Xmx256m"
> #how long supervisor will wait to ensure that a worker process is started
> supervisor.worker.start.timeout.secs: 120
> #how long between heartbeats until supervisor considers that worker dead and tries to restart it
> supervisor.worker.timeout.secs: 30
> #how frequently the supervisor checks on the status of the processes it's monitoring and restarts if necessary
> supervisor.monitor.frequency.secs: 3
> #how frequently the supervisor heartbeats to the cluster state (for nimbus)
> supervisor.heartbeat.frequency.secs: 5
> supervisor.enable: true
>
> ### worker.* configs are for task workers
> worker.childopts: "-Xmx768m"
> worker.heartbeat.frequency.secs: 1
>
> task.heartbeat.frequency.secs: 3
> task.refresh.poll.secs: 10
>
> zmq.threads: 1
> zmq.linger.millis: 5000
> zmq.hwm: 0
>
> storm.messaging.netty.server_worker_threads: 1
> storm.messaging.netty.client_worker_threads: 1
> storm.messaging.netty.buffer_size: 5242880 #5MB buffer
> storm.messaging.netty.max_retries: 30
> storm.messaging.netty.max_wait_ms: 1000
> storm.messaging.netty.min_wait_ms: 100
>
> ### topology.* configs are for specific executing storms
> topology.enable.message.timeouts: true
> topology.debug: false
> topology.optimize: true
> topology.workers: 1
> topology.acker.executors: null
> topology.tasks: null
> # maximum amount of time a message has to complete before it's considered failed
> topology.message.timeout.secs: 30
> topology.skip.missing.kryo.registrations: false
> topology.max.task.parallelism: null
> topology.max.spout.pending: null
> topology.state.synchronization.timeout.secs: 60
> topology.stats.sample.rate: 0.05
> topology.builtin.metrics.bucket.size.secs: 60
> topology.fall.back.on.java.serialization: true
> topology.worker.childopts: null
> topology.executor.receive.buffer.size: 1024 #batched
> topology.executor.send.buffer.size: 1024 #individual messages
> topology.receiver.buffer.size: 8 # setting it too high causes a lot of problems (heartbeat thread gets starved, throughput plummets)
> topology.transfer.buffer.size: 1024 # batched
> topology.tick.tuple.freq.secs: null
> topology.worker.shared.thread.pool.size: 4
> topology.disruptor.wait.strategy: "com.lmax.disruptor.BlockingWaitStrategy"
> topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
> topology.sleep.spout.wait.strategy.time.ms: 1
> topology.error.throttle.interval.secs: 10
> topology.max.error.report.per.interval: 5
> topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
> topology.tuple.serializer: "backtype.storm.serialization.types.ListDelegateSerializer"
> topology.trident.batch.emit.interval.millis: 500
>
> dev.zookeeper.path: "/tmp/dev-storm-zookeeper"
>
> bin/storm nimbus &
> bin/storm supervisor &
> bin/storm ui &
>
> *I launch another basic EC2 CentOS machine and do the exact same commands
> except that I don't install zookeeper and I only run "bin/storm nimbus &"
> at the end*
>
> Any thoughts would be greatly appreciated. Thanks so much for all of your
> help. I'm sure someone else has done this before!
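For a two-node setup like the one quoted above, the second machine would
normally run only the supervisor, with its storm.yaml pointing back at the
nimbus/ZooKeeper box instead of starting a second nimbus. A minimal sketch of
the worker-side overrides (assuming 54.68.149.181 stays the nimbus + ZooKeeper
host; the storm.local.dir path is only an example, any writable directory
outside /tmp will do):

    # storm.yaml on the worker machine; everything else left at defaults
    storm.zookeeper.servers:
        - "54.68.149.181"
    nimbus.host: "54.68.149.181"
    # keep the local dir out of /tmp so it survives reboots
    storm.local.dir: "/home/ec2-user/storm-local"

Then on the worker machine start only:

    bin/storm supervisor &

If the worker can reach ports 2181 and 6627 on 54.68.149.181, it should show
up as a second supervisor in the UI at http://54.68.149.181:8080.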
>
> On Mon, Sep 8, 2014 at 11:14 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
>
>> Although implementing the Storm cluster manually would be a really nice
>> learning exercise, I would suggest using the HortonWorks distribution,
>> which comes with Storm as an OOTB solution and lets you configure
>> everything from the Ambari UI. We are using Storm on Amazon EC2 machines,
>> though it is right now in the beta stage. We are going to move to
>> production in the coming 2-3 months.
>>
>> On Tue, Sep 9, 2014 at 5:53 AM, Stephen Hartzell <hartzell.step...@gmail.com> wrote:
>>
>>> All,
>>>
>>> I implemented the suggestions given by Parth and Harsha. I am now using
>>> the default.yaml, but I changed storm.zookeeper.servers to the nimbus
>>> machine's ip address, 54.68.149.181. I also changed nimbus.host to
>>> 54.68.149.181 and opened up port 6627. Now the UI web page gives the
>>> following error: org.apache.thrift7.transport.TTransportException:
>>> java.net.ConnectException: Connection refused
>>>
>>> You should be able to see the error it gives by going to the web page
>>> yourself at http://54.68.149.181:8080. I am only using this account to
>>> test and see if I can even get storm to work, so these machines are only
>>> for testing. Perhaps someone could tell me what the storm.yaml file
>>> should look like for this setup?
>>>
>>> -Thanks, Stephen
>>>
>>> On Mon, Sep 8, 2014 at 7:41 PM, Stephen Hartzell <hartzell.step...@gmail.com> wrote:
>>>
>>>> I'm getting kind of confused by the storm.yaml file. Should I be using
>>>> the default.yaml and just modify the zookeeper and nimbus ip, or should
>>>> I use a brand-new storm.yaml?
>>>>
>>>> My nimbus machine has the ip address 54.68.149.181. My zookeeper is on
>>>> the nimbus machine. What should the storm.yaml look like on my worker
>>>> and nimbus machines? Will the storm.yaml be the same on my worker and
>>>> nimbus machines? I am not trying to do anything fancy, I am just trying
>>>> to get a very basic cluster up and running.
>>>>
>>>> -Thanks, Stephen
>>>>
>>>> On Mon, Sep 8, 2014 at 7:00 PM, Stephen Hartzell <hartzell.step...@gmail.com> wrote:
>>>>
>>>>> All, thanks so much for your help. I cannot tell you how much I
>>>>> appreciate it. I'm going to try out your suggestions and keep banging
>>>>> my head against the wall :D. I've spent an enormous amount of time
>>>>> trying to get this to work. I'll let you know what happens after I try
>>>>> to implement your suggestions. It would be really cool if someone had
>>>>> a tutorial that detailed this part. (I'll make it myself if I ever get
>>>>> this to work!) It seems like getting a two-machine cluster set up on
>>>>> AWS would be a VERY common use case. I've read and watched everything
>>>>> I can on the topic and nothing has gotten it working for me!
>>>>>
>>>>> On Mon, Sep 8, 2014 at 6:54 PM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
>>>>>
>>>>>> The worker connects to the thrift port, not the ui port. You need to
>>>>>> open port 6627, or whatever value is set in storm.yaml via the
>>>>>> property "nimbus.thrift.port".
>>>>>>
>>>>>> Based on the configuration you have posted so far, it seems your
>>>>>> nimbus host has nimbus, ui, and supervisor working because you
>>>>>> actually have zookeeper running locally on that host. As Harsha
>>>>>> pointed out, you need to change it to the public ip instead of the
>>>>>> loopback interface.
>>>>>>
>>>>>> Thanks,
>>>>>> Parth
>>>>>>
>>>>>> On Sep 8, 2014, at 3:42 PM, Harsha <st...@harsha.io> wrote:
>>>>>>
>>>>>> storm.zookeeper.servers:
>>>>>>     - "127.0.0.1"
>>>>>> nimbus.host: "127.0.0.1" (*127.0.0.1 causes it to bind to the
>>>>>> loopback interface; instead use either your public ip or 0.0.0.0*)
>>>>>> storm.local.dir: /tmp/storm (*I recommend moving this to a different
>>>>>> folder, probably /home/storm; /tmp/storm will get deleted if your
>>>>>> machine is restarted*)
>>>>>>
>>>>>> Make sure your zookeeper is also listening on 0.0.0.0 or the public
>>>>>> ip, not 127.0.0.1.
>>>>>>
>>>>>> "No, I cannot ping my host which has a public ip address of
>>>>>> 54.68.149.181"
>>>>>> You are not able to reach this ip from the worker node, but you are
>>>>>> able to access the UI using it?
>>>>>> -Harsha
>>>>>>
>>>>>> On Mon, Sep 8, 2014, at 03:34 PM, Stephen Hartzell wrote:
>>>>>>
>>>>>> Harsha,
>>>>>>
>>>>>> The storm.yaml on the host machine looks like this:
>>>>>>
>>>>>> storm.zookeeper.servers:
>>>>>>     - "127.0.0.1"
>>>>>> nimbus.host: "127.0.0.1"
>>>>>> storm.local.dir: /tmp/storm
>>>>>>
>>>>>> The storm.yaml on the worker machine looks like this:
>>>>>>
>>>>>> storm.zookeeper.servers:
>>>>>>     - "54.68.149.181"
>>>>>> nimbus.host: "54.68.149.181"
>>>>>> storm.local.dir: /tmp/storm
>>>>>>
>>>>>> No, I cannot ping my host, which has a public ip address of
>>>>>> 54.68.149.181, although I can connect to the UI web page when it is
>>>>>> hosted. I don't know how I would go about connecting to zookeeper on
>>>>>> the nimbus host.
>>>>>> -Thanks, Stephen
>>>>>>
>>>>>> On Mon, Sep 8, 2014 at 6:28 PM, Harsha <st...@harsha.io> wrote:
>>>>>>
>>>>>> There aren't any errors in the worker machine's supervisor logs. Are
>>>>>> you using the same storm.yaml for both machines, and are you able to
>>>>>> ping your nimbus host or connect to zookeeper on the nimbus host?
>>>>>> -Harsha
>>>>>>
>>>>>> On Mon, Sep 8, 2014, at 03:24 PM, Stephen Hartzell wrote:
>>>>>>
>>>>>> Harsha,
>>>>>>
>>>>>> Thanks so much for getting back with me. I will check the logs, but I
>>>>>> don't seem to get any error messages. I have a nimbus AWS machine
>>>>>> with zookeeper on it and a worker AWS machine.
>>>>>>
>>>>>> On the nimbus machine I start the zookeeper and then I run:
>>>>>>
>>>>>> bin/storm nimbus &
>>>>>> bin/storm supervisor &
>>>>>> bin/storm ui
>>>>>>
>>>>>> On the worker machine I run:
>>>>>>
>>>>>> bin/storm supervisor
>>>>>>
>>>>>> When I go to the UI page, I only see 1 supervisor (the one on the
>>>>>> nimbus machine). So apparently the worker machine isn't "registering"
>>>>>> with the nimbus machine.
>>>>>>
>>>>>> On Mon, Sep 8, 2014 at 6:16 PM, Harsha <st...@harsha.io> wrote:
>>>>>>
>>>>>> Hi Stephen,
>>>>>> What are the issues you are seeing?
>>>>>> "How do worker machines "know" how to connect to nimbus? Is it in the
>>>>>> storm configuration file"
>>>>>>
>>>>>> Yes. Make sure the supervisor (worker) and nimbus nodes are able to
>>>>>> connect to your zookeeper cluster.
>>>>>> Check your logs under storm_inst/logs/ for any errors when you try to
>>>>>> start nimbus or supervisors.
>>>>>> If you are installing it manually, try following these steps if you
>>>>>> have not already done so:
>>>>>> http://www.michael-noll.com/tutorials/running-multi-node-storm-cluster/
>>>>>> -Harsha
>>>>>>
>>>>>> On Mon, Sep 8, 2014, at 03:01 PM, Stephen Hartzell wrote:
>>>>>>
>>>>>> All,
>>>>>>
>>>>>> I would greatly appreciate any help that anyone can offer. I've been
>>>>>> trying to set up a storm cluster on AWS for a few weeks now on CentOS
>>>>>> EC2 machines. So far, I haven't been able to get a cluster built. I
>>>>>> can get a supervisor and nimbus to run on a single machine, but I
>>>>>> can't figure out how to get another worker to connect to nimbus. How
>>>>>> do worker machines "know" how to connect to nimbus? Is it in the
>>>>>> storm configuration file? I've gone through many tutorials and the
>>>>>> official documentation, but this point doesn't seem to be covered
>>>>>> anywhere in sufficient detail for a new guy like me.
>>>>>>
>>>>>> Some of you may be tempted to point me toward storm-deploy, but I
>>>>>> spent four days trying to get that to work until I gave up. I'm
>>>>>> hitting Issue #58 on github. Following the instructions exactly, and
>>>>>> other tutorials, on a brand-new AWS machine fails. So I gave up on
>>>>>> storm-deploy and decided to try to set up a cluster manually. Thanks
>>>>>> in advance to anyone willing to offer me any input you can!

--
Regards,
Vikas Agarwal
91 – 9928301411

InfoObjects, Inc.
Execution Matters
http://www.infoobjects.com
2041 Mission College Boulevard, #280
Santa Clara, CA 95054
+1 (408) 988-2000 Work
+1 (408) 716-2726 Fax
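P.S. A note on the "cannot ping" symptom discussed above: EC2 security groups
block ICMP unless it is explicitly allowed, so ping can fail even when the
host is perfectly reachable on the TCP ports Storm actually needs. A quick
sanity check from the worker node (a sketch, assuming netcat is installed):

    nc -vz 54.68.149.181 2181    # ZooKeeper client port
    nc -vz 54.68.149.181 6627    # nimbus thrift port
    echo ruok | nc 54.68.149.181 2181    # ZooKeeper should answer "imok"

And on the nimbus host, to confirm ZooKeeper is not bound only to 127.0.0.1:

    sudo netstat -tlnp | grep 2181

If either port check fails from the worker, it is a security-group or binding
problem rather than a Storm configuration problem.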