Will try, thank you Harsha
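
For reference, here is a minimal sketch of the cleanup Harsha suggests in the message quoted below: empty the directory that storm.local.dir points to, then restart the daemons. The function name clear_dir_contents and the sub-directory names in the demo are only illustrative assumptions, and the real path has to be taken from your own storm.yaml. Stop nimbus/supervisor on the node before running anything like this.

```python
import shutil
import tempfile
from pathlib import Path


def clear_dir_contents(local_dir):
    """Delete everything inside local_dir but keep the directory itself.

    Returns the sorted names of the removed top-level entries.
    """
    root = Path(local_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    removed = []
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # real sub-directory: remove recursively
        else:
            entry.unlink()         # regular file or symlink
        removed.append(entry.name)
    return sorted(removed)


if __name__ == "__main__":
    # Demo on a throwaway directory. The layout is made up for illustration;
    # point the function at your own storm.local.dir only after the Storm
    # daemons on that node have been stopped.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp)
        (demo / "supervisor").mkdir()
        (demo / "supervisor" / "state-file").write_bytes(b"")  # empty, i.e. unreadable state
        print(clear_dir_contents(demo))
```

After the directory is empty, starting "storm supervisor" again should let the daemon recreate its local state from scratch instead of trying to deserialize the damaged file.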

2014-11-26 16:31 GMT+02:00 Harsha <[email protected]>:

> This could be due to your storm.local.dir getting corrupted. You can
> delete the contents of this dir and restart the storm cluster (nimbus,
> supervisor).
>
> On Wed, Nov 26, 2014, at 01:51 AM, Dimitris Samaras wrote:
>
> Hi all,
>
> @Harsha, by:
>
> "Everything works fine with topologies etc., up to the point that the
> Storm cluster needs to be restarted.
> In that case, for storm.sh (nimbus, super, ui) to run successfully on a
> node, Storm has to be redeployed on that node and reconfigured (storm.yaml)."
>
> I mean that I can deploy a fully functional cluster and run/test the
> topologies properly; everything is OK at runtime.
> If the node gets restarted (it runs on a VM) due to a host PC restart etc.,
> then when I execute "storm supervisor", for example on a supervisor node,
> to restart it, it does not start!
>
> @Samit, the supervisor.log is:
>
> 2014-11-26 11:26:16 b.s.d.supervisor [INFO] Starting supervisor with id ea561988-508d-4593-9873-00f15736a6bf at host Ubuntu14super1
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:host.name=Ubuntu14super1
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:java.version=1.7.0_72
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:java.vendor=Oracle Corporation
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:java.class.path=/usr/local/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/local/storm/lib/logback-classic-1.0.6.jar:/usr/local/storm/lib/chill-java-0.3.5.jar:/usr/local/storm/lib/compojure-1.1.3.jar:/usr/local/sto$
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:java.io.tmpdir=/tmp
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:java.compiler=<NA>
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:os.name=Linux
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:os.arch=amd64
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:os.version=3.13.0-40-generic
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:user.name=dimsam
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:user.home=/home/dimsam
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:user.dir=/usr/local/storm/bin
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:host.name=Ubuntu14super1
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:java.version=1.7.0_72
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:java.vendor=Oracle Corporation
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:java.home=/usr/lib/jvm/java-7-oracle/jre
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:java.class.path=/usr/local/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/local/storm/lib/logback-classic-1.0.6.jar:/usr/local/storm/lib/chill-java-0.3.5.jar:/usr/local/storm/lib/compojure-1.1.3.jar:/usr/l$
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:java.io.tmpdir=/tmp
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:java.compiler=<NA>
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:os.name=Linux
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:os.arch=amd64
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:os.version=3.13.0-40-generic
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:user.name=dimsam
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:user.home=/home/dimsam
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:user.dir=/usr/local/storm/bin
> 2014-11-26 11:35:33 b.s.d.supervisor [INFO] Starting Supervisor with conf {"dev.zookeeper.path" "/tmp/dev-storm-zookeeper", "topology.tick.tuple.freq.secs" nil, "topology.builtin.metrics.bucket.size.secs" 60, "topology.fall.back.on.java.serialization" true, "topology.ma$
> 2014-11-26 11:35:34 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
> 2014-11-26 11:35:34 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=195.251.117.209:2181 sessionTimeout=20000 watcher=org.apache.curator.ConnectionState@4dddb4e
> 2014-11-26 11:35:34 o.a.z.ClientCnxn [INFO] Opening socket connection to server themis.iti.gr/195.251.117.209:2181. Will not attempt to authenticate using SASL (unknown error)
> 2014-11-26 11:35:34 o.a.z.ClientCnxn [INFO] Socket connection established to themis.iti.gr/195.251.117.209:2181, initiating session
> 2014-11-26 11:35:34 o.a.z.ClientCnxn [INFO] Session establishment complete on server themis.iti.gr/195.251.117.209:2181, sessionid = 0x149eb6ae8d10006, negotiated timeout = 20000
> 2014-11-26 11:35:34 o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
> 2014-11-26 11:35:34 o.a.c.f.s.ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
> 2014-11-26 11:35:34 b.s.zookeeper [INFO] Zookeeper state update: :connected:none
> 2014-11-26 11:35:35 o.a.z.ClientCnxn [INFO] EventThread shut down
> 2014-11-26 11:35:35 o.a.z.ZooKeeper [INFO] Session: 0x149eb6ae8d10006 closed
> 2014-11-26 11:35:35 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
> 2014-11-26 11:35:35 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=195.251.117.209:2181/storm sessionTimeout=20000 watcher=org.apache.curator.ConnectionState@4e451d76
> 2014-11-26 11:35:35 o.a.z.ClientCnxn [INFO] Opening socket connection to server themis.iti.gr/195.251.117.209:2181. Will not attempt to authenticate using SASL (unknown error)
> 2014-11-26 11:35:35 o.a.z.ClientCnxn [INFO] Socket connection established to themis.iti.gr/195.251.117.209:2181, initiating session
> 2014-11-26 11:35:35 o.a.z.ClientCnxn [INFO] Session establishment complete on server themis.iti.gr/195.251.117.209:2181, sessionid = 0x149eb6ae8d10007, negotiated timeout = 20000
> 2014-11-26 11:35:35 o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
> 2014-11-26 11:35:35 o.a.c.f.s.ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
> 2014-11-26 11:35:35 b.s.d.supervisor [INFO] Starting supervisor with id ea561988-508d-4593-9873-00f15736a6bf at host Ubuntu14super1
> 2014-11-26 11:35:36 b.s.event [ERROR] Error when processing event
> java.lang.RuntimeException: java.io.EOFException
>         at backtype.storm.utils.Utils.deserialize(Utils.java:93) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.utils.LocalState.snapshot(LocalState.java:45) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.utils.LocalState.get(LocalState.java:56) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:207) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at clojure.lang.AFn.applyToHelper(AFn.java:161) [clojure-1.5.1.jar:na]
>         at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
>         at clojure.core$apply.invoke(core.clj:619) ~[clojure-1.5.1.jar:na]
>         at clojure.core$partial$fn__4190.doInvoke(core.clj:2396) ~[clojure-1.5.1.jar:na]
>         at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.5.1.jar:na]
>         at backtype.storm.event$event_manager$fn__2378.invoke(event.clj:39) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> Caused by: java.io.EOFException: null
>         at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325) ~[na:1.7.0_72]
>         at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794) ~[na:1.7.0_72]
>         at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801) ~[na:1.7.0_72]
>         at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299) ~[na:1.7.0_72]
>         at backtype.storm.utils.Utils.deserialize(Utils.java:88) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         ... 11 common frames omitted
> 2014-11-26 11:35:36 b.s.event [ERROR] Error when processing event
> java.lang.RuntimeException: java.io.EOFException
>         at backtype.storm.utils.Utils.deserialize(Utils.java:93) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.utils.LocalState.snapshot(LocalState.java:45) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.utils.LocalState.get(LocalState.java:56) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.daemon.supervisor$mk_synchronize_supervisor$this__6330.invoke(supervisor.clj:307) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.event$event_manager$fn__2378.invoke(event.clj:39) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> Caused by: java.io.EOFException: null
>         at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325) ~[na:1.7.0_72]
>         at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794) ~[na:1.7.0_72]
>         at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801) ~[na:1.7.0_72]
>         at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299) ~[na:1.7.0_72]
>         at backtype.storm.utils.Utils.deserialize(Utils.java:88) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         ... 6 common frames omitted
> 2014-11-26 11:35:36 b.s.util [INFO] Halting process: ("Error when processing an event")
>
> The first line is from when the storm supervisor was running properly!
> After a node restart the supervisor will not start, and I get the rest of
> the log.
>
> By "to run successfully on a node, Storm has to be redeployed on that
> node and reconfigured (storm.yaml)"
> I mean that in order to run the supervisor/nimbus again I have to
> redeploy Storm on every node that fails to start! I do not change the
> config in storm.yaml; I simply have to rewrite it with the same values.
>
> Thanks again!
>
> 2014-11-25 17:53 GMT+02:00 Harsha <[email protected]>:
>
> Dimitris,
> can you give more details on this:
> "Everything works fine with topologies etc., up to the point that the
> Storm cluster needs to be restarted.
> In that case, for storm.sh (nimbus, super, ui) to run successfully on a
> node, Storm has to be redeployed on that node and reconfigured (storm.yaml)."
>
> Is the cluster going down when you deploy a topology?
> "to run successfully on a node, Storm has to be redeployed on that node
> and reconfigured (storm.yaml)."
>
> What do you mean by reconfiguration? Do you change the storm.yaml values
> from the previous deployment?
>
> -Harsha
>
> On Tue, Nov 25, 2014, at 06:24 AM, Samit Sasan wrote:
>
> Can you share the logs?
>
> -Samit
>
> On Tue, Nov 25, 2014 at 6:12 PM, Dimitris Samaras <[email protected]> wrote:
>
> Hi all,
>
> We are currently testing the Storm framework with 4 VM nodes (1 nimbus, 3
> supervisors) and a single-node ZooKeeper cluster for the Storm cluster
> management.
> Everything works fine with topologies etc., up to the point that the
> Storm cluster needs to be restarted.
> In that case, for storm.sh (nimbus, super, ui) to run successfully on a
> node, Storm has to be redeployed on that node and reconfigured (storm.yaml).
>
> Any thoughts?
> Thanks in advance,
> Dimitris
