Hi Harsha,

Thanks, it worked!

But I was wondering why it was failing earlier, even though I was starting it as a background process, something like

bin/storm nimbus &

and likewise for the other daemons.

Thanks!

On Mon, Feb 9, 2015 at 6:22 AM, Harsha <st...@harsha.io> wrote:

> Are you running nimbus and the supervisors in the background? It looks
> like you are sshing into the machines and running ./bin/storm nimbus in
> the foreground, which will get killed when you exit the ssh session.
> Make sure you use supervisord (http://supervisord.org/) to run nimbus
> and the supervisors.
>
> On Sat, Feb 7, 2015, at 11:04 AM, Vineet Mishra wrote:
>
> Including Subject!
>
> On Sun, Feb 8, 2015 at 12:33 AM, Vineet Mishra <clearmido...@gmail.com> wrote:
>
> Hi All,
>
> I am running a Kafka Storm topology in distributed mode. It runs fine
> initially, when I start the cluster (3-node cluster), deploy the Storm
> topology, and leave it to run.
> Often, though, the whole cluster goes down (nimbus, supervisors,
> workers), and this mostly happens when I submit the topology and then
> disconnect my session from the machine.
>
> I noticed that one of the worker nodes is throwing this error:
>
> java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>
> The config and detailed stack trace are provided below.
>
> Node 1 - Nimbus, UI
> Node 2 - Supervisor, Worker
> Node 3 - Supervisor, Worker
>
> 2015-02-07T23:01:25.884+0530 b.s.d.worker [INFO] Shutting down worker KafkaConsumerTopologyy-1-1423329275 9d98d0b4-1bb4-42e9-9a72-a67b82c64b2c 6703
> 2015-02-07T23:01:25.884+0530 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-ip-20-0-0-78/20.0.0.78:6703
> 2015-02-07T23:01:25.885+0530 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-ip-20-0-0-78/20.0.0.78:6703..., timeout: 600000ms, pendings: 0
> 2015-02-07T23:01:25.886+0530 b.s.d.worker [INFO] Shutting down receive thread
> 2015-02-07T23:01:25.886+0530 o.a.s.c.r.ExponentialBackoffRetry [WARN] maxRetries too large (300). Pinning to 29
> 2015-02-07T23:01:25.886+0530 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries [300]
> 2015-02-07T23:01:25.887+0530 b.s.m.n.Client [INFO] New Netty Client, connect to localhost, 6703, config: , buffer_size: 5242880
> 2015-02-07T23:01:25.887+0530 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-localhost/127.0.0.1:6703... [0]
> 2015-02-07T23:01:25.887+0530 b.s.m.loader [INFO] Shutting down receiving-thread: [KafkaConsumerTopologyy-1-1423329275, 6703]
> 2015-02-07T23:01:25.893+0530 b.s.m.n.Client [INFO] connection established to a remote host Netty-Client-localhost/127.0.0.1:6703, [id: 0x8f71aaa0, /127.0.0.1:59427 => localhost/127.0.0.1:6703]
> 2015-02-07T23:01:25.893+0530 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-localhost/127.0.0.1:6703
> 2015-02-07T23:01:25.893+0530 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-localhost/127.0.0.1:6703..., timeout: 600000ms, pendings: 0
> 2015-02-07T23:01:25.894+0530 b.s.m.loader [INFO] Waiting for receiving-thread:[KafkaConsumerTopologyy-1-1423329275, 6703] to die
> 2015-02-07T23:01:25.895+0530 b.s.m.loader [INFO] Shutdown receiving-thread: [KafkaConsumerTopologyy-1-1423329275, 6703]
> 2015-02-07T23:01:25.895+0530 b.s.d.worker [INFO] Shut down receive thread
> 2015-02-07T23:01:25.895+0530 b.s.d.worker [INFO] Terminating messaging context
> 2015-02-07T23:01:25.895+0530 b.s.d.worker [INFO] Shutting down executors
> 2015-02-07T23:01:25.895+0530 b.s.d.executor [INFO] Shutting down executor KafkaSpout:[3 3]
> 2015-02-07T23:01:25.896+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.896+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.897+0530 b.s.util [ERROR] Async loop died!
> java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.disruptor$consume_loop_STAR_$fn__1460.invoke(disruptor.clj:94) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.util$async_loop$fn__464.invoke(util.clj:463) ~[storm-core-0.9.3.jar:0.9.3]
>     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>     at backtype.storm.messaging.netty.Client.send(Client.java:185) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.utils.TransferDrainer.send(TransferDrainer.java:54) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__3730$fn__3731.invoke(worker.clj:330) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__3730.invoke(worker.clj:328) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.disruptor$clojure_handler$reify__1447.onEvent(disruptor.clj:58) ~[storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.9.3.jar:0.9.3]
>     ... 6 common frames omitted
> 2015-02-07T23:01:25.900+0530 o.a.z.ZooKeeper [INFO] Session: 0x14b58cc25de0115 closed
> 2015-02-07T23:01:25.900+0530 o.a.z.ClientCnxn [INFO] EventThread shut down
> 2015-02-07T23:01:25.900+0530 b.s.d.executor [INFO] Shut down executor KafkaSpout:[3 3]
> 2015-02-07T23:01:25.901+0530 b.s.d.executor [INFO] Shutting down executor KafkaSpout:[5 5]
> 2015-02-07T23:01:25.901+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.901+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.903+0530 o.a.z.ZooKeeper [INFO] Session: 0x14b58cc25de0117 closed
> 2015-02-07T23:01:25.903+0530 o.a.z.ClientCnxn [INFO] EventThread shut down
> 2015-02-07T23:01:25.903+0530 b.s.d.executor [INFO] Shut down executor KafkaSpout:[5 5]
> 2015-02-07T23:01:25.903+0530 b.s.d.executor [INFO] Shutting down executor KafkaSpout:[7 7]
> 2015-02-07T23:01:25.904+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.904+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.905+0530 o.a.z.ZooKeeper [INFO] Session: 0x14b58cc25de0114 closed
> 2015-02-07T23:01:25.905+0530 o.a.z.ClientCnxn [INFO] EventThread shut down
> 2015-02-07T23:01:25.906+0530 b.s.d.executor [INFO] Shut down executor KafkaSpout:[7 7]
> 2015-02-07T23:01:25.906+0530 b.s.d.executor [INFO] Shutting down executor KafkaSpout:[9 9]
> 2015-02-07T23:01:25.906+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.906+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.906+0530 b.s.util [ERROR] Halting process: ("Async loop died!")
> java.lang.RuntimeException: ("Async loop died!")
>     at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.3.jar:0.9.3]
>     at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]
>     at backtype.storm.disruptor$consume_loop_STAR_$fn__1458.invoke(disruptor.clj:92) [storm-core-0.9.3.jar:0.9.3]
>     at backtype.storm.util$async_loop$fn__464.invoke(util.clj:473) [storm-core-0.9.3.jar:0.9.3]
>     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> 2015-02-07T23:01:25.908+0530 o.a.z.ZooKeeper [INFO] Session: 0x14b58cc25de0116 closed
> 2015-02-07T23:01:25.908+0530 o.a.z.ClientCnxn [INFO] EventThread shut down
> 2015-02-07T23:01:25.908+0530 b.s.d.executor [INFO] Shut down executor KafkaSpout:[9 9]
> 2015-02-07T23:01:25.909+0530 b.s.d.executor [INFO] Shutting down executor __acker:[11 11]
> 2015-02-07T23:01:25.909+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.909+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.909+0530 b.s.d.executor [INFO] Shut down executor __acker:[11 11]
> 2015-02-07T23:01:25.909+0530 b.s.d.executor [INFO] Shutting down executor __system:[-1 -1]
> 2015-02-07T23:01:25.910+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.910+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.910+0530 b.s.d.executor [INFO] Shut down executor __system:[-1 -1]
> 2015-02-07T23:01:25.910+0530 b.s.d.executor [INFO] Shutting down executor FileBolt:[1 1]
> 2015-02-07T23:01:25.910+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.910+0530 b.s.util [INFO] Async loop interrupted!
> 2015-02-07T23:01:25.911+0530 b.s.d.executor [INFO] Shut down executor FileBolt:[1 1]
> 2015-02-07T23:01:25.911+0530 b.s.d.worker [INFO] Shut down executors
> 2015-02-07T23:01:25.916+0530 b.s.d.worker [INFO] Shutting down transfer thread
>
> URGENT CALL!
>
> Thanks!
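
For reference, the supervisord setup suggested earlier in this thread might look roughly like the fragment below. This is only a sketch, not a tested configuration: the install path /opt/storm, the storm user, the program names, and the log file locations are all assumptions and would need to match the actual machines. The option names themselves (command, directory, user, autostart, autorestart, stdout_logfile, redirect_stderr) are standard supervisord [program:x] settings.

```ini
; Node 1 (Nimbus). Paths, user, and log locations are assumptions.
[program:storm-nimbus]
command=/opt/storm/bin/storm nimbus
directory=/opt/storm
user=storm
autostart=true
autorestart=true
stdout_logfile=/var/log/storm/nimbus.out
redirect_stderr=true

; Node 2 and Node 3 (Supervisors). Same assumptions as above.
[program:storm-supervisor]
command=/opt/storm/bin/storm supervisor
directory=/opt/storm
user=storm
autostart=true
autorestart=true
stdout_logfile=/var/log/storm/supervisor.out
redirect_stderr=true
```

supervisord expects each program to stay in the foreground, so the commands are given without a trailing &. Since supervisord itself runs as a daemon, the Storm processes it starts are not tied to any ssh session, and autorestart=true brings a daemon back up if it exits.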