Are you running nimbus and the supervisors in the background? It looks
like you are sshing into the machines and running ./bin/storm nimbus in
the foreground, which will get killed when you exit the ssh session.
Make sure you use supervisord (http://supervisord.org/) to run nimbus
and the supervisors.
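
As a rough sketch of what that could look like, here is a supervisord
config fragment. The install path /opt/storm, the "storm" user, and the
log file locations are assumptions for illustration, not details from
your setup; adjust them to match your machines. The nimbus entry would
go on Node 1 and the supervisor entry on Nodes 2 and 3.

```ini
; Hypothetical paths and user; change to match your installation.
[program:storm-nimbus]
command=/opt/storm/bin/storm nimbus       ; runs in the foreground, as supervisord expects
directory=/opt/storm
user=storm
autostart=true                            ; start when supervisord starts
autorestart=true                          ; restart if the daemon exits
stopasgroup=true                          ; stop the launcher and the child JVM together
killasgroup=true
redirect_stderr=true
stdout_logfile=/var/log/storm/nimbus.out

[program:storm-supervisor]
command=/opt/storm/bin/storm supervisor
directory=/opt/storm
user=storm
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
redirect_stderr=true
stdout_logfile=/var/log/storm/supervisor.out
```

After adding the file, "supervisorctl reread" followed by
"supervisorctl update" should pick it up, and "supervisorctl status"
shows whether the daemons are running. Because supervisord owns the
processes, they no longer depend on your ssh session.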

On Sat, Feb 7, 2015, at 11:04 AM, Vineet Mishra wrote:
>
> Including Subject!
>
> On Sun, Feb 8, 2015 at 12:33 AM, Vineet Mishra
> <clearmido...@gmail.com> wrote:
>> Hi All,
>>
>> I am running a Kafka Storm topology in distributed mode. It runs
>> fine initially, when I start the cluster (a 3-node cluster), deploy
>> the Storm topology, and leave it to run. But the whole cluster often
>> goes down (nimbus, supervisor, workers), and this mostly happens
>> when I submit the topology and then disconnect my session from the
>> machine.
>>
>> I noticed that one of the worker nodes is throwing this error:
>>
>> java.lang.RuntimeException: java.lang.RuntimeException: Client is
>> being closed, and does not take requests any more
>>
>> The config and detailed stack trace are provided below.
>>
>> Node 1 - Nimbus, UI
>> Node 2 - Supervisor, Worker
>> Node 3 - Supervisor, Worker
>>
>> 2015-02-07T23:01:25.884+0530 b.s.d.worker [INFO] Shutting down worker KafkaConsumerTopologyy-1-1423329275 9d98d0b4-1bb4-42e9-9a72-a67b82c64b2c 6703
>> 2015-02-07T23:01:25.884+0530 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-ip-20-0-0-78/20.0.0.78:6703
>> 2015-02-07T23:01:25.885+0530 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-ip-20-0-0-78/20.0.0.78:6703..., timeout: 600000ms, pendings: 0
>> 2015-02-07T23:01:25.886+0530 b.s.d.worker [INFO] Shutting down receive thread
>> 2015-02-07T23:01:25.886+0530 o.a.s.c.r.ExponentialBackoffRetry [WARN] maxRetries too large (300). Pinning to 29
>> 2015-02-07T23:01:25.886+0530 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries [300]
>> 2015-02-07T23:01:25.887+0530 b.s.m.n.Client [INFO] New Netty Client, connect to localhost, 6703, config: , buffer_size: 5242880
>> 2015-02-07T23:01:25.887+0530 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-localhost/127.0.0.1:6703... [0]
>> 2015-02-07T23:01:25.887+0530 b.s.m.loader [INFO] Shutting down receiving-thread: [KafkaConsumerTopologyy-1-1423329275, 6703]
>> 2015-02-07T23:01:25.893+0530 b.s.m.n.Client [INFO] connection established to a remote host Netty-Client-localhost/127.0.0.1:6703, [id: 0x8f71aaa0, /127.0.0.1:59427 => localhost/127.0.0.1:6703]
>> 2015-02-07T23:01:25.893+0530 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-localhost/127.0.0.1:6703
>> 2015-02-07T23:01:25.893+0530 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-localhost/127.0.0.1:6703..., timeout: 600000ms, pendings: 0
>> 2015-02-07T23:01:25.894+0530 b.s.m.loader [INFO] Waiting for receiving-thread:[KafkaConsumerTopologyy-1-1423329275, 6703] to die
>> 2015-02-07T23:01:25.895+0530 b.s.m.loader [INFO] Shutdown receiving-thread: [KafkaConsumerTopologyy-1-1423329275, 6703]
>> 2015-02-07T23:01:25.895+0530 b.s.d.worker [INFO] Shut down receive thread
>> 2015-02-07T23:01:25.895+0530 b.s.d.worker [INFO] Terminating messaging context
>> 2015-02-07T23:01:25.895+0530 b.s.d.worker [INFO] Shutting down executors
>> 2015-02-07T23:01:25.895+0530 b.s.d.executor [INFO] Shutting down executor KafkaSpout:[3 3]
>> 2015-02-07T23:01:25.896+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.896+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.897+0530 b.s.util [ERROR] Async loop died!
>> java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.disruptor$consume_loop_STAR_$fn__1460.invoke(disruptor.clj:94) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.util$async_loop$fn__464.invoke(util.clj:463) ~[storm-core-0.9.3.jar:0.9.3]
>>     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
>> Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
>>     at backtype.storm.messaging.netty.Client.send(Client.java:185) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.utils.TransferDrainer.send(TransferDrainer.java:54) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__3730$fn__3731.invoke(worker.clj:330) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__3730.invoke(worker.clj:328) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.disruptor$clojure_handler$reify__1447.onEvent(disruptor.clj:58) ~[storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.9.3.jar:0.9.3]
>>     ... 6 common frames omitted
>> 2015-02-07T23:01:25.900+0530 o.a.z.ZooKeeper [INFO] Session: 0x14b58cc25de0115 closed
>> 2015-02-07T23:01:25.900+0530 o.a.z.ClientCnxn [INFO] EventThread shut down
>> 2015-02-07T23:01:25.900+0530 b.s.d.executor [INFO] Shut down executor KafkaSpout:[3 3]
>> 2015-02-07T23:01:25.901+0530 b.s.d.executor [INFO] Shutting down executor KafkaSpout:[5 5]
>> 2015-02-07T23:01:25.901+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.901+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.903+0530 o.a.z.ZooKeeper [INFO] Session: 0x14b58cc25de0117 closed
>> 2015-02-07T23:01:25.903+0530 o.a.z.ClientCnxn [INFO] EventThread shut down
>> 2015-02-07T23:01:25.903+0530 b.s.d.executor [INFO] Shut down executor KafkaSpout:[5 5]
>> 2015-02-07T23:01:25.903+0530 b.s.d.executor [INFO] Shutting down executor KafkaSpout:[7 7]
>> 2015-02-07T23:01:25.904+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.904+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.905+0530 o.a.z.ZooKeeper [INFO] Session: 0x14b58cc25de0114 closed
>> 2015-02-07T23:01:25.905+0530 o.a.z.ClientCnxn [INFO] EventThread shut down
>> 2015-02-07T23:01:25.906+0530 b.s.d.executor [INFO] Shut down executor KafkaSpout:[7 7]
>> 2015-02-07T23:01:25.906+0530 b.s.d.executor [INFO] Shutting down executor KafkaSpout:[9 9]
>> 2015-02-07T23:01:25.906+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.906+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.906+0530 b.s.util [ERROR] Halting process: ("Async loop died!")
>> java.lang.RuntimeException: ("Async loop died!")
>>     at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.3.jar:0.9.3]
>>     at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]
>>     at backtype.storm.disruptor$consume_loop_STAR_$fn__1458.invoke(disruptor.clj:92) [storm-core-0.9.3.jar:0.9.3]
>>     at backtype.storm.util$async_loop$fn__464.invoke(util.clj:473) [storm-core-0.9.3.jar:0.9.3]
>>     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
>> 2015-02-07T23:01:25.908+0530 o.a.z.ZooKeeper [INFO] Session: 0x14b58cc25de0116 closed
>> 2015-02-07T23:01:25.908+0530 o.a.z.ClientCnxn [INFO] EventThread shut down
>> 2015-02-07T23:01:25.908+0530 b.s.d.executor [INFO] Shut down executor KafkaSpout:[9 9]
>> 2015-02-07T23:01:25.909+0530 b.s.d.executor [INFO] Shutting down executor __acker:[11 11]
>> 2015-02-07T23:01:25.909+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.909+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.909+0530 b.s.d.executor [INFO] Shut down executor __acker:[11 11]
>> 2015-02-07T23:01:25.909+0530 b.s.d.executor [INFO] Shutting down executor __system:[-1 -1]
>> 2015-02-07T23:01:25.910+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.910+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.910+0530 b.s.d.executor [INFO] Shut down executor __system:[-1 -1]
>> 2015-02-07T23:01:25.910+0530 b.s.d.executor [INFO] Shutting down executor FileBolt:[1 1]
>> 2015-02-07T23:01:25.910+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.910+0530 b.s.util [INFO] Async loop interrupted!
>> 2015-02-07T23:01:25.911+0530 b.s.d.executor [INFO] Shut down executor FileBolt:[1 1]
>> 2015-02-07T23:01:25.911+0530 b.s.d.worker [INFO] Shut down executors
>> 2015-02-07T23:01:25.916+0530 b.s.d.worker [INFO] Shutting down transfer thread
>>
>> URGENT CALL!
>>
>> Thanks!
>
