Hello,
I upgraded our cluster and topologies to Storm 1.0.3 (from 0.9.6 or so).
Now, when one of the bolt executors in a worker hits an uncaught exception
(a DB write timeout), the worker logs:

2017-02-22 09:49:02.873 o.a.s.d.executor Thread-11-bolt-executor[9 9] [ERROR]
java.lang.RuntimeException: redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
        at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:466) ~[storm-core-1.0.3.jar:1.0.3]
        at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:432) ~[storm-core-1.0.3.jar:1.0.3]
        at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) ~[storm-core-1.0.3.jar:1.0.3]
        at org.apache.storm.daemon.executor$fn__4973$fn__4986$fn__5039.invoke(executor.clj:846) ~[storm-core-1.0.3.jar:1.0.3]
        at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:484) [storm-core-1.0.3.jar:1.0.3]
        at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
        at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:201) ~[stormjar.jar:?]
        ....


After that, the worker logs only

2017-02-22 09:49:02.897 o.a.s.d.executor Thread-11-bolt-executor[9 9] [INFO] Got interrupted excpetion shutting thread down...

and then does nothing. Nothing shows up in the supervisor or nimbus logs either.

Can anyone please help me out? What do I need to do to get the
worker/executor to restart after such a failure?

(PS: This may be related to
https://issues.apache.org/jira/browse/STORM-2194)
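
One possible stopgap, independent of whatever the restart problem turns out to be, is to keep the timeout from escaping the bolt at all. The sketch below is plain Java with no Storm or Jedis dependency. The names RetryingWriter, Write, and writeWithRetry and the attempt count are hypothetical, made up for illustration, and are not part of the topology above or of any Storm API. It retries a write a bounded number of times and reports success or failure instead of throwing.

```java
public class RetryingWriter {

    /** Hypothetical callback type: a write that may throw (e.g. a socket timeout). */
    @FunctionalInterface
    public interface Write {
        void run() throws Exception;
    }

    /**
     * Runs the write up to maxAttempts times.
     * Returns true as soon as one attempt succeeds, false if every attempt
     * threw or the thread was interrupted. Never rethrows the write's exception.
     */
    public static boolean writeWithRetry(Write write, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                write.run();
                return true;
            } catch (InterruptedException e) {
                // Do not swallow interruption: restore the flag and stop retrying.
                Thread.currentThread().interrupt();
                return false;
            } catch (Exception e) {
                System.err.println("write attempt " + attempt + " of "
                        + maxAttempts + " failed: " + e);
            }
        }
        return false;
    }
}
```

Inside a bolt's execute method, the boolean result could then decide between collector.ack(tuple) and collector.fail(tuple), so a failed write leads to a tuple replay rather than an uncaught exception in the executor thread. Whether a replay is acceptable depends on the topology.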

Thanks,
Martin
