I have resolved this problem several times.

There are two possible root causes:
(1) The local ephemeral (temporary) port range conflicts with Storm's ports.
(2) The old worker was not killed when the topology was killed.

First, run "ps -ef | grep 67xx" (replace 67xx with the worker port, e.g.
6702) to check whether the second cause applies. If no worker is using the
port, it is most likely an ephemeral port conflict. In that case, run the
following as root:

echo 'net.ipv4.ip_local_port_range = 10240 65535' >> /etc/sysctl.conf
/sbin/sysctl -p

This happens because every outgoing network connection binds the client to
an ephemeral port. If the Linux ephemeral port range on your machine covers
Storm's ports (for example, a range of 1024 to 65535), a client socket can
occasionally take a port that a worker needs.
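
To see whether the current ephemeral range actually overlaps the worker
ports, a small shell sketch like the following may help. It assumes the
worker ports are 6700 to 6703 (Storm's stock supervisor.slots.ports; adjust
to match your storm.yaml) and reads the range from
/proc/sys/net/ipv4/ip_local_port_range, which exists only on Linux. The
function names are made up for illustration.

```shell
#!/bin/sh
# Return 0 if port $1 lies inside the inclusive range $2..$3.
port_in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Print every port from the argument list that falls inside the range.
# Usage: conflicting_ports LOW HIGH PORT...
conflicting_ports() {
    range_low=$1; range_high=$2; shift 2
    for p in "$@"; do
        if port_in_range "$p" "$range_low" "$range_high"; then
            echo "$p"
        fi
    done
}

# On Linux, the active range is exposed in /proc; skipped elsewhere.
# 6700-6703 are assumed worker ports, not values taken from this thread.
range_file=/proc/sys/net/ipv4/ip_local_port_range
if [ -r "$range_file" ]; then
    read -r cur_low cur_high < "$range_file"
    conflicting_ports "$cur_low" "$cur_high" 6700 6701 6702 6703
fi
```

If this prints nothing, the ephemeral range does not cover those ports and
the sysctl change above is probably not what you need.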


If a live worker is still using the port, Storm failed to kill the worker
when the topology was killed. This can happen when CPU usage on the machine
is very high, but it is rare. JStorm (https://github.com/alibaba/jstorm)
fixes this issue, and since JStorm has been donated to Storm, the fix
should appear in Storm soon.
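
To tell the two cases apart, one option is to ask the OS which process is
listening on the worker port. The sketch below assumes lsof is installed
(the ps/grep check above works too) and that you run it as root so that
other users' processes are visible. The function names are hypothetical,
and it only reports; killing a stale worker is left to you.

```shell
#!/bin/sh
# Print the PIDs listening on TCP port $1 (empty output if none).
# Assumes lsof is available; errors are hidden, so a missing lsof
# looks the same as "no listener".
pids_listening_on() {
    lsof -t -iTCP:"$1" -sTCP:LISTEN 2>/dev/null
}

# Turn a port and a (possibly empty) PID list into a diagnosis line.
classify_port() {
    if [ -n "$2" ]; then
        echo "port $1 is held (possibly a stale worker), PID(s): $2"
    else
        echo "port $1 has no listener: suspect an ephemeral port clash"
    fi
}
```

For example, with 6702 as the worker port:
classify_port 6702 "$(pids_listening_on 6702 | xargs)"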


By the way, I see you are using an old version of Storm. Why not try Storm
0.9.x, whose performance is much improved?

2015-11-01 17:26 GMT+08:00 researcher cs <prog.researc...@gmail.com>:

> after submitting topology
>
> supervisor log file
>
> 2015-11-01 09:59:48 executor [INFO] Loading executor b-1:[3 3]
> 2015-11-01 09:59:50 executor [INFO] Loaded executor tasks b-1:[3 3]
> [INFO] Launching worker with assignment
> #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id
> "df-1-1446364738", :executors ([2 2] [35 35] [5 5] [38 38] [8 8] [41 41]
> [11 11] [44 44] [14 14] [47 47] [17 17] [50 50] [20 20] [53 53] [23 23]
> [56 56] [26 26] [29 29] [32 32])} for this supervisor
> fdd1e16a-650e-4d12-90f4-cb87336f29c3 on port 6702 with id
> eb224d37-bc81-43ec-bbb9-8e5897c203fa
> after
> 2015-11-01 09:59:02 supervisor [INFO] 0af4b32b-e14d-4f64-ba03-d61d79fa6405 still hasn't started
> 2015-11-01 09:59:03 supervisor [INFO] 0af4b32b-e14d-4f64-ba03-d61d79fa6405 still hasn't started
> !!!!!!
>
> but worker has data like
>
> 2015-11-01 09:59:48 executor [INFO] Loading executor b-1:[3 3]
> 2015-11-01 09:59:50 executor [INFO] Loaded executor tasks b-1:[3 3]
>
> Then I ran the command that the supervisor uses to launch the worker, to
> see exactly where the error is, and found this in the worker log file:
> [ERROR] Async loop died!
> org.zeromq.ZMQException: Address already in use(0x62)
>     at org.zeromq.ZMQ$Socket.bind(Native Method)
>     at zilch.mq$bind.invoke(mq.clj:69)
>     at backtype.storm.messaging.zmq.ZMQContext.bind(zmq.clj:57)
>     at backtype.storm.messaging.loader$launch_receive_thread_BANG_$fn__1629.invoke(loader.clj:26)
>     at backtype.storm.util$async_loop$fn__465.invoke(util.clj:375)
>     at clojure.lang.AFn.run(AFn.java:24)
>     at java.lang.Thread.run(Unknown Source)
> 2015-11-01 10:03:19 util [INFO] Halting process
>
>
> I have been stuck on this error for more than 3 weeks! I really hope
> someone can help.
>
