ryan.jin created STORM-2596:
-------------------------------

             Summary: Storm Worker not reconnect the Netty Client
                 Key: STORM-2596
                 URL: https://issues.apache.org/jira/browse/STORM-2596
             Project: Apache Storm
          Issue Type: Bug
          Components: storm-core
    Affects Versions: 1.1.0
            Reporter: ryan.jin
            Priority: Critical


I reported a similar bug in 
[STORM-2561|https://issues.apache.org/jira/browse/STORM-2561] against version 
0.10.1.

I have since upgraded Storm to 1.1.0, but today the bug appeared again.

The worker.log shows
{code:java}
$ cat worker.log|grep '10.24.40.254:6812'|more
2017-06-22 15:14:25.295 o.a.s.m.n.Client main [INFO] creating Netty Client, connecting to 10.24.40.254:6812, bufferSize: 5242880
2017-06-23 11:23:32.570 o.a.s.m.n.StormClientHandler client-worker-1 [INFO] Connection to /10.24.40.254:6812 failed:
2017-06-23 11:23:35.654 o.a.s.m.n.Client refresh-connections-timer [INFO] closing Netty Client Netty-Client-/10.24.40.254:6812
2017-06-23 11:23:35.655 o.a.s.m.n.Client refresh-connections-timer [INFO] waiting up to 600000 ms to send 0 pending messages to Netty-Client-/10.24.40.254:6812
2017-06-23 14:57:03.352 o.a.s.m.n.Client Thread-10-disruptor-worker-transfer-queue [ERROR] discarding 1 messages because the Netty client to Netty-Client-/10.24.40.254:6812 is being closed
2017-06-23 14:57:59.777 o.a.s.m.n.Client Thread-10-disruptor-worker-transfer-queue [ERROR] discarding 1 messages because the Netty client to Netty-Client-/10.24.40.254:6812 is being closed
2017-06-23 14:59:16.038 o.a.s.m.n.Client Thread-10-disruptor-worker-transfer-queue [ERROR] discarding 1 messages because the Netty client to Netty-Client-/10.24.40.254:6812 is being closed
2017-06-23 15:01:27.092 o.a.s.m.n.Client Thread-10-disruptor-worker-transfer-queue [ERROR] discarding 1 messages because the Netty client to Netty-Client-/10.24.40.254:6812 is being closed
2017-06-23 15:04:08.654 o.a.s.m.n.Client Thread-10-disruptor-worker-transfer-queue [ERROR] discarding 1 messages because the Netty client to Netty-Client-/10.24.40.254:6812 is being closed
2017-06-23 15:06:59.777 o.a.s.m.n.Client Thread-10-disruptor-worker-transfer-queue [ERROR] discarding 1 messages because the Netty client to Netty-Client-/10.24.40.254:6812 is being closed
{code}
The worker closed the Netty client at 2017-06-23 11:23:35.654 and never 
created a new one, so every message that worker later sent to 
10.24.40.254:6812 was discarded.
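
To check other workers for the same pattern, a small script along these lines 
may help. It is only a sketch: the regular expressions are derived from the 
message texts in the excerpt above, the function name summarize and the 
report fields are made up for illustration, and it assumes each log entry 
sits on a single physical line.

```python
import re
from datetime import datetime

# Patterns are derived from the message texts in the worker.log excerpt;
# they are assumptions about the log format, not a Storm API.
TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
CREATE_RE = re.compile(TS + r" .*creating Netty Client, connecting to ([^,\s]+)")
CLOSE_RE = re.compile(TS + r" .*closing Netty Client Netty-Client-/(\S+)")
DISCARD_RE = re.compile(
    TS + r" .*discarding (\d+) messages because the Netty client to "
    r"Netty-Client-/(\S+) is being closed"
)


def parse_ts(text):
    """Parse a worker.log timestamp such as '2017-06-23 11:23:35.654'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def summarize(lines):
    """Return one report entry per peer whose Netty client was closed.

    Each entry records when the client was closed, whether a later
    'creating Netty Client' line for the same peer was seen, and how many
    messages were reported as discarded after the close.
    """
    report = {}
    for line in lines:
        m = CLOSE_RE.search(line)
        if m:
            report[m.group(2)] = {
                "closed_at": parse_ts(m.group(1)),
                "recreated": False,
                "discarded": 0,
                "last_discard": None,
            }
            continue
        m = CREATE_RE.search(line)
        if m and m.group(2) in report:
            # A client for this peer was created again after the close.
            report[m.group(2)]["recreated"] = True
            continue
        m = DISCARD_RE.search(line)
        if m and m.group(3) in report:
            entry = report[m.group(3)]
            entry["discarded"] += int(m.group(2))
            entry["last_discard"] = parse_ts(m.group(1))
    return report


if __name__ == "__main__":
    with open("worker.log", encoding="utf-8", errors="replace") as fh:
        for peer, entry in summarize(fh).items():
            print(peer, entry)
```

A peer that shows recreated as False together with a growing discarded count 
matches the behaviour described in this report.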
 
Around that time, the Storm node (10.24.40.254:6812) ran out of memory (OOM):

{code:java}
2017-06-23 11:22:59.623 g.a.s.s.t.SolrPersistApi pool-10-thread-8 [INFO] write 200 doc at:invoketrace success cost 228060
2017-06-23 11:22:59.625 g.a.s.s.t.SolrPersistApi pool-10-thread-5 [INFO] write 66 doc at:invoketrace success cost 226739
2017-06-23 11:22:59.626 g.a.s.s.t.SolrPersistApi pool-10-thread-7 [INFO] write 200 doc at:invoketrace success cost 167869
2017-06-23 11:23:32.242 STDERR Thread-2 [INFO] java.lang.OutOfMemoryError: Java heap space
2017-06-23 11:23:32.253 STDERR Thread-2 [INFO] Dumping heap to artifacts/heapdump ...
{code}





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
