For anyone who is interested in this, we finally consider this is probably a 
non stable use of server memory issue,
Using -server -xms2G -xmx2G has successfully help us get rid of this issue.

Regards,
Yiming
_____________________________________________
From: Fang, Yiming [ICG-IT]
Sent: Monday, October 20, 2014 11:51 AM
To: [email protected]
Subject: RE: nimbus detect executor not alive and reassign tasks issue


Thanks Harsha for the help.

Yes, every tuple is acked once its successfully written to the DB.

Below attached nimbus and worker log for a restart scenario triggered recently:

Nimbus log:

2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[2 2] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[3 3] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[4 4] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[5 5] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[6 6] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[7 7] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[8 8] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[9 9] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[10 10] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[27 27] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[92 92] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[93 93] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[1 1] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[36 39] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[68 71] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[40 43] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[72 75] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[11 14] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[44 47] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[76 79] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[15 18] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[48 51] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[80 83] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[19 22] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[52 55] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[84 87] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[23 26] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[56 59] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[88 91] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[28 31] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[60 63] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[32 35] not alive
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Executor 
csl-realtime-tp-ap-equity-topology-1-1413267046:[64 67] not alive
2014-10-19 23:01:37 b.s.s.EvenScheduler [INFO] Available slots: 
(["e18633fb-6613-4afc-bea1-9941084508c5" 6702] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6701] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6700])
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Reassigning 
csl-realtime-tp-ap-equity-topology-1-1413267046 to 1 slots
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Reassign executors: [[2 2] [3 3] [4 4] 
[5 5] [6 6] [7 7] [8 8] [9 9] [10 10] [27 27] [92 92] [93 93] [1 1] [36 39] [68 
71] [40 43] [72 75] [11 14] [44 47] [76 79] [15 18] [48 51] [80 83] [19 22] [52 
55] [84 87] [23 26] [56 59] [88 91] [28 31] [60 63] [32 35] [64 67]]
2014-10-19 23:01:37 b.s.d.nimbus [INFO] Setting new assignment for topology id 
csl-realtime-tp-ap-equity-topology-1-1413267046: 
#backtype.storm.daemon.common.Assignment{:master-code-dir 
"/data/PBData/foundation/csl_realtime/data/storm/0.9.2/nimbus/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046",
 :node->host {"e18633fb-6613-4afc-bea1-9941084508c5" "pfdwlnx1u"}, 
:executor->node+port {[2 2] ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [3 
3] ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [4 4] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [5 5] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [6 6] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [7 7] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [8 8] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [9 9] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [10 10] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [27 27] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [92 92] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [93 93] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [1 1] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [36 39] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [68 71] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [40 43] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [72 75] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [11 14] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [44 47] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [76 79] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [15 18] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [48 51] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [80 83] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [19 22] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [52 55] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [84 87] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [23 26] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [56 59] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [88 91] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [28 31] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [60 63] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [32 35] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [64 67] 
["e18633fb-6613-4afc-bea1-9941084508c5" 6702]}, :executor->start-time-secs {[2 
2] 1413774097, [3 3] 1413774097, [4 4] 1413774097, [5 5] 1413774097, [6 6] 
1413774097, [7 7] 1413774097, [8 8] 1413774097, [9 9] 1413774097, [10 10] 
1413774097, [27 27] 1413774097, [92 92] 1413774097, [93 93] 1413774097, [1 1] 
1413774097, [36 39] 1413774097, [68 71] 1413774097, [40 43] 1413774097, [72 75] 
1413774097, [11 14] 1413774097, [44 47] 1413774097, [76 79] 1413774097, [15 18] 
1413774097, [48 51] 1413774097, [80 83] 1413774097, [19 22] 1413774097, [52 55] 
1413774097, [84 87] 1413774097, [23 26] 1413774097, [56 59] 1413774097, [88 91] 
1413774097, [28 31] 1413774097, [60 63] 1413774097, [32 35] 1413774097, [64 67] 
1413774097}}

Worker log 6702:


2014-10-19 22:59:33 s.k.KafkaUtils [INFO] Task [1/1] assigned [Partition{host= 
serverName:9092, partition=0}]
2014-10-19 22:59:33 s.k.ZkCoordinator [INFO] Task [1/1] Deleted partition 
managers: []
2014-10-19 22:59:33 s.k.ZkCoordinator [INFO] Task [1/1] New partition managers: 
[]
2014-10-19 22:59:33 s.k.ZkCoordinator [INFO] Task [1/1] Finished refreshing
2014-10-19 23:00:33 s.k.ZkCoordinator [INFO] Task [1/1] Refreshing partition 
manager connections
2014-10-19 23:00:33 s.k.DynamicBrokersReader [INFO] Read partition info from 
zookeeper: GlobalPartitionInformation{partitionMap={0=serverName:9092}}
2014-10-19 23:00:33 s.k.KafkaUtils [INFO] Task [1/1] assigned [Partition{host= 
serverName:9092, partition=0}]
2014-10-19 23:00:33 s.k.ZkCoordinator [INFO] Task [1/1] Deleted partition 
managers: []
2014-10-19 23:00:33 s.k.ZkCoordinator [INFO] Task [1/1] New partition managers: 
[]
2014-10-19 23:00:33 s.k.ZkCoordinator [INFO] Task [1/1] Finished refreshing

At 20:54 the last log for worker 6702 then suddenly at 23:01 we get the 
reassign tasks logics, seems nothing as exception happened. From the upper 
worker-6703 log, the worker 6703 finishes the last refresh of
Kafka spout 23:00:33 then seems no more active trace for the 6703 port, nimbus 
begin to reassign tasks on 6702:

2014-10-19 20:54:21 b.s.m.TransportFactory [INFO] Storm peer transport 
plugin:backtype.storm.messaging.netty.Context2014-10-19 23:01:41 
o.a.z.ZooKeeper [INFO] Client environment:zookeeper.version=3.4.5-1392090, 
built on 09/30/2012 17:52 GMT2014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] Client 
environment:host.name=serverName2014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] 
Client environment:java.version=1.7.0_512014-10-19 23:01:41 o.a.z.ZooKeeper 
[INFO] Client environment:java.vendor=Oracle Corporation2014-10-19 23:01:41 
o.a.z.ZooKeeper [INFO] Client 
environment:java.home=/opt/gpf/realtime/jdk/1.7.0_51l64/jre2014-10-19 23:01:41 
o.a.z.ZooKeeper [INFO] Client 
environment:java.class.path=/opt/gpf/realtime/storm/0.9.2/lib/logback-core-1.0.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/minlog-1.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/logback-classic-1.0.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/slf4j-api-1.6.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/jetty-6.1.26.jar:/opt/gpf/realtime/storm/0.9.2/lib/log4j-over-slf4j-1.6.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/clojure-1.5.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-exec-1.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/asm-4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-io-2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-lang-2.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-codec-1.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/netty-3.2.2.Final.jar:/opt/gpf/realtime/storm/0.9.2/lib/json-simple-1.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/core.incubator-0.1.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/hiccup-0.3.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/httpcore-4.3.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/jgrapht-core-0.9.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/reflectasm-1.07-shaded.jar:/opt/gpf/realtime/storm/0.9.2/lib/guava-13.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/servlet-api-2.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/clj-time-0.4.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/storm-core-0.9.2-incubating.jar:/opt/gpf/realtime/storm/0.9.2/lib/jline-2.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/disruptor-2.10.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/objenesis-1.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/kryo-2.21.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.logging-0.2.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/compojure-1.1.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-jetty-adapter-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/curator-framework-2.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/clj-stacktrace-0.2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-servlet-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/httpclient-4.3.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-devel-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/snakeyaml-1.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/carbonite-1.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/math.numeric-tower-0.0.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/netty-3.6.3.Final.jar:/opt/gpf/realtime/storm/0.9.2/lib/zookeeper-3.4.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-core-1.1.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.macro-0.1.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/clout-1.0.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.cli-0.2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-logging-1.1.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/chill-java-0.3.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-fileupload-1.2.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/curator-client-2.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/jetty-util-6.1.26.jar:/opt/gpf/realtime/storm/0.9.2/lib/joda-time-2.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/servlet-api-2.5-20081211.jar:/opt/gpf/realtime/storm/0.9.2/conf:/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/stormjar.jar2014-10-19
 23:01:41 o.a.z.ZooKeeper [INFO] Client 
environment:java.library.path=/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/resources/Linux-amd64:/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/resources:/usr/local/lib:/opt/local/lib:/usr/lib2014-10-19
 232014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] Client 
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 
GMT2014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] Client 
environment:host.name=serverName2014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] 
Client environment:java.version=1.7.0_512014-10-19 23:01:41 o.a.z.ZooKeeper 
[INFO] Client environment:java.vendor=Oracle Corporation2014-10-19 23:01:41 
o.a.z.ZooKeeper [INFO] Client 
environment:java.home=/opt/gpf/realtime/jdk/1.7.0_51l64/jre2014-10-19 23:01:41 
o.a.z.ZooKeeper [INFO] Client 
environment:java.class.path=/opt/gpf/realtime/storm/0.9.2/lib/logback-core-1.0.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/minlog-1.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/logback-classic-1.0.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/slf4j-api-1.6.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/jetty-6.1.26.jar:/opt/gpf/realtime/storm/0.9.2/lib/log4j-over-slf4j-1.6.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/clojure-1.5.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-exec-1.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/asm-4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-io-2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-lang-2.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-codec-1.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/netty-3.2.2.Final.jar:/opt/gpf/realtime/storm/0.9.2/lib/json-simple-1.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/core.incubator-0.1.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/hiccup-0.3.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/httpcore-4.3.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/jgrapht-core-0.9.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/reflectasm-1.07-shaded.jar:/opt/gpf/realtime/storm/0.9.2/lib/guava-13.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/servlet-api-2.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/clj-time-0.4.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/storm-core-0.9.2-incubating.jar:/opt/gpf/realtime/storm/0.9.2/lib/jline-2.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/disruptor-2.10.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/objenesis-1.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/kryo-2.21.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.logging-0.2.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/compojure-1.1.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-jetty-adapter-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/curator-framework-2.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/clj-stacktrace-0.2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-servlet-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/httpclient-4.3.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-devel-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/snakeyaml-1.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/carbonite-1.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/math.numeric-tower-0.0.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/netty-3.6.3.Final.jar:/opt/gpf/realtime/storm/0.9.2/lib/zookeeper-3.4.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-core-1.1.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.macro-0.1.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/clout-1.0.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.cli-0.2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-logging-1.1.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/chill-java-0.3.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-fileupload-1.2.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/curator-client-2.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/jetty-util-6.1.26.jar:/opt/gpf/realtime/storm/0.9.2/lib/joda-time-2.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/servlet-api-2.5-20081211.jar:/opt/gpf/realtime/storm/0.9.2/conf:/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/stormjar.jar2014-10-19
 23:01:41 o.a.z.ZooKeeper [INFO] Client 
environment:java.library.path=/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/resources/Linux-amd64:/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/resources:/usr/local/lib:/opt/local/lib:/usr/lib2014-10-19
 23:01:41 o.a.z.ZooKeeper [INFO] Client 
environment:java.io.tmpdir=/tmp2014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] 
Client environment:java.compiler=<NA>2014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] 
Client environment:os.name=Linux2014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] 
Client environment:os.arch=amd642014-10-19 23:01:41 o.a.z.ZooKeeper [INFO] 
Client environment:os.version=2.6.18-348.25.1.el52014-10-19 23:01:41 
o.a.z.ZooKeeper [INFO] Client environment:user.name=gpfadm2014-10-19 23:01:41 
o.a.z.ZooKeeper [INFO] Client environment:user.home=/home/gpfadm2014-10-19 
23:01:41 o.a.z.ZooKeeper [INFO] Client 
environment:user.dir=/opt/gpf/realtime/bin2014-10-19 23:01:41 
o.a.z.s.ZooKeeperServer [INFO] Server 
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 
GMT2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:host.name=serverName2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer 
[INFO] Server environment:java.version=1.7.0_512014-10-19 23:01:41 
o.a.z.s.ZooKeeperServer [INFO] Server environment:java.vendor=Oracle 
Corporation2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:java.home=/opt/gpf/realtime/jdk/1.7.0_51l64/jre2014-10-19 23:01:41 
o.a.z.s.ZooKeeperServer [INFO] Server 
environment:java.class.path=/opt/gpf/realtime/storm/0.9.2/lib/logback-core-1.0.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/minlog-1.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/logback-classic-1.0.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/slf4j-api-1.6.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/jetty-6.1.26.jar:/opt/gpf/realtime/storm/0.9.2/lib/log4j-over-slf4j-1.6.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/clojure-1.5.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-exec-1.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/asm-4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-io-2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-lang-2.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-codec-1.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/netty-3.2.2.Final.jar:/opt/gpf/realtime/storm/0.9.2/lib/json-simple-1.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/core.incubator-0.1.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/hiccup-0.3.6.jar:/opt/gpf/realtime/storm/0.9.2/lib/httpcore-4.3.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/jgrapht-core-0.9.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/reflectasm-1.07-shaded.jar:/opt/gpf/realtime/storm/0.9.2/lib/guava-13.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/servlet-api-2.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/clj-time-0.4.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/storm-core-0.9.2-incubating.jar:/opt/gpf/realtime/storm/0.9.2/lib/jline-2.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/disruptor-2.10.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/objenesis-1.2.jar:/opt/gpf/realtime/storm/0.9.2/lib/kryo-2.21.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.logging-0.2.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/compojure-1.1.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-jetty-adapter-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/curator-framework-2.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/clj-stacktrace-0.2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-servlet-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/httpclient-4.3.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-devel-0.3.11.jar:/opt/gpf/realtime/storm/0.9.2/lib/snakeyaml-1.11.jar:/opt/gpf
/realtime/storm/0.9.2/lib/carbonite-1.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/math.numeric-tower-0.0.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/netty-3.6.3.Final.jar:/opt/gpf/realtime/storm/0.9.2/lib/zookeeper-3.4.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/ring-core-1.1.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.macro-0.1.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/clout-1.0.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/tools.cli-0.2.4.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-logging-1.1.3.jar:/opt/gpf/realtime/storm/0.9.2/lib/chill-java-0.3.5.jar:/opt/gpf/realtime/storm/0.9.2/lib/commons-fileupload-1.2.1.jar:/opt/gpf/realtime/storm/0.9.2/lib/curator-client-2.4.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/jetty-util-6.1.26.jar:/opt/gpf/realtime/storm/0.9.2/lib/joda-time-2.0.jar:/opt/gpf/realtime/storm/0.9.2/lib/servlet-api-2.5-20081211.jar:/opt/gpf/realtime/storm/0.9.2/conf:/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/stormjar.jar
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:java.library.path=/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/resources/Linux-amd64:/data/PBData/foundation/csl_realtime/data/storm/0.9.2/supervisor/stormdist/csl-realtime-tp-ap-equity-topology-1-1413267046/resources:/usr/local/lib:/opt/local/lib:/usr/lib
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:java.io.tmpdir=/tmp
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:java.compiler=<NA>
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:os.name=Linux
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:os.arch=amd64
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:os.version=2.6.18-348.25.1.el5
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:user.name=gpfadm
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:user.home=/home/gpfadm
2014-10-19 23:01:41 o.a.z.s.ZooKeeperServer [INFO] Server 
environment:user.dir=/opt/gpf/realtime/bin
2014-10-19 23:01:42 b.s.d.worker [INFO] Launching worker for 
csl-realtime-tp-ap-equity-topology-1-1413267046 on 
e18633fb-6613-4afc-bea1-9941084508c5:6702 with id 
00b7ea91-5e19-452e-8b9e-ce035be31867 and conf {"dev.zookeeper.path" 
"/tmp/dev-storm-zookeeper", "topology.tick.tuple.freq.secs" nil, 
"topology.builtin.metrics.bucket.size.secs" 60, 
"topology.fall.back.on.java.serialization" true, 
"topology.max.error.report.per.interval" 5, "zmq.linger.millis" 5000, 
"topology.skip.missing.kryo.registrations" false, 
"storm.messaging.netty.client_worker_threads" 1, "ui.childopts" "-Xmx768m", 
"storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true, 
"topology.trident.batch.emit.interval.millis" 500, 
"storm.messaging.netty.flush.check.interval.ms" 10, "nimbus.monitor.freq.secs" 
10, "logviewer.childopts" "-Xmx128m", "java.library.path" 
"/usr/local/lib:/opt/local/lib:/usr/lib", "topology.executor.send.buffer.size" 
1024, "storm.local.dir" 
"/data/PBData/foundation/csl_realtime/data/storm/0.9.2", 
"storm.messaging.netty.buffer_size" 5242880, 
"supervisor.worker.start.timeout.secs" 120, "topology.enable.message.timeouts" 
true, "nimbus.cleanup.inbox.freq.secs" 600, "nimbus.inbox.jar.expiration.secs" 
3600, "drpc.worker.threads" 64, "topology.worker.shared.thread.pool.size" 4, 
"nimbus.host" "serverName.nam.nsroot.net", "storm.messaging.netty.min_wait_ms" 
100, "storm.zookeeper.port" 7181, "transactional.zookeeper.port" nil, 
"topology.executor.receive.buffer.size" 1024, "transactional.zookeeper.servers" 
nil, "storm.zookeeper.root" "/storm", 
"storm.zookeeper.retry.intervalceiling.millis" 30000, "supervisor.enable" true, 
"storm.messaging.netty.server_worker_threads" 1, "storm.zookeeper.servers" 
["serverName.nam.nsroot.net"], "transactional.zookeeper.root" "/transactional", 
"topology.acker.executors" nil, "topology.transfer.buffer.size" 1024, 
"topology.worker.childopts" nil, "drpc.queue.size" 128, "worker.childopts" 
"-Xmx2048m -Djava.net.preferIPv4Stack=true -verbose:gc -XX:+PrintGCTimeStamps 
-XX:+PrintGCDetails -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.local.only=false 
-Dcom.sun.management.jmxremote.port=1%ID%", 
"supervisor.heartbeat.frequency.secs" 5, 
"topology.error.throttle.interval.secs" 10, "zmq.hwm" 0, "drpc.port" 3772, 
"supervisor.monitor.frequency.secs" 3, "drpc.childopts" "-Xmx768m", 
"topology.receiver.buffer.size" 8, "task.heartbeat.frequency.secs" 3, 
"topology.tasks" nil, "storm.messaging.netty.max_retries" 30, 
"topology.spout.wait.strategy" "backtype.storm.spout.SleepSpoutWaitStrategy", 
"nimbus.thrift.max_buffer_size" 1048576, "topology.max.spout.pending" nil, 
"storm.zookeeper.retry.interval" 1000, 
"topology.sleep.spout.wait.strategy.time.ms" 1, "nimbus.topology.validator" 
"backtype.storm.nimbus.DefaultTopologyValidator", "supervisor.slots.ports" 
[6700 6701 6702 6703], "topology.debug" false, "nimbus.task.launch.secs" 120, 
"nimbus.supervisor.timeout.secs" 60, "topology.message.timeout.secs" 30, 
"task.refresh.poll.secs" 10, "topology.workers" 1, "supervisor.childopts" 
"-Djava.net.preferIPv4Stack=true -verbose:gc -XX:+PrintGCTimeStamps 
-XX:+PrintGCDetails -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.local.only=false 
-Dcom.sun.management.jmxremote.port=29999", "nimbus.thrift.port" 6627, 
"topology.stats.sample.rate" 0.05, "worker.heartbeat.frequency.secs" 1, 
"topology.tuple.serializer" 
"backtype.storm.serialization.types.ListDelegateSerializer", 
"topology.disruptor.wait.strategy" "com.lmax.disruptor.BlockingWaitStrategy", 
"topology.multilang.serializer" "backtype.storm.multilang.JsonSerializer", 
"nimbus.task.timeout.secs" 30, "storm.zookeeper.connection.timeout" 15000, 
"topology.kryo.factory" "backtype.storm.serialization.DefaultKryoFactory", 
"drpc.invocations.port" 3773, "logviewer.port" 8000, "zmq.threads" 1, 
"storm.zookeeper.retry.times" 5, "topology.worker.receiver.thread.count" 1, 
"storm.thrift.transport" "backtype.storm.security.auth.SimpleTransportPlugin", 
"topology.state.synchronization.timeout.secs" 60, 
"supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs" 600, 
"storm.messaging.transport" "backtype.storm.messaging.netty.Context", 
"logviewer.appender.name" "A1", "storm.messaging.netty.max_wait_ms" 1000, 
"drpc.request.timeout.secs" 600, "storm.local.mode.zmq" false, "ui.port" 18087, 
"nimbus.childopts" "-Xmx2048m -Djava.net.preferIPv4Stack=true -verbose:gc 
-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.local.only=false 
-Dcom.sun.management.jmxremote.port=19999", "storm.cluster.mode" "distributed", 
"topology.max.task.parallelism" nil, 
"storm.messaging.netty.transfer.batch.size" 262144}
2014-10-19 23:01:42 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2014-10-19 23:01:42 o.a.z.ZooKeeper [INFO] Initiating client connection, 
connectString=serverName.nam.nsroot.net:7181 sessionTimeout=20000 
watcher=org.apache.curator.ConnectionState@79acfa95<mailto:watcher=org.apache.curator.ConnectionState@79acfa95>
2014-10-19 23:01:42 o.a.z.ClientCnxn [INFO] Opening socket connection to server 
serverName/ip:7181. Will not attempt to authenticate using SASL (unknown error)
2014-10-19 23:01:42 o.a.z.ClientCnxn [INFO] Socket connection established to 
serverName/ip:7181, initiating session
2014-10-19 23:01:42 o.a.z.ClientCnxn [INFO] Session establishment complete on 
server serverName/ip:7181, sessionid = 0x1492848710d0021, negotiated timeout = 
20000
2014-10-19 23:01:42 o.a.c.f.s.ConnectionStateManager [INFO] State change: 
CONNECTED
2014-10-19 23:01:42 o.a.c.f.s.ConnectionStateManager [WARN] There are no 
ConnectionStateListeners registered.
2014-10-19 23:01:42 b.s.zookeeper [INFO] Zookeeper state update: :connected:none
2014-10-19 23:01:43 o.a.z.ZooKeeper [INFO] Session: 0x1492848710d0021 closed
2014-10-19 23:01:43 o.a.z.ClientCnxn [INFO] EventThread shut down
2014-10-19 23:01:43 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2014-10-19 23:01:43 o.a.z.ZooKeeper [INFO] Initiating client connection, 
connectString=serverName.nam.nsroot.net:7181/storm sessionTimeout=20000 
watcher=org.apache.curator.ConnectionState@7e56f324<mailto:watcher=org.apache.curator.ConnectionState@7e56f324>
2014-10-19 23:01:43 o.a.z.ClientCnxn [INFO] Opening socket connection to server 
serverName/ip:7181. Will not attempt to authenticate using SASL (unknown error)
2014-10-19 23:01:43 o.a.z.ClientCnxn [INFO] Socket connection established to 
serverName/ip:7181, initiating session
2014-10-19 23:01:43 o.a.z.ClientCnxn [INFO] Session establishment complete on 
server serverName/ip:7181, sessionid = 0x1492848710d0022, negotiated timeout = 
20000
2014-10-19 23:01:43 o.a.c.f.s.ConnectionStateManager [INFO] State change: 
CONNECTED
2014-10-19 23:01:43 o.a.c.f.s.ConnectionStateManager [WARN] There are no 
ConnectionStateListeners registered.
2014-10-19 23:01:43 b.s.m.TransportFactory [INFO] Storm peer transport 
plugin:backtype.storm.messaging.netty.Context
2014-10-19 23:01:44 b.s.d.executor [INFO] Loading executor __acker:[2 2]
... prepare log omitted here
2014-10-19 23:01:44 b.s.d.executor [INFO] Finished loading executor 
kafkaSpout:[92 92]
2014-10-19 23:01:44 b.s.d.executor [INFO] Opening spout kafkaSpout:(92)
2014-10-19 23:01:44 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2014-10-19 23:01:44 o.a.z.ZooKeeper [INFO] Initiating client connection, 
connectString=serverName.nam.nsroot.net:7181, sessionTimeout=20000 
watcher=org.apache.curator.ConnectionState@229f29d<mailto:watcher=org.apache.curator.ConnectionState@229f29d>
2014-10-19 23:01:44 b.s.d.executor [INFO] Loading executor splitBolt:[93 93]
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Opening socket connection to server 
serverName/ip:7181. Will not attempt to authenticate using SASL (unknown error)
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Socket connection established to 
serverName/ip:7181, initiating session
2014-10-19 23:01:44 b.s.d.executor [INFO] Loaded executor tasks splitBolt:[93 
93]
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Session establishment complete on 
server serverName/ip:7181, sessionid = 0x1492848710d0023, negotiated timeout = 
20000
2014-10-19 23:01:44 o.a.c.f.s.ConnectionStateManager [INFO] State change: 
CONNECTED
2014-10-19 23:01:44 o.a.c.f.s.ConnectionStateManager [WARN] There are no 
ConnectionStateListeners registered.
2014-10-19 23:01:44 b.s.d.executor [INFO] Finished loading executor 
splitBolt:[93 93]
2014-10-19 23:01:44 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2014-10-19 23:01:44 o.a.z.ZooKeeper [INFO] Initiating client connection, 
connectString=serverName.nam.nsroot.net:2181 sessionTimeout=20000 
watcher=org.apache.curator.ConnectionState@613d77af<mailto:watcher=org.apache.curator.ConnectionState@613d77af>
2014-10-19 23:01:44 b.s.d.executor [INFO] Preparing bolt splitBolt:(93)
2014-10-19 23:01:44 b.s.d.executor [INFO] Prepared bolt splitBolt:(93)
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Opening socket connection to server 
serverName/ip:2181. Will not attempt to authenticate using SASL (unknown error)
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Socket connection established to 
serverName/ip:2181, initiating session
2014-10-19 23:01:44 b.s.d.executor [INFO] Loading executor __system:[-1 -1]
2014-10-19 23:01:44 o.a.z.ClientCnxnSocket [WARN] Connected to an old server; 
r-o mode will be unavailable
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Session establishment complete on 
server serverName/ip:2181, sessionid = 0x14928475c440010, negotiated timeout = 
20000
2014-10-19 23:01:44 o.a.c.f.s.ConnectionStateManager [INFO] State change: 
CONNECTED
2014-10-19 23:01:44 o.a.c.f.s.ConnectionStateManager [WARN] There are no 
ConnectionStateListeners registered.
2014-10-19 23:01:44 b.s.d.executor [INFO] Loaded executor tasks __system:[-1 -1]
2014-10-19 23:01:44 b.s.d.executor [INFO] Finished loading executor 
__system:[-1 -1]
2014-10-19 23:01:44 b.s.d.executor [INFO] Preparing bolt __system:(-1)
2014-10-19 23:01:44 s.k.DynamicBrokersReader [INFO] Read partition info from 
zookeeper: GlobalPartitionInformation{partitionMap={0=serverName:9092}}
2014-10-19 23:01:44 b.s.d.executor [INFO] Prepared bolt __system:(-1)
2014-10-19 23:01:44 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2014-10-19 23:01:44 o.a.z.ZooKeeper [INFO] Initiating client connection, 
connectString=serverName.nam.nsroot.net:2181 sessionTimeout=20000 
watcher=org.apache.curator.ConnectionState@2a3f0cf8<mailto:watcher=org.apache.curator.ConnectionState@2a3f0cf8>
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Opening socket connection to server 
serverName/ip:2181. Will not attempt to authenticate using SASL (unknown error)
2014-10-19 23:01:44 b.s.d.executor [INFO] Loading executor __acker:[1 1]
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Socket connection established to 
serverName/ip:2181, initiating session
2014-10-19 23:01:44 b.s.d.executor [INFO] Loaded executor tasks __acker:[1 1]
2014-10-19 23:01:44 o.a.z.ClientCnxnSocket [WARN] Connected to an old server; 
r-o mode will be unavailable
2014-10-19 23:01:44 o.a.z.ClientCnxn [INFO] Session establishment complete on 
server serverName/ip:2181, sessionid = 0x14928475c440011, negotiated timeout = 
20000
2014-10-19 23:01:44 b.s.d.executor [INFO] Preparing bolt __acker:(1)
2014-10-19 23:01:44 b.s.d.executor [INFO] Timeouts disabled for executor 
__acker:[1 1]
2014-10-19 23:01:44 o.a.c.f.s.ConnectionStateManager [INFO] State change: 
CONNECTED
2014-10-19 23:01:44 b.s.d.executor [INFO] Opened spout kafkaSpout:(92)
2014-10-19 23:01:44 o.a.c.f.s.ConnectionStateManager [WARN] There are no 
ConnectionStateListeners registered.
2014-10-19 23:01:44 b.s.d.executor [INFO] Finished loading executor __acker:[1 
1]
2014-10-19 23:01:44 b.s.d.executor [INFO] Prepared bolt __acker:(1)
2014-10-19 23:01:44 b.s.d.executor [INFO] Activating spout kafkaSpout:(92)
2014-10-19 23:01:44 s.k.ZkCoordinator [INFO] Task [1/1] Refreshing partition 
manager connections
2014-10-19 23:01:44 s.k.DynamicBrokersReader [INFO] Read partition info from 
zookeeper: GlobalPartitionInformation{partitionMap={0=serverName:9092}}
2014-10-19 23:01:44 s.k.KafkaUtils [INFO] Task [1/1] assigned 
[Partition{host=serverName:9092, partition=0}]
2014-10-19 23:01:44 s.k.ZkCoordinator [INFO] Task [1/1] Deleted partition 
managers: []
2014-10-19 23:01:44 s.k.ZkCoordinator [INFO] Task [1/1] New partition managers: 
[Partition{host=serverName:9092, partition=0}]
2014-10-19 23:01:44 b.s.d.executor [INFO] Loading executor eqpbOlapBolt:[36 39]
2014-10-19 23:01:44 b.s.d.executor [INFO] Loaded executor tasks 
eqpbOlapBolt:[36 39]
... Omit prepare bolt log here
2014-10-19 23:01:44 b.s.d.executor [INFO] Finished loading executor 
eqpbOlapBolt:[64 67]
2014-10-19 23:01:44 b.s.d.worker [INFO] Launching receive-thread for 
e18633fb-6613-4afc-bea1-9941084508c5:6702
2014-10-19 23:01:44 b.s.d.executor [INFO] Preparing bolt eqpbOlapBolt:(64 65 66 
67)
2014-10-19 23:01:44 b.s.m.n.Server [INFO] Create Netty Server 
Netty-server-localhost-6702, buffer_size: 5242880, maxWorkers: 1
2014-10-19 23:01:44 b.s.m.loader [INFO] Starting receive-thread: [stormId: 
csl-realtime-tp-ap-equity-topology-1-1413267046, port: 6702, thread-id: 0 ]
2014-10-19 23:01:44 b.s.d.worker [INFO] Worker has topology config {"storm.id" 
"csl-realtime-tp-ap-equity-topology-1-1413267046", "dev.zookeeper.path" 
"/tmp/dev-storm-zookeeper", "topology.tick.tuple.freq.secs" nil, 
"topology.builtin.metrics.bucket.size.secs" 60, 
"topology.fall.back.on.java.serialization" true, 
"topology.max.error.report.per.interval" 5, "zmq.linger.millis" 5000, 
"topology.skip.missing.kryo.registrations" false, 
"storm.messaging.netty.client_worker_threads" 1, "ui.childopts" "-Xmx768m", 
"storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true, 
"topology.trident.batch.emit.interval.millis" 500, 
"storm.messaging.netty.flush.check.interval.ms" 10, "nimbus.monitor.freq.secs" 
10, "logviewer.childopts" "-Xmx128m", "java.library.path" 
"/usr/local/lib:/opt/local/lib:/usr/lib", "topology.executor.send.buffer.size" 
1024, "storm.local.dir" 
"/data/PBData/foundation/csl_realtime/data/storm/0.9.2", 
"storm.messaging.netty.buffer_size" 5242880, 
"supervisor.worker.start.timeout.secs" 120, "topology.enable.message.timeouts" 
true, "nimbus.cleanup.inbox.freq.secs" 600, "nimbus.inbox.jar.expiration.secs" 
3600, "drpc.worker.threads" 64, "topology.worker.shared.thread.pool.size" 4, 
"nimbus.host" "serverName.nam.nsroot.net", "storm.messaging.netty.min_wait_ms" 
100, "storm.zookeeper.port" 7181, "transactional.zookeeper.port" nil, 
"topology.executor.receive.buffer.size" 1024, "transactional.zookeeper.servers" 
nil, "storm.zookeeper.root" "/storm", 
"storm.zookeeper.retry.intervalceiling.millis" 30000, "supervisor.enable" true, 
"storm.messaging.netty.server_worker_threads" 1, "storm.zookeeper.servers" 
["serverName.nam.nsroot.net"], "transactional.zookeeper.root" "/transactional", 
"topology.acker.executors" 10, "topology.kryo.decorators" (), "topology.name" 
"csl-realtime-tp-ap-equity-topology", "topology.transfer.buffer.size" 1024, 
"topology.worker.childopts" nil, "drpc.queue.size" 128, "worker.childopts" 
"-Xmx2048m -Djava.net.preferIPv4Stack=true -verbose:gc -XX:+PrintGCTimeStamps 
-XX:+PrintGCDetails -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.local.only=false 
-Dcom.sun.management.jmxremote.port=1%ID%", 
"supervisor.heartbeat.frequency.secs" 5, 
"topology.error.throttle.interval.secs" 10, "zmq.hwm" 0, "drpc.port" 3772, 
"supervisor.monitor.frequency.secs" 3, "drpc.childopts" "-Xmx768m", 
"topology.receiver.buffer.size" 8, "task.heartbeat.frequency.secs" 3, 
"topology.tasks" nil, "storm.messaging.netty.max_retries" 30, 
"topology.spout.wait.strategy" "backtype.storm.spout.SleepSpoutWaitStrategy", 
"nimbus.thrift.max_buffer_size" 1048576, "topology.max.spout.pending" 5000, 
"storm.zookeeper.retry.interval" 1000, 
"topology.sleep.spout.wait.strategy.time.ms" 1, "nimbus.topology.validator" 
"backtype.storm.nimbus.DefaultTopologyValidator", "supervisor.slots.ports" 
[6700 6701 6702 6703], "topology.debug" false, "nimbus.task.launch.secs" 120, 
"nimbus.supervisor.timeout.secs" 60, "topology.kryo.register" nil, 
"topology.message.timeout.secs" 3600, "task.refresh.poll.secs" 10, 
"topology.workers" 1, "supervisor.childopts" "-Djava.net.preferIPv4Stack=true 
-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails 
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.local.only=false 
-Dcom.sun.management.jmxremote.port=29999", "nimbus.thrift.port" 6627, 
"topology.stats.sample.rate" 0.05, "worker.heartbeat.frequency.secs" 1, 
"topology.tuple.serializer" 
"backtype.storm.serialization.types.ListDelegateSerializer", 
"topology.disruptor.wait.strategy" "com.lmax.disruptor.BlockingWaitStrategy", 
"topology.multilang.serializer" "backtype.storm.multilang.JsonSerializer", 
"nimbus.task.timeout.secs" 30, "storm.zookeeper.connection.timeout" 15000, 
"topology.kryo.factory" "backtype.storm.serialization.DefaultKryoFactory", 
"drpc.invocations.port" 3773, "logviewer.port" 8000, "zmq.threads" 1, 
"storm.zookeeper.retry.times" 5, "topology.worker.receiver.thread.count" 1, 
"storm.thrift.transport" "backtype.storm.security.auth.SimpleTransportPlugin", 
"topology.state.synchronization.timeout.secs" 60, 
"supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs" 600, 
"storm.messaging.transport" "backtype.storm.messaging.netty.Context", 
"logviewer.appender.name" "A1", "storm.messaging.netty.max_wait_ms" 1000, 
"drpc.request.timeout.secs" 600, "storm.local.mode.zmq" false, "ui.port" 18087, 
"nimbus.childopts" "-Xmx2048m -Djava.net.preferIPv4Stack=true -verbose:gc 
-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Dcom.sun.management.jmxremote 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false 
-Dcom.sun.management.jmxremote.local.only=false 
-Dcom.sun.management.jmxremote.port=19999", "storm.cluster.mode" "distributed", 
"topology.max.task.parallelism" nil, 
"storm.messaging.netty.transfer.batch.size" 262144}
2014-10-19 23:01:44 b.s.d.worker [INFO] Worker 
00b7ea91-5e19-452e-8b9e-ce035be31867 for storm 
csl-realtime-tp-ap-equity-topology-1-1413267046 on 
e18633fb-6613-4afc-bea1-9941084508c5:6702 has finished loading
2014-10-19 23:01:44 o.s.b.f.x.XmlBeanDefinitionReader [INFO] Loading XML bean 
definitions from class path resource [csl_saving_service_context.xml]
2014-10-19 23:01:44 o.s.b.f.x.XmlBeanDefinitionReader [INFO] Loading XML bean 
definitions from class path resource [spring_mail_context.xml]
2014-10-19 23:01:44 s.k.PartitionManager [INFO] Read partition information 
from: /storm/partition_0  --> 
{"topic":"citi.gpf_162842.pfdwh.RT.kafka.topic.equity","partition":0,"topology":{"id
":"9b24e9c0-b9cb-4d7c-8659-2b3ff9ca4a43","name":"csl-realtime-tp-ap-equity-topology"},"broker":{"port":9092,"host":"serverName"},"offset":588012}
2014-10-19 23:01:45 s.k.PartitionManager [INFO] Read last commit offset from 
zookeeper: 588012; old topology_id: 9b24e9c0-b9cb-4d7c-8659-2b3ff9ca4a43 - new 
topology_id: 9b24e9c0-b9cb-4d7c-8659-2b3ff9ca4a43
2014-10-19 23:01:45 s.k.PartitionManager [INFO] Starting Kafka serverName:0 
from offset 588012
2014-10-19 23:01:45 s.k.ZkCoordinator [INFO] Task [1/1] Finished refreshing
2014-10-19 23:01:45 o.s.b.f.c.PropertyPlaceholderConfigurer [INFO] Loading 
properties file from class path resource 
[uat/applicationResourcesStorm.properties]
2014-10-19 23:01:45 o.s.b.f.s.DefaultListableBeanFactory [INFO] 
Pre-instantiating singletons in 
org.springframework.beans.factory.support.DefaultListableBeanFactory@36cf2a48<mailto:org.springframework.beans.factory.support.DefaultListableBeanFactory@36cf2a48>:
 defining beans 
[org.springframework.beans.factory.config.PropertyPlaceholderConfigurer#0,txManager,org.springframework.aop.config.internalAutoProxyCreator,org.springframework.transaction.annotation.AnnotationTransactionAttributeSource#0,org.springframework.transaction.interceptor.TransactionInterceptor#0,org.springframework.transaction.config.internalTransactionAdvisor,olapDataSource,olapDAO,fetPayloadDAO,fbtPayloadDAO,frtPayloadDAO,insFactPositionSP,insFactEquityPositionSP,insFactFIPositionWithDetailSP,insFactBalanceSP,olapPersitenceService,insFactCommonTransctionProc,insFactEquityProc,insFactClientOrigDetailProc,GPBalanceUtil,EQPBBalanceUtil,EQPBPositionUtil,FIPBBalanceUtil,FIPBPositionUtil,utilManager,mailSender,emailHelper,mailSenderHelper,mailSenderUtil,duplikeyIssueMailList];
 root of factory hierarchy
2014-10-19 23:01:45 c.c.g.c.u.e.MailSenderHelperImpl [INFO] mail service is 
startted
2014-10-19 23:01:45 c.c.g.c.u.e.MailSenderHelperImpl [INFO] 
============================================================
2014-10-19 23:01:45 b.s.d.executor [INFO] Prepared bolt eqpbEnrichBolt:(27)
2014-10-19 23:01:45 b.s.d.executor [INFO] Prepared bolt eqpbOlapBolt:(29 28 31 
30)
... Omit prepare bolt log here
2014-10-19 23:01:45 b.s.d.executor [INFO] Prepared bolt eqpbOlapBolt:(59 58 57 
56)
2014-10-19 23:02:45 s.k.ZkCoordinator [INFO] Task [1/1] Refreshing partition 
manager connections

Hope this is all needed,

Regards,
Yiming

-----Original Message-----
From: Harsha [mailto:[email protected]]
Sent: Friday, October 17, 2014 9:56 PM
To: [email protected]<mailto:[email protected]>
Subject: Re: nimbus detect executor not alive and reassign tasks issue

Do you see any errors in worker logs thats causing executor failures.
Are you ack the tuple once its successfully written to the database.

On Thu, Oct 16, 2014, at 11:15 PM, Fang, Yiming  wrote:
> Dear All,
>
> I have a nimbus detect executor not alive and reassign tasks issue:
>
> We are working on 0.9.2 version of storm + 3.4.6 version for zookeeper.
> And it's a single node with 1 instance of storm relies on 1 instance
> of zookeeper On the same machine.
>
> Accidently we get the following issue, about 1-2 hours, the nimbus.log
> tells that No executor is alive like the following:
>
> 2014-10-17 01:30:58 b.s.d.nimbus [INFO] Executor xxx :[2 2] not alive
> ... executor not alive code here ommitted
> 2014-10-17 01:30:58 b.s.d.nimbus [INFO] Executor xxx :[64 67] not
> alive
> 2014-10-17 01:30:58 b.s.s.EvenScheduler [INFO] Available slots:
> (["e18633fb-6613-4afc-bea1-9941084508c5" 6702]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6701]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6700])
> 2014-10-17 01:30:58 b.s.d.nimbus [INFO] Reassigning xxx to 1 slots
> 2014-10-17 01:30:58 b.s.d.nimbus [INFO] Reassign executors: [[2 2] [3
> 3]
> [4 4] [5 5] [6 6] [7 7] [8 8] [9 9] [10 10] [27 27] [92 92] [93 93] [1
> 1]
> [36 39] [68 71] [40 43] [72 75] [11 14] [44 47] [76 79] [15 18] [48
> 51]
> [80 83] [19 22] [52 55] [84 87] [23 26] [56 59] [88 91] [28 31] [60
> 63]
> [32 35] [64 67]]
> 2014-10-17 01:30:58 b.s.d.nimbus [INFO] Setting new assignment for
> topology id xxx:
> #backtype.storm.daemon.common.Assignment{:master-code-dir "yyy",
> :node->host {"e18633fb-6613-4afc-bea1-9941084508c5" "pfdwlnx1u"},
> :executor->node+port {[2 2] ["e18633fb-6613-4afc-bea1-9941084508c5"
> 6702], [3 3] ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [4 4]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [5 5]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [6 6]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [7 7]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [8 8]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [9 9]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [10 10]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [27 27]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [92 92]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [93 93]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [1 1]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [36 39]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [68 71]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [40 43]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [72 75]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [11 14]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [44 47]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [76 79]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [15 18]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [48 51]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [80 83]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [19 22]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [52 55]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [84 87]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [23 26]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [56 59]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [88 91]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [28 31]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [60 63]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [32 35]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702], [64 67]
> ["e18633fb-6613-4afc-bea1-9941084508c5" 6702]},
> :executor->start-time-secs {[2 2] 1413523858, [3 3] 1413523858, [4 4]
> 1413523858, [5 5] 1413523858, [6 6] 1413523858, [7 7] 1413523858, [8
> 8] 1413523858, [9 9] 1413523858, [10 10] 1413523858, [27 27]
> 1413523858, [92 92] 1413523858, [93 93] 1413523858, [1 1] 1413523858,
> [36 39] 1413523858,
> [68 71] 1413523858, [40 43] 1413523858, [72 75] 1413523858, [11 14]
> 1413523858, [44 47] 1413523858, [76 79] 1413523858, [15 18]
> 1413523858,
> [48 51] 1413523858, [80 83] 1413523858, [19 22] 1413523858, [52 55]
> 1413523858, [84 87] 1413523858, [23 26] 1413523858, [56 59]
> 1413523858,
> [88 91] 1413523858, [28 31] 1413523858, [60 63] 1413523858, [32 35]
> 1413523858, [64 67] 1413523858}}
>
> Then it reassign the tasks to supervisor. The problem is as we use
> Kafka Spout to get messages from Kafka server. We observed that
> Certain messages have already been processed before the reassign of
> tasks (persist into a Data Base) and the reassign of tasks make those
> messages been reflowed as I believe that the offset of those already
> been treated kafka messages not committed to the zookeeper server.
>
> So questions are:
> 1) I know it's one of the fault tolerant feature provided by storm and
> kafka, is this a normal case? Anything wrong with the not alive issue?
> 2) If there's potential issue with task not alive, how could I check
> in details why the task is not alive? As I believe that the nimbus,
> supervisor process never dies in our test case.
>
> Hope I make things clear if anything not clear let me know.
>
> Thanks and regards,
> Yiming
>

Reply via email to