Yes, GC might be the issue. I read about a similar case in the archives:
http://grokbase.com/t/gg/storm-user/133mdhagd0/worker-dying
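
The supervisor log quoted below already shows how stale the heartbeat was when the worker was marked :timed-out. A minimal sketch of that arithmetic follows. The two timestamps are copied from the quoted log line; the 30-second limit is an assumption on my part (I believe it is the default for supervisor.worker.timeout.secs in Storm 0.9.x, but please check your own storm.yaml and defaults.yaml). The variable names are only illustrative.

```python
# Timestamps copied from the quoted supervisor log line
# ("Current supervisor time" and the heartbeat's ":time-secs").
supervisor_time_secs = 1410261997
last_heartbeat_secs = 1410261964

# ASSUMPTION: 30 is believed to be the default value of
# supervisor.worker.timeout.secs; verify it against your cluster config.
assumed_worker_timeout_secs = 30

# Age of the last heartbeat at the moment the supervisor checked it.
gap_secs = supervisor_time_secs - last_heartbeat_secs
timed_out = gap_secs > assumed_worker_timeout_secs

print("heartbeat age: %d s, timed out: %s" % (gap_secs, timed_out))
```

If the gap is only slightly above the limit, as it is here, a long GC pause in the worker is a plausible cause, and the -verbose:gc output of the worker around that time should show whether one occurred.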


On Tue, Sep 9, 2014 at 5:09 PM, Palak Shah <spala...@gmail.com> wrote:

> I looked at the supervisor and nimbus logs and confirmed that the topology
> is being rebalanced. It could be because of a timeout. Could you help me
> figure out exactly why this timeout occurs and how I can fix it?
>
> I have enabled GC logging for my workers and supervisor. Could GC be the
> reason that the worker is not able to send a heartbeat? I tried increasing
> the heap size allotted to each worker by setting worker.childopts to
> "-Xmx768m", but I did not see any difference in the behaviour of my
> topology. How can I fix this issue?
>
> Here are my supervisor logs:
>
> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Shutting down and clearing
> state for id a8733f89-9f41-4624-9bce-d9f2d8c449ee. Current supervisor time:
> 1410261997. State: :timed-out, Heartbeat:
> #backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1410261964,
> :storm-id "popeyeTopology-2-1410260830", :executors #{[4 4] [7 7] [-1 -1]
> [1 1]}, :port 6700}
> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Shutting down
> 9f26d478-1963-425e-a0c2-139712f32b9e:a8733f89-9f41-4624-9bce-d9f2d8c449ee
> 2014-09-09 16:56:37 b.s.util [INFO] Error when trying to kill 28950.
> Process is probably already dead.
> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Shut down
> 9f26d478-1963-425e-a0c2-139712f32b9e:a8733f89-9f41-4624-9bce-d9f2d8c449ee
> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Launching worker with
> assignment #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id
> "popeyeTopology-2-1410260830", :executors ([7 7] [4 4] [1 1])} for this
> supervisor 9f26d478-1963-425e-a0c2-139712f32b9e on port 6700 with id
> 78233b86-b7f3-4590-9709-ba0d71130d7e
> 2014-09-09 16:56:37 b.s.d.supervisor [INFO] Launching worker with command:
> '/usr/lib/jvm/java-7-oracle/bin/java' '-server' '-Xmx768m' '-verbose:gc'
> '-XX:+PrintGCTimeStamps' '-XX:+PrintGCDetails'
> '-Dcom.sun.management.jmxremote' '-Dcom.sun.management.jmxremote.ssl=false'
> '-Dcom.sun.management.jmxremote.authenticate=false'
> '-Dcom.sun.management.jmxremote.port=16700'
> '-Djava.library.path=/tmp/stormtmp/supervisor/stormdist/popeyeTopology-2-1410260830/resources/Linux-amd64:/tmp/stormtmp/supervisor/stormdist/popeyeTopology-2-1410260830/resources:/usr/lib/jvm/java-7-oracle/lib'
> '-Dlogfile.name=worker-6700.log'
> '-Dstorm.home=/home/stormcluster/Storm/apache-storm-0.9.2-incubating'
> '-Dlogback.configurationFile=/home/stormcluster/Storm/apache-storm-0.9.2-incubating/logback/cluster.xml'
> '-Dstorm.id=popeyeTopology-2-1410260830'
> '-Dworker.id=78233b86-b7f3-4590-9709-ba0d71130d7e' '-Dworker.port=6700'
> '-cp'
> '/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/chill-java-0.3.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-exec-1.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-io-2.4.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/joda-time-2.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/ring-servlet-0.3.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/scala-library-2.9.2.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/clout-1.0.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/logback-core-1.0.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/json-simple-1.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/curator-client-2.4.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/carbonite-1.4.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/asm-4.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/httpcore-4.3.2.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/jetty-util-6.1.26.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/clj-stacktrace-0.2.4.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/log4j-over-slf4j-1.6.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/netty-3.2.2.Final.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/compojure-1.1.3.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/zookeeper-3.4.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/ring-devel-0.3.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/jgrapht-core-0.9.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/guava-13.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/objenesis-1.2.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/minlog-1.2.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/core.incubator-0.1.0.jar:/home/
stormcluster/Storm/apache-storm-0.9.2-incubating/lib/clj-time-0.4.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/servlet-api-2.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/storm-core-0.9.2-incubating.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-logging-1.1.3.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/logback-classic-1.0.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/clojure-1.5.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/ring-jetty-adapter-0.3.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/tools.macro-0.1.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/reflectasm-1.07-shaded.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/math.numeric-tower-0.0.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-fileupload-1.2.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/servlet-api-2.5-20081211.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/jetty-6.1.26.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/jline-2.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/tools.cli-0.2.4.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/slf4j-api-1.6.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/hiccup-0.3.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-codec-1.6.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/snakeyaml-1.11.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/zmq.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/netty-3.6.3.Final.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/storm-kafka-0.9.2-incubating.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/ring-core-1.1.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/kafka_2.9.2-0.8.1.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-in
cubating/lib/tools.logging-0.2.3.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/kryo-2.21.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/disruptor-2.10.1.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/httpclient-4.3.3.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/commons-lang-2.5.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/lib/curator-framework-2.4.0.jar:/home/stormcluster/Storm/apache-storm-0.9.2-incubating/conf:/tmp/stormtmp/supervisor/stormdist/popeyeTopology-2-1410260830/stormjar.jar'
> 'backtype.storm.daemon.worker' 'popeyeTopology-2-1410260830'
> '9f26d478-1963-425e-a0c2-139712f32b9e' '6700'
> '78233b86-b7f3-4590-9709-ba0d71130d7e'
> 2014-09-09 16:56:37 b.s.d.supervisor [INFO]
> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
> 2014-09-09 16:56:37 b.s.d.supervisor [INFO]
> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
> 2014-09-09 16:56:38 b.s.d.supervisor [INFO]
> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
> 2014-09-09 16:56:38 b.s.d.supervisor [INFO]
> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
> 2014-09-09 16:56:39 b.s.d.supervisor [INFO]
> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
> 2014-09-09 16:56:39 b.s.d.supervisor [INFO]
> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
> 2014-09-09 16:56:40 b.s.d.supervisor [INFO]
> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
> 2014-09-09 16:56:40 b.s.d.supervisor [INFO]
> 78233b86-b7f3-4590-9709-ba0d71130d7e still hasn't started
>
>
> Thanks,
> Palak
>
> On Mon, Sep 8, 2014 at 6:46 PM, Cyrille Karmann <cyrillekarm...@gmail.com>
> wrote:
>
>> Look at the supervisor logs. There could be a timeout that makes a
>> supervisor force a worker to shut down, triggering a rebalance.
>>
>>
>>
>> 2014-09-08 7:17 GMT-04:00 Palak Shah <spala...@gmail.com>:
>>
>>> I have a topology that uses a Kafka spout to read values from a Kafka
>>> queue. I used the kafkaSpout that came with storm-0.9.2-incubating.
>>>
>>> I observed that the nimbus rebalances the topology very often. The
>>> topology suddenly shuts down and starts again with the tasks running on
>>> different machines. I wanted to know why storm nimbus is rebalancing my
>>> topology, so I observed the storm throughput, latency, load on bolts and
>>> even system metrics like cpu and memory utilization, but I could not see a
>>> pattern.
>>>
>>> Can someone explain what factors lead to a topology being rebalanced
>>> in Storm?
>>>
>>> Thanks,
>>> Palak Shah
>>>
>>
>>
>>
>> --
>> Cyrille Karmann
>> +1-514-659-1209
>> cyrillekarm...@gmail.com
>>
>
>


-- 
Regards,
Vikas Agarwal
91 – 9928301411

InfoObjects, Inc.
Execution Matters
http://www.infoobjects.com
2041 Mission College Boulevard, #280
Santa Clara, CA 95054
+1 (408) 988-2000 Work
+1 (408) 716-2726 Fax
