Hi,

I am observing that my Storm topology intermittently freezes and
stops processing tuples from Kafka. This happens frequently, and
each freeze lasts 5 to 15 minutes. Nothing is written to any of
the worker log files during this time.

I am using Storm 1.0.2 and Kafka 0.9.0.

Any suggestions on how to solve this issue?

Thanks,
Sreeram

The supervisor log at the time of the freeze looks like this:

2017-07-12 14:38:46.712 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
2017-07-12 14:38:47.212 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
2017-07-12 14:38:47.712 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
2017-07-12 14:38:48.213 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
2017-07-12 14:38:48.713 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
2017-07-12 14:38:49.213 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
2017-07-12 14:38:49.713 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
2017-07-12 14:38:50.214 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
2017-07-12 14:38:50.714 o.a.s.d.supervisor [INFO]
d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started


Thread stacks (sample)
Most of the worker threads during the freeze look like one of the
two stack traces below.

Thread 104773: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame;
information may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object,
long) @bci=20, line=215 (Compiled frame)
 - 
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.util.concurrent.SynchronousQueue$TransferStack$SNode,
boolean, long) @bci=160, line=460 (Compiled frame)
 - 
java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.lang.Object,
boolean, long) @bci=102, line=362 (Compiled frame)
 - java.util.concurrent.SynchronousQueue.poll(long,
java.util.concurrent.TimeUnit) @bci=11, line=941 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=134,
line=1066 (Compiled frame)
 - 
java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
@bci=26, line=1127 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5,
line=617 (Compiled frame)
 - java.lang.Thread.run() @bci=11, line=745 (Compiled frame)

Thread 147495: (state = IN_NATIVE)
 - sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int) @bci=0
(Compiled frame; information may be imprecise)
 - sun.nio.ch.EPollArrayWrapper.poll(long) @bci=18, line=269 (Compiled frame)
 - sun.nio.ch.EPollSelectorImpl.doSelect(long) @bci=28, line=93 (Compiled frame)
 - sun.nio.ch.SelectorImpl.lockAndDoSelect(long) @bci=37, line=86
(Compiled frame)
 - sun.nio.ch.SelectorImpl.select(long) @bci=30, line=97 (Compiled frame)
 - org.apache.kafka.common.network.Selector.select(long) @bci=35,
line=425 (Compiled frame)
 - org.apache.kafka.common.network.Selector.poll(long) @bci=81,
line=254 (Compiled frame)
 - org.apache.kafka.clients.NetworkClient.poll(long, long) @bci=84,
line=270 (Compiled frame)
 - org.apache.kafka.clients.producer.internals.Sender.run(long)
@bci=343, line=216 (Compiled frame)
 - org.apache.kafka.clients.producer.internals.Sender.run() @bci=27,
line=128 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=745 (Compiled frame)
