Sushant Kumar created STORM-299:
-----------------------------------
Summary: tick tuples stop in a long running topology
Key: STORM-299
URL: https://issues.apache.org/jira/browse/STORM-299
Project: Apache Storm (Incubating)
Issue Type: Bug
Affects Versions: 0.9.0.1
Reporter: Sushant Kumar
We are running a topology that processes about 10K tuples a second on a single
worker. There are about 80 executor threads across 3 different Bolt types, each
receiving a tick tuple at a frequency of 1 second. We observe that the tick
tuples stop coming in after few hours or sometime even after a day. The nimbus
UI also shows that the tick tuples count has frozen whereas the same set of
bolts continue to process tuples from other spouts. I have taken a stack trace
and it shows following:
"Thread-8" daemon prio=10 tid=0x00007f1a0fed2000 nid=0x4438 runnable
[0x00007f19a5e01000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:349)
at
com.lmax.disruptor.AbstractMultithreadedClaimStrategy.waitForFreeSlotAt(AbstractMultithreadedClaimStrategy.java:99)
at
com.lmax.disruptor.AbstractMultithreadedClaimStrategy.incrementAndGet(AbstractMultithreadedClaimStrategy.java:49)
at com.lmax.disruptor.Sequencer.next(Sequencer.java:127)
at backtype.storm.utils.DisruptorQueue.publish(DisruptorQueue.java:113)
at backtype.storm.disruptor$publish.invoke(disruptor.clj:51)
at backtype.storm.disruptor$publish.invoke(disruptor.clj:53)
at
backtype.storm.daemon.executor$setup_ticks_BANG_$fn__3885.invoke(executor.clj:299)
at
backtype.storm.timer$schedule_recurring$this__1839.invoke(timer.clj:79)
at
backtype.storm.timer$mk_timer$thread_fn__1822$fn__1823.invoke(timer.clj:32)
at backtype.storm.timer$mk_timer$thread_fn__1822.invoke(timer.clj:25)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:744)
Locked ownable synchronizers:
- None
"Thread-268-__system" prio=10 tid=0x00007f1a0ef45000 nid=0x46b5 runnable
[0x00007f17ef574000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000073d9a2060> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
at
com.lmax.disruptor.BlockingWaitStrategy.waitFor(BlockingWaitStrategy.java:87)
at
com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:54)
at
backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:56)
at
backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
at
backtype.storm.daemon.executor$eval4023$fn__4024$fn__4038$fn__4089.invoke(executor.clj:749)
at backtype.storm.util$async_loop$fn__364.invoke(util.clj:400)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:744)
Locked ownable synchronizers:
- None
We are running storm-0.9.0_wip21 and I would be glad to provide any other
information.
Also see https://issues.apache.org/jira/browse/STORM-106.
Thanks,
Sushant
--
This message was sent by Atlassian JIRA
(v6.2#6252)