[ 
https://issues.apache.org/jira/browse/STORM-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans updated STORM-2017:
---------------------------------------
    Assignee: Lasse Kiviluoto

> ShellBolt stops reporting task ids
> ----------------------------------
>
>                 Key: STORM-2017
>                 URL: https://issues.apache.org/jira/browse/STORM-2017
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 1.0.1, 1.0.3
>            Reporter: Lasse Kiviluoto
>            Assignee: Lasse Kiviluoto
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> After running enough flow through ShellBolt, in some cases after tens of 
> minutes, ShellBolt stopped reporting task ids. After this error condition no 
> new task ids were reported back. When acking of the tuples processed by the 
> bolt was done in a callback triggered by the arrival of the task ids, all 
> tuple trees going through the bolt would fail after reporting stopped. 
> ShellBolt would continue to process new tuples and respond to heartbeats.
> After running some tests and making some changes to the code, I have the 
> following hypothesis for the reason:
> org.apache.storm.utils.ShellBoltMessageQueue has two queues, one for 
> taskIds and the other for bolt messages.
> The taskIds queue is implemented by a LinkedList and the bolt msg queue by a 
> LinkedBlockingQueue. Both queues are operated similarly.
> One major difference between the structures is that LinkedList is not 
> synchronized.
> In the code:
> At ShellBoltMessageQueue.java:58 the add method is called without holding 
> the lock, whereas ShellBoltMessageQueue.java:110 calls the poll method with 
> the lock held. As BoltReaderRunnable and BoltWriterRunnable in ShellBolt run 
> concurrently, this can lead to a race condition.
> If I move the call at ShellBoltMessageQueue.java:58 inside the lock and run 
> the test in a similar fashion, it seems to solve the issue.
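
The proposed change can be illustrated with a minimal sketch. This is not the actual Storm source: the class name TaskIdQueueSketch and its field names are hypothetical, and only the task-id queue is modelled. It shows the pattern described above, where the add on the unsynchronized LinkedList happens inside the same lock that poll already holds.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-in for the task-id half of
// ShellBoltMessageQueue; names and structure are assumptions.
class TaskIdQueueSketch {
    // LinkedList is not thread-safe, so every access must hold takeLock.
    private final LinkedList<List<Integer>> taskIdsQueue = new LinkedList<>();
    private final ReentrantLock takeLock = new ReentrantLock();
    private final Condition notEmpty = takeLock.newCondition();

    // Producer side: the add now happens inside the lock, not before it.
    void putTaskIds(List<Integer> taskIds) {
        takeLock.lock();
        try {
            taskIdsQueue.add(taskIds);
            notEmpty.signal();
        } finally {
            takeLock.unlock();
        }
    }

    // Consumer side: waits up to the timeout, returns null if nothing arrives.
    List<Integer> poll(long timeout, TimeUnit unit) throws InterruptedException {
        long nanos = unit.toNanos(timeout);
        takeLock.lockInterruptibly();
        try {
            while (taskIdsQueue.isEmpty()) {
                if (nanos <= 0) {
                    return null;
                }
                nanos = notEmpty.awaitNanos(nanos);
            }
            return taskIdsQueue.poll();
        } finally {
            takeLock.unlock();
        }
    }
}
```

With both add and poll guarded by the same lock, the two threads can no longer modify the list's internal links at the same time, which is the condition the hypothesis blames for the lost task ids.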



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
