[ 
https://issues.apache.org/jira/browse/DRILL-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Westin updated DRILL-2502:
--------------------------------
    Summary: Improve Foreman code safety by providing a generic event delivery 
mechanism that supports queueing  (was: Improve code safety by providing a 
generic event delivery mechanism that supports queueing)

> Improve Foreman code safety by providing a generic event delivery mechanism 
> that supports queueing
> --------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-2502
>                 URL: https://issues.apache.org/jira/browse/DRILL-2502
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>            Reporter: Chris Westin
>            Assignee: Jacques Nadeau
>             Fix For: 0.8.0
>
>         Attachments: DRILL-2502.1.patch.txt
>
>
> While working on DRILL-2245, I ran into problems because 
> Foreman.moveToState() can be called recursively. Java monitors are 
> reentrant, so declaring the method synchronized doesn't prevent the same 
> thread from calling back into it. When that happens, state transitions are 
> either dropped or broken, depending on the point at which the recursive call 
> occurs. A state transition really needs to be handled atomically, but when 
> queryContext.cancelExecutingFragments() is called, the completion of the 
> cancellations results in QueryManager.fragmentDone() being called, and it 
> indiscriminately sends the COMPLETED state, even though we may be in the 
> middle of handling a move to FAILED or CANCELLATION_REQUESTED. Unless the 
> target state is recorded before any possible recursive call happens, the 
> recursive transition is the one recorded, and the second transition doesn't 
> result in any message being sent to the user -- the client hangs.
> It's difficult to reason about when it's safe for this kind of recursive 
> call to happen, and even more difficult to avoid inadvertently introducing 
> one. Problems happen when the delivery of an event to a listener causes 
> cascaded events that are eventually delivered to the same listener.
> In order to avoid this, we should create (and use) a generic class that can 
> receive such events. If nothing is happening when one is received, it is 
> processed immediately, in line. If an event is received when we're in the 
> middle of processing another event, the newly received one is queued. When 
> the current event processing is completed, we then go through any queued 
> events, and process them one by one. We continue in this way until there are 
> no more events left, then return. This mechanism would improve code safety by 
> preventing the delivery of events while in the middle of processing other 
> events by the same handler.
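
The mechanism described above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not the code in the attached
patch: the names QueueingEventProcessor, sendEvent(), processEvent() and
QueueingDemo are hypothetical, and the String events stand in for real state
transitions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch: delivers events one at a time, queueing any event that
// arrives while another one is still being processed.
abstract class QueueingEventProcessor<T> {
    private final Deque<T> pending = new ArrayDeque<>();
    private boolean processing = false;

    public void sendEvent(T event) {
        Objects.requireNonNull(event, "event");
        synchronized (pending) {
            if (processing) {
                // Someone (possibly this thread, recursively) is already
                // handling an event: queue this one and return immediately.
                pending.addLast(event);
                return;
            }
            processing = true;
        }

        T current = event;
        try {
            while (current != null) {
                // Run the handler outside the lock so it may call sendEvent().
                processEvent(current);
                synchronized (pending) {
                    current = pending.pollFirst();
                    if (current == null) {
                        processing = false;
                    }
                }
            }
        } catch (RuntimeException | Error e) {
            // Assumption: on failure we just release the flag. Events still
            // queued stay queued; a real implementation must decide their fate.
            synchronized (pending) {
                processing = false;
            }
            throw e;
        }
    }

    protected abstract void processEvent(T event);
}

final class QueueingDemo {
    static List<String> run() {
        final List<String> log = new ArrayList<>();
        QueueingEventProcessor<String> processor = new QueueingEventProcessor<String>() {
            @Override
            protected void processEvent(String event) {
                log.add("start " + event);
                if (event.equals("CANCEL")) {
                    // Cascaded event sent while CANCEL is still being handled.
                    sendEvent("COMPLETED");
                }
                log.add("end " + event);
            }
        };
        processor.sendEvent("CANCEL");
        return log;
    }

    public static void main(String[] args) {
        // prints [start CANCEL, end CANCEL, start COMPLETED, end COMPLETED]
        System.out.println(run());
    }
}
```

In this sketch the cascaded COMPLETED event is not handled until the CANCEL
handler has returned, so the two never interleave. The handler runs outside
the lock so that it can call sendEvent() itself without blocking.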

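The failure mode that motivates the request can be reproduced in a few lines.
The sketch below is a simplified model and not the Foreman code: the class
ReentrantTransitionDemo and its State enum are hypothetical, and the nested
call stands in for fragment cancellation reporting COMPLETED.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a synchronized state-transition method that is
// re-entered by the thread that already holds the monitor.
final class ReentrantTransitionDemo {
    enum State { RUNNING, CANCELLATION_REQUESTED, COMPLETED }

    private State state = State.RUNNING;
    final List<String> log = new ArrayList<>();

    // synchronized keeps other threads out, but Java monitors are reentrant,
    // so the owning thread can call back in while a transition is in progress.
    synchronized void moveToState(State target) {
        log.add("enter " + target);
        if (target == State.CANCELLATION_REQUESTED) {
            // Stand-in for cancelling fragments, whose completion callback
            // reports COMPLETED on the same thread.
            moveToState(State.COMPLETED);
        }
        state = target;
        log.add("exit " + target);
    }

    synchronized State getState() {
        return state;
    }

    public static void main(String[] args) {
        ReentrantTransitionDemo demo = new ReentrantTransitionDemo();
        demo.moveToState(State.CANCELLATION_REQUESTED);
        System.out.println(demo.log);
        System.out.println(demo.getState());
    }
}
```

Here the COMPLETED transition runs to completion inside the
CANCELLATION_REQUESTED transition, and the outer call then overwrites the
state it recorded. Which transition "wins" depends only on where the nested
call happens to sit relative to the state update.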


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
