[ 
https://issues.apache.org/jira/browse/JENA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085376#comment-13085376
 ] 

Simon Helsen commented on JENA-100:
-----------------------------------

Thanks, Stephen. In https://issues.apache.org/jira/browse/JENA-93 I noticed that 
cancelAllowContinue sometimes throws an exception which it shouldn't. There may 
be a relation to what you found here. Once your patch makes it into a build, 
I'll test if it resolves the problem I observed.

Re: complexity. Yes, I know, because in the patch I originally submitted for 
this feature I had to introduce a multi-staged cancellation mechanism so that 
the iterators could drain gracefully in a multi-threaded environment. It went 
something like this: when cancel() is invoked, notify the iterator stack that 
a cancel has been requested, but do not cancel just yet (!). Instead, all 
iterators have to be able to continue until the first execution of next() 
finishes. It is not sufficient to have hasNext() return false immediately, 
because in a multi-threaded world you may end up in a situation where a call 
to hasNext() succeeded but next() has not yet been executed (and next() 
usually calls hasNext() under the hood again). So, once the first next() has 
returned, the cancel becomes "active" (stage 2) and hasNext() starts 
returning false.
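
To make the two stages concrete, here is a minimal sketch of the idea in Java. 
It is not the code from the original patch and not the ARQ implementation: the 
class name TwoStageCancelIterator and the fields cancelRequested and 
cancelActive are hypothetical, and a single wrapper stands in for the whole 
iterator stack. It only shows that a cancel request leaves one more next() 
possible before hasNext() starts returning false.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical wrapper (not part of ARQ) illustrating a two-stage cancel. */
final class TwoStageCancelIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    // Private lock object, so outside code cannot interfere by locking "this".
    private final Object lock = new Object();
    private boolean cancelRequested = false; // stage 1: cancel was asked for
    private boolean cancelActive = false;    // stage 2: cancel is in effect

    TwoStageCancelIterator(Iterator<T> inner) {
        this.inner = inner;
    }

    /** Stage 1: record the request; iteration is not stopped yet. */
    void cancel() {
        synchronized (lock) {
            cancelRequested = true;
        }
    }

    @Override
    public boolean hasNext() {
        synchronized (lock) {
            // Only an active (stage 2) cancel hides the remaining items.
            return !cancelActive && inner.hasNext();
        }
    }

    @Override
    public T next() {
        synchronized (lock) {
            if (cancelActive || !inner.hasNext()) {
                throw new NoSuchElementException();
            }
            T item = inner.next();
            // The first next() that completes after a request activates the cancel.
            if (cancelRequested) {
                cancelActive = true;
            }
            return item;
        }
    }
}
```

With this shape, a caller that saw hasNext() return true just before another 
thread called cancel() can still complete its next() call; only afterwards 
does the iterator report that it is exhausted.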

This two-stage cancellation was applied to ARQ in our released product and 
works very reliably. I hope cancelAllowContinue() can be implemented in a 
similar manner. Otherwise, it would hinder us from upgrading to a later 
ARQ/TDB.

> QueryIteratorBase concurrency issues
> ------------------------------------
>
>                 Key: JENA-100
>                 URL: https://issues.apache.org/jira/browse/JENA-100
>             Project: Jena
>          Issue Type: Bug
>          Components: ARQ
>            Reporter: Stephen Allen
>         Attachments: JENA-100-r1157891.patch
>
>
> QueryIteratorBase appears to have some concurrency bugs relating to 
> cancelling a query:
> 1) The cancel() and cancelAllowContinue() methods did not have large enough 
> synchronized blocks (they also used "this" as the lock, which is not 
> recommended)
> 2) The order of setting the cancellation flags and the notifying subclasses 
> via the requestCancel() method was incorrect
> 3) The visibility and happens-before relationships were incorrect for the 
> requestingCancel and abortIterator variables
> The cancelAllowContinue() feature adds a lot of complexity in terms of 
> visibility and ordering.  Unfortunately it is hard to write test cases for 
> these types of concurrency issues, so the existing tests did not uncover the 
> issues.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
