[
https://issues.apache.org/jira/browse/JENA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085376#comment-13085376
]
Simon Helsen commented on JENA-100:
-----------------------------------
thanks Stephen. In https://issues.apache.org/jira/browse/JENA-93 I noticed that
cancelAllowContinue sometimes throws an exception which it shouldn't. There may
be a relation to what you found here. Once your patch makes it into a build,
I'll test if it resolves the problem I observed.
re: complexity. Yes, I know because in the patch I original submitted for this
feature, I had to introduce a multi-staged cancellation mechanism to gracefully
allow the drain in a multi-threaded environment. It went something like this:
when a cancel() is invoked, notify the iterator stack the cancel has been
requested, but do not cancel just yet (!) Instead, all iterators have to be
able to continue until the first execution of next() finishes. It is not
sufficient to have hasNext() return false immediately because in a
multi-threaded world, you may end up in a situation where a call to hasNext
succeeded, but next() was not yet executed (and it usually calls hasNext under
the hood again). So, after the first next() was retrieved, the cancel becomes
"active" (stage 2) and hasNext() will start returning false.
This 2-staged cancellation was applied to ARQ in our released product and works
very reliably. I hope cancelAllowContinue() can be implemented in a similar
manner. It would hinder us from upgrading to a later ARQ/TDB
> QueryIteratorBase concurrency issues
> ------------------------------------
>
> Key: JENA-100
> URL: https://issues.apache.org/jira/browse/JENA-100
> Project: Jena
> Issue Type: Bug
> Components: ARQ
> Reporter: Stephen Allen
> Attachments: JENA-100-r1157891.patch
>
>
> QueryIteratorBase appears to have some concurrency bugs relating to
> cancelling a query:
> 1) The cancel() and cancelAllowContinue() methods did not have large enough
> synchronized blocks (they also used "this" as the lock, which is not
> recommended)
> 2) The order of setting the cancellation flags and the notifying subclasses
> via the requestCancel() method was incorrect
> 3) The visibility and happens-before relationships were incorrect for the
> requestingCancel and abortIterator variables
> The cancelAllowContinue() feature adds a lot of complexity in terms of
> visibility and ordering. Unfortunately it is hard to write test cases for
> these types of concurrency issues, so the existing tests did not uncover the
> issues.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira