[ https://issues.apache.org/jira/browse/JENA-29?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne resolved JENA-29.
-------------------------------

    Resolution: Fixed

Leaving the sub-task for QueryEngineHTTP open.

> cancellation during query execution
> -----------------------------------
>
>                 Key: JENA-29
>                 URL: https://issues.apache.org/jira/browse/JENA-29
>             Project: Jena
>          Issue Type: Improvement
>          Components: ARQ, TDB
>            Reporter: Simon Helsen
>            Assignee: Andy Seaborne
>         Attachments: JENA-29_ARQ_r8489.patch, JENA-29_TDB_r8489.patch, 
> JENA-29_tests_ARQ_r8489.patch, cancelFix3.patch, jena.patch, 
> jenaAddition.patch, queryIterRepeatApply.patch
>
>
> The requested improvement and proposed patch are made by Simon Helsen on 
> behalf of IBM.
> ARQ query execution currently has no satisfactory way to cancel a running 
> query safely. Moreover, cancel (unlike a hard abort) is especially useful 
> if it can provide partial result sets (i.e. all the results it managed to 
> compute up to when the cancellation was requested). Although the exact 
> cancellation behavior depends on the capabilities of the underlying triple 
> store, the proposed patch relies only on the iterators used by ARQ.
> Here is a more detailed explanation of the proposed changes:
> 1) the cancel() method in the QueryIterator initiates a cancellation request 
> (first boolean flag). In analogy with closeIterator(), it propagates through 
> all chained iterators, so the entire calculation is aware that a cancellation 
> is requested
> 2) to ensure thread-safe semantics, the cancelRequest becomes a real cancel 
> once nextBinding() has been called. It sets the second boolean, which is 
> used in hasNext(). This 2-phase approach is critical since the cancel() 
> method can be called at any time during a query execution by an external 
> thread. And because hasNext() has to return the *same* value until next() 
> is called, this is the only way to guarantee semantic safety when cancel() 
> is invoked (let me re-phrase this: it is the only way I was able to make it 
> actually work)
> 3) cancel() does not close anything since it allows execution to finish 
> normally, and the client is responsible for calling close() just like with 
> a regular execution. Note that the client has to call cancel() explicitly 
> (typically in another thread) and has to assume that the returned result 
> set may be incomplete if this method is called (it is undetermined whether 
> the result is _actually_ incomplete)
> 4) in order to deal with order-by and groups, I had to make two more changes. 
> First, I had to make QueryIterSort and QueryIterGroup slightly lazier. 
> Currently, the full result set is calculated during plan calculation. 
> With my proposed adjustments, this full result set is computed on the first 
> call to any of its Iterator methods (e.g. hasNext). This change does not 
> AFAIK affect the semantics. Second, because the desired behavior of 
> cancelling a sort or group query is to make sure everything is sorted/grouped 
> even if the total result set is not completed, I added an exception which 
> reverses the cancellation request of the encompassing iterator (as an example 
> see cancel() in QueryIterSort). This makes sure that the entire subset of 
> found and sorted elements is returned, not just the first element. However, 
> it also implies in the case of sort that when a query is cancelled, it will 
> first sort the partially complete result set before returning to the client.
> The attached patch is based on ARQ 2.8.5 (and a few classes in TDB 0.8.7; 
> possibly the other triple store implementations need adjustment as well).
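
For readers who want to see the two-phase idea from points 1) to 4) in code, here is a minimal sketch. It is not the attached patch and does not use the ARQ API: the class names TwoPhaseCancelIterator and LazySortIterator, the field names, and the use of plain java.util.Iterator are all assumptions made for illustration. The first class shows a cancel request (phase 1) that only takes effect inside next() (phase 2), so hasNext() keeps returning the same value until next() is called. The second class shows a sort that materializes its input lazily and forwards cancel() to the input only, so the partial input is still sorted and returned in full.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration only; not the ARQ QueryIterator interface. */
class TwoPhaseCancelIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    // Phase 1: set by cancel(), possibly from another thread.
    private volatile boolean cancelRequested = false;
    // Phase 2: set only inside next(), i.e. on the consuming thread.
    private boolean cancelled = false;

    TwoPhaseCancelIterator(Iterator<T> source) {
        this.source = source;
    }

    /** Requests cancellation and propagates it down a chain of iterators. */
    public void cancel() {
        cancelRequested = true;
        if (source instanceof TwoPhaseCancelIterator) {
            ((TwoPhaseCancelIterator<?>) source).cancel();
        }
    }

    @Override
    public boolean hasNext() {
        // Reads only the phase-2 flag, so the answer cannot change
        // between two hasNext() calls without an intervening next().
        return !cancelled && source.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T item = source.next();
        if (cancelRequested) {
            cancelled = true; // the request becomes a real cancel here
        }
        return item;
    }
}

/** Hypothetical lazy sort: nothing is computed until the first use. */
class LazySortIterator<T> implements Iterator<T> {
    private final TwoPhaseCancelIterator<T> input;
    private final Comparator<? super T> order;
    private Iterator<T> sorted; // null until the first hasNext()/next()

    LazySortIterator(TwoPhaseCancelIterator<T> input, Comparator<? super T> order) {
        this.input = input;
        this.order = order;
    }

    /** Stops the input early but never truncates the sorted output. */
    public void cancel() {
        input.cancel();
    }

    private Iterator<T> sorted() {
        if (sorted == null) {
            List<T> buffer = new ArrayList<>();
            while (input.hasNext()) {
                buffer.add(input.next());
            }
            buffer.sort(order); // sorts whatever was collected, possibly partial
            sorted = buffer.iterator();
        }
        return sorted;
    }

    @Override
    public boolean hasNext() {
        return sorted().hasNext();
    }

    @Override
    public T next() {
        return sorted().next();
    }
}
```

As in point 3), neither cancel() method closes anything in this sketch; a caller would still drain the iterator normally and treat the result as possibly incomplete.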

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
