On 15/07/16 15:10, Stian Soiland-Reyes wrote:
First of all the sorting could be sped up significantly on a multi-core machine by using Java 8's: http://docs.oracle.com/javase/8/docs/api/java/util/Arrays.html#parallelSort-T:A-
(makes note)
(The speed-up can in some cases be much larger than the multiple of the number of cores)
It might be faster to abandon the sort than make it finish naturally quickly. Testing that would be hard to do fairly. The proposal in my earlier message assumes that the extra layer of delegation in the CanAbortComparator doesn't present a significant overhead in the expected-usual case of running to completion, but I didn't try writing a performance test. (Or any other automated tests for cancellation at this time)
What about instead using a FutureTask, which you cancel with future.cancel(true)? The task would just call Arrays.parallelSort().
I don't speak FutureTask today, but maybe I will next week ...
The array might be in a weird state if you cancel/interrupt mid-sort (e.g. duplicate values) - but on cancelling you can clear/forget the array. And by keeping the array sorting in a separate thread you won't be interrupting something within Jena's internals (e.g. a TDB index update).
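For concreteness, the FutureTask idea might look roughly like the sketch below. The class name BackgroundSort and its start() method are made up for illustration and are not part of Jena. One thing worth verifying before relying on this: as far as I can tell, Arrays.parallelSort() does not check the thread's interrupt flag, so cancel(true) releases whoever is waiting on get() straight away, but the sort itself may carry on in the background until it finishes.

```java
import java.util.Arrays;
import java.util.concurrent.FutureTask;

// Hypothetical helper, not part of Jena: runs the sort on its own thread
// so that cancelling never interrupts the thread doing query/TDB work.
final class BackgroundSort {
    static <T extends Comparable<? super T>> FutureTask<T[]> start(T[] items) {
        // The task does nothing but sort the array and hand it back.
        FutureTask<T[]> task = new FutureTask<T[]>(() -> {
            Arrays.parallelSort(items);
            return items;
        });
        Thread worker = new Thread(task, "sort-worker");
        worker.setDaemon(true); // an abandoned sort must not keep the JVM alive
        worker.start();
        return task;
    }
}
```

A caller would hold on to the returned task, call task.cancel(true) when the query is cancelled, and then drop every reference to the array, since its contents are unspecified at that point. After a successful cancel, task.get() throws CancellationException instead of returning the array.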
I don't have a good mental model of how ARQ/Fuseki handle threading for requests/query(iterator)s. Any advice from those that do? It's something I can see myself getting horribly wrong ...

Chris
On 15 July 2016 at 12:08, Chris Dollin <[email protected]> wrote:

Dear All

When a query with an ORDER BY is cancelled, the component Arrays.sort() that sorts the chunk(s) of the result bindings runs to completion before the cancel finishes. [See QueryIterSort and SortedDataBag.] For a large result set, this results in a long wait before the cancelled request finally finishes. This can be inconvenient.

The cancel request can be sneaked into the sort by way of the comparator [1] and adding an instance variable `cancelled` to SortedDataBag, set `true` from QueryIterSort.requestCancel(). The comparator checks `cancelled` and if it has become `true` throws an exception, which is then caught outside the call to Arrays.sort(), abandoning the sort. See attached diff.

Questions arising:

* is it safe to abandon a sort from inside a comparator? (can't see anything that suggests otherwise.)
* are there threading issues that have to be dealt with other than by making the `cancelled` flag volatile?

If what I suggest appears to be sane I'll make it a pull request and run the process.

Chris

[1] Using a wrapper to handle the test for cancellation and then delegating `compare` to the comparator supplied to SortedDataBag.
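A minimal sketch of the wrapper described in [1], for readers without the attached diff. This is not the code from the diff: the names CancellableComparator, SortCancelledException and CancellableSort are invented here, and in the proposal the `cancelled` flag lives on SortedDataBag rather than on the comparator itself.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical exception used only to unwind out of Arrays.sort().
final class SortCancelledException extends RuntimeException {
    SortCancelledException() { super("sort cancelled"); }
}

// Hypothetical wrapper: checks the flag, then delegates to the real comparator.
final class CancellableComparator<T> implements Comparator<T> {
    private final Comparator<? super T> delegate;
    // volatile so that a write from the cancelling thread is visible
    // to the thread that is running the sort
    private volatile boolean cancelled = false;

    CancellableComparator(Comparator<? super T> delegate) {
        this.delegate = delegate;
    }

    void cancel() { cancelled = true; }

    @Override
    public int compare(T a, T b) {
        if (cancelled) throw new SortCancelledException();
        return delegate.compare(a, b);
    }
}

final class CancellableSort {
    // Returns true if the sort ran to completion, false if it was abandoned.
    static <T> boolean sort(T[] items, CancellableComparator<T> cmp) {
        try {
            Arrays.sort(items, cmp);
            return true;
        } catch (SortCancelledException e) {
            // The array order is unspecified here; the caller should discard it.
            return false;
        }
    }
}
```

The extra cost in the run-to-completion case is one volatile read and one delegated call per comparison. Whether that is significant would need measuring.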
