On 15/07/16 15:10, Stian Soiland-Reyes wrote:
First of all the sorting could be sped up significantly on a
multi-core machine by using Java 8's:
http://docs.oracle.com/javase/8/docs/api/java/util/Arrays.html#parallelSort-T:A-

(makes note)

(The speed-up can in some cases be much larger than the multiple of
the number of cores)

It might be faster to abandon the sort than to make it run to
completion more quickly. Testing that fairly would be hard. The proposal
in my earlier message assumes that the extra layer of delegation
in the CanAbortComparator doesn't present a significant overhead
in the expected-usual case of running to completion, but I didn't
try writing a performance test.

(Or any other automated tests for cancellation at this time)

What about instead using a FutureTask which you cancel with
future.cancel(true)? The task would just call Arrays.parallelSort().
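
A minimal sketch of that idea, for illustration only. The class name
FutureSortSketch, the method startSort and the thread name are made up
here and are not Jena code. One caveat worth testing: cancel(true)
interrupts the worker thread, but the Arrays.parallelSort()
documentation does not promise that the sort reacts to interruption, so
the sort itself may carry on in the background even though the waiting
caller is released immediately.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;

public class FutureSortSketch {

    /**
     * Starts sorting `items` on a separate (hypothetical) worker thread
     * and returns the FutureTask so the caller can wait on it or cancel it.
     */
    static <T> FutureTask<T[]> startSort(T[] items, Comparator<? super T> cmp) {
        Callable<T[]> job = () -> {
            Arrays.parallelSort(items, cmp);
            return items;
        };
        FutureTask<T[]> task = new FutureTask<>(job);
        Thread worker = new Thread(task, "sort-worker");
        // Daemon, so an abandoned sort cannot keep the JVM alive.
        worker.setDaemon(true);
        worker.start();
        return task;
    }
}
```

A caller would hold on to the returned task and call task.cancel(true)
when the query is cancelled. After that, task.get() throws
CancellationException instead of returning the array, and the array
should be treated as garbage and dropped.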

I don't speak FutureTask today, but maybe I will next week ...

The array might be in a weird state if you cancel/interrupt mid-sort
(e.g. duplicate values) - but on cancelling you can clear/forget the
array. And by keeping the array sorting in a separate thread you won't
be interrupting something within Jena's internals (e.g. a TDB index
update).

I don't have a good mental model of how ARQ/Fuseki handle threading
for requests/query(iterator)s. Any advice from those who do? It's
something I can see myself getting horribly wrong ...

Chris

On 15 July 2016 at 12:08, Chris Dollin <[email protected]> wrote:
Dear All

When a query with an ORDER BY is cancelled, the component
Arrays.sort() that sorts the chunk(s) of the result
bindings runs to completion before the cancel finishes.
[See QueryIterSort and SortedDataBag.]

For a large result set, this results in a long wait
before the cancelled request finally finishes. This
can be inconvenient.

The cancel request can be sneaked into the sort by
way of the comparator [1] and adding an instance
variable `cancelled` to SortedDataBag, set `true`
from QueryIterSort.requestCancel(). The comparator
checks `cancelled` and if it has become `true`
throws an exception, which is then caught outside
the call to Arrays.sort(), abandoning the sort.
See attached diff.
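
For readers without the attachment, the mechanism can be sketched
roughly as follows. This is not the attached diff: the class
AbortableSortSketch, the exception SortAbortedException and the helper
sortUnlessCancelled are hypothetical names, and for brevity the
`cancelled` flag lives in the wrapper comparator here rather than in
SortedDataBag.

```java
import java.util.Arrays;
import java.util.Comparator;

public class AbortableSortSketch {

    /** Unchecked, so it can propagate out of Arrays.sort(). Hypothetical name. */
    static class SortAbortedException extends RuntimeException {
        SortAbortedException() {
            super("sort cancelled");
        }
    }

    /** Checks the cancel flag, then delegates to the supplied comparator. */
    static class CanAbortComparator<T> implements Comparator<T> {
        private final Comparator<? super T> delegate;
        // volatile: written by the cancelling thread, read by the sorting thread.
        private volatile boolean cancelled = false;

        CanAbortComparator(Comparator<? super T> delegate) {
            this.delegate = delegate;
        }

        void cancel() {
            cancelled = true;
        }

        @Override
        public int compare(T a, T b) {
            if (cancelled) {
                throw new SortAbortedException();
            }
            return delegate.compare(a, b);
        }
    }

    /** Returns true if the sort ran to completion, false if it was abandoned. */
    static <T> boolean sortUnlessCancelled(T[] items, CanAbortComparator<T> cmp) {
        try {
            Arrays.sort(items, cmp);
            return true;
        } catch (SortAbortedException e) {
            // The array may be only partly sorted here; the caller should discard it.
            return false;
        }
    }
}
```

In this sketch the cancelling thread calls cancel() on the wrapper, and
the next comparison made by the sort throws, so the delay before the
sort stops is at most one call to the delegate's compare().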

Questions arising:

* is it safe to abandon a sort from inside a comparator?
  (can't see anything that suggests otherwise.)

* are there threading issues that have to be dealt with
  other than by making the `cancelled` flag volatile?

If what I suggest appears to be sane I'll make it a
 pull request and run the process.

Chris

[1] Using a wrapper to handle the test for cancellation
    and then delegating `compare` to the comparator
    supplied to SortedDataBag.

