On 02/12/2020 10:07, Osma Suominen wrote:
Hi Andy!

Andy Seaborne wrote on 1.12.2020 at 23:15:
I can see no reason why the special case of exactly one "FROM" can't be handled specially. It masks all named graphs, but as it is a rewrite from triples, that will be fine.

Right. Is this worth opening a JIRA issue?

If you want. It doesn't make it happen though; that takes coding.


TDB1: 0.825s
TDB2: 1.127s

No real difference to the quad version using GRAPH (see below).

For me:
About 300-400ms is going in the SELECT expression.

Do you mean that it would be possible to further optimize the query while getting the same result? I tested moving the SELECT expression inside the query as a BIND, but it didn't seem to make any difference.

I replaced the SELECT line with a COUNT to see where the time is going.

The query timing has a bug - it does not preload the result set formatting during the warmup so classloading (and JIT but that is a lesser issue) happen during the main execution. Approximately, junk the first timed run.

The rest go up and down because:
(1) other stuff on the machine going tick
(2) bits of successive JIT happening
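
A minimal Python sketch of the measurement advice above: time the same workload several times, set the first timed run aside, and summarise the rest. The names `time_runs` and `summarise` and the stand-in workload are hypothetical; this only illustrates the idea and is not how the Jena query command measures time.

```python
import statistics
import time


def time_runs(workload, repeats=5):
    """Time `workload` `repeats` times and return (first, rest) in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    # The first timed run absorbs one-off costs (in the JVM case, class
    # loading and early JIT), so it is kept apart from the others.
    return timings[0], timings[1:]


def summarise(rest):
    """The minimum is least affected by machine noise; the mean shows drift."""
    return {"min": min(rest), "mean": statistics.mean(rest)}


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would execute the query here.
    first, rest = time_runs(lambda: sum(range(100_000)))
    print("first (junked):", round(first, 4))
    print("rest:", summarise(rest))
```

Reporting the minimum alongside the mean makes it easier to tell noise from a real difference between runs.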

That gave me an idea. I was wondering why there was such a clear difference between Jena 3.8.0 and 3.16.0 in the initial benchmark (where I used average query times after the warmup), but no difference in the second one (where I used minimum query times).

I decided to look at the time for the first query after the partial warmup - when you said "junk the first timed run" I did exactly the opposite :) So I benchmarked the GRAPH variant of the query, on both TDB1 and TDB2, using all Jena releases from 3.8.0 to 3.16.0. I used --repeat 1,1 and ran all the queries 5 times as separate invocations, noting the minimum time. The results:

Lots of class loading and JIT ...


TDB1

3.8.0: 1.0s
3.9.0: 1.0s
3.10.0: 3.5s
3.11.0: 3.4s
3.12.0: 3.6s
3.13.0: 3.2s
3.14.0: 3.2s
3.15.0: 3.2s
3.16.0: 3.1s

TDB2

3.8.0: 1.4s
3.9.0: 1.4s
3.10.0: 5.4s
3.11.0: 5.7s
3.12.0: 5.6s
3.13.0: 5.9s
3.14.0: 5.4s
3.15.0: 5.6s
3.16.0: 5.7s
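
The procedure behind these tables (separate invocations, keeping the minimum of five) could be scripted roughly as below. This is a sketch under assumptions: `min_wall_time` is a hypothetical helper, the command is a placeholder, and it measures whole-process wall time, which includes process startup and so is not the same figure as the per-query time the query tool reports.

```python
import subprocess
import sys
import time


def min_wall_time(cmd, runs=5):
    """Run `cmd` as `runs` separate processes; return the fastest wall time."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # A fresh process each time means nothing stays warm between runs.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Placeholder command; substitute the real query invocation here.
    print(min_wall_time([sys.executable, "-c", "pass"]))
```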

Now there is a clear pattern: starting from Jena 3.10.0, the first timed run is much slower. So something happened there that makes the first full query (after the partial warmup) take much longer than it used to - but apparently not the subsequent ones which I timed yesterday.
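
To make the pattern explicit, here is a small Python sketch that scans the figures above for the first release that is markedly slower than its predecessor. The 2x threshold and the helper name `first_jump` are assumptions chosen for illustration; the timings are copied from the tables.

```python
# First-timed-run results in seconds, copied from the tables above.
TDB1 = {"3.8.0": 1.0, "3.9.0": 1.0, "3.10.0": 3.5, "3.11.0": 3.4,
        "3.12.0": 3.6, "3.13.0": 3.2, "3.14.0": 3.2, "3.15.0": 3.2,
        "3.16.0": 3.1}
TDB2 = {"3.8.0": 1.4, "3.9.0": 1.4, "3.10.0": 5.4, "3.11.0": 5.7,
        "3.12.0": 5.6, "3.13.0": 5.9, "3.14.0": 5.4, "3.15.0": 5.6,
        "3.16.0": 5.7}


def first_jump(times, factor=2.0):
    """Return (release, ratio) for the first release slower than its
    predecessor by more than `factor`, or None if there is no such jump."""
    releases = list(times)  # dicts preserve insertion order
    for prev, cur in zip(releases, releases[1:]):
        ratio = times[cur] / times[prev]
        if ratio > factor:
            return cur, round(ratio, 2)
    return None


if __name__ == "__main__":
    print("TDB1:", first_jump(TDB1))  # ('3.10.0', 3.5)
    print("TDB2:", first_jump(TDB2))  # ('3.10.0', 3.86)
```

On these numbers the first timed run becomes roughly 3.5 to 3.9 times slower at 3.10.0 and stays at that level through 3.16.0.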

Any ideas what this could be about? Should this be investigated more?

It would be great if you would.

I have a change lined up to fix "--out none" to be "less optimized"; it currently does nothing (very efficiently), whereas it could silently consume the query results.

It would also be better if the warmup wrote the required format to /dev/null.
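
The difference between the two ideas can be sketched conceptually in Python. This is not Jena code: `consume_silently` and `write_to_devnull` are hypothetical names, and rows are modelled as plain tuples. The first function walks the results without formatting them; the second also exercises the formatting path while discarding the output.

```python
import os


def consume_silently(rows):
    """Iterate over every result row without formatting or printing it."""
    count = 0
    for _ in rows:
        count += 1
    return count


def write_to_devnull(rows):
    """Format every row as text and write it to the null device."""
    count = 0
    with open(os.devnull, "w") as sink:
        for row in rows:
            # Formatting happens here, so this path is warmed up too.
            sink.write("\t".join(str(v) for v in row) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    sample = [(1, "a"), (2, "b")]
    print(consume_silently(iter(sample)), write_to_devnull(sample))
```

Only the second variant touches the output formatting code, which is the part that was not being loaded during the warmup.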

    Andy

I looked at the 3.10.0 release notes but I couldn't see anything obviously related. Of course the release happened almost two years ago, so it's been a while...

-Osma
