Replying to myself, as I did some follow-up tests.
Osma Suominen wrote on 4.12.2020 at 18:42:
Now this turned into a rather interesting exercise in using git bisect.
I was able to track down the change that caused the slowdown. It's this
merge commit:
[f93fdbad7aa8d6ddb46693395e3bfb5ea487bf16] JENA-1648: Merge commit
'refs/pull/507/head' of https://github.com/apache/jena
which refers to this pull request:
https://github.com/apache/jena/pull/507
I don't have time for a very deep analysis right now, but it doesn't
surprise me that a substantial change to the query result serialization
slows down the queries.
Things to check: (mostly as a TODO list for myself)
1. Does this depend on the query result format? For example, is only the
text format (default) slower than before?
2. Is there something suspicious in the PR 507 code that would explain
why it's so much slower?
This affects at least the CSV format too, so it's not just the text
output format.
But I figured out that the real change here is simply that the warmup
performed when using the --repeat parameter with two arguments has
become less effective starting with Jena 3.10.0. When no warmup is used,
the performance is the same for the different Jena versions.
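To make the comparison concrete, here is a minimal Python sketch of the two measurement modes: timed runs only, versus untimed warmup runs followed by timed runs. It is only an illustration of the idea, not how Jena implements --repeat; the names time_runs and sample_task are hypothetical, and sample_task is just a stand-in for running one query and serializing its results.

```python
import time


def time_runs(task, repeats, warmup=0):
    """Run `task` `warmup` times untimed, then `repeats` times timed.

    Returns the per-run wall-clock durations (in seconds) of the timed runs.
    """
    for _ in range(warmup):
        task()  # warmup runs: executed, but their timings are discarded

    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def sample_task():
    # Hypothetical stand-in for executing one query and writing its results.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    cold = time_runs(sample_task, repeats=5)              # no warmup
    warm = time_runs(sample_task, repeats=5, warmup=20)   # warmup, then timed
    print("no warmup:   min %.6f  max %.6f" % (min(cold), max(cold)))
    print("with warmup: min %.6f  max %.6f" % (min(warm), max(warm)))
```

If the warmup phase does not exercise the same code paths as the timed runs, the timed runs still pay the startup cost, which is the kind of effect described above.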
Now that Andy has implemented JENA-2007, which improves the warmup, I
think the problem has already been solved.
Case closed.
-Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 15 (Unioninkatu 36)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
[email protected]
http://www.nationallibrary.fi