(a bit of a "thinking out loud" message - feel free to comment)
I find myself wanting to test out ideas and wanting to quickly and
easily run performance tests to see if some change makes an observable
difference. Such changes may be small (e.g. a different implementation
of Binding) or they may be significant (e.g. paged radix trees for
indexing). At the moment, just setting up the test can be time-consuming.
There are various RDF benchmarks but each is a standalone system and
each generates results in its own format. It's hard enough to remember
how to get each one running because each is different.
Wouldn't it be nice if there were a standard framework for running
performance tests?
It would make it quicker to develop new benchmarks: every benchmark
has a target scenario in mind, and outside that scenario it's hard to
get much insight from the results.
It would also make it quicker to run on different setups, without
having to port each of the various existing benchmarks.
So the framework is:
+ an environment to run performance tests
+ a library of pre-developed/contributed tests
  (data + query mix scripts, drivers for specific systems)
+ common output formats
+ common ways to report results (spreadsheets, graphing tools)
+ documented and tested.
It could also be used for performance regression testing.
To this end, I've started a new module in SVN Experimental/JenaPerf. At
the moment it is no more than some small doodlings to get the classes
and structure right (it's in Scala). On the principle of start small
and iterate, it's going to be for SPARQL query tests first for something
easy, and generating CSV files of results. Scripting will be "unsubtle"
:-) Plugging-in runtime analysis tools is for "much later".
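To make the intended shape concrete, here's a minimal sketch of a pluggable driver plus CSV output. The names (QueryDriver, PerfHarness, etc.) are illustrative only, not what's in the SVN module; a real driver would wrap ARQ rather than the no-op stand-in used here:

```scala
// A minimal sketch of the harness shape, not the actual JenaPerf code:
// a pluggable driver runs each query, the harness times it, and
// results are collected as CSV. Names here are illustrative only.

trait QueryDriver {
  // Execute one query against the system under test.
  def run(query: String): Unit
}

final case class Result(name: String, avgMillis: Long)

object PerfHarness {
  // Time `reps` executions of a query; report the average in ms.
  def time(driver: QueryDriver, name: String, query: String, reps: Int = 5): Result = {
    val start = System.nanoTime()
    var i = 0
    while (i < reps) { driver.run(query); i += 1 }
    Result(name, (System.nanoTime() - start) / 1000000L / reps)
  }

  // Render results as simple CSV, one row per test.
  def toCsv(results: Seq[Result]): String =
    ("test,avg_ms" +: results.map(r => s"${r.name},${r.avgMillis}")).mkString("\n")
}

// Stand-in driver; a real one would execute the query with ARQ.
object NoopDriver extends QueryDriver {
  def run(query: String): Unit = ()
}

val report = PerfHarness.toCsv(Seq(
  PerfHarness.time(NoopDriver, "q1", "SELECT * WHERE { ?s ?p ?o }")
))
println(report)
```

The point of the driver trait is that the same query mix and reporting code can be reused across systems under test; only the driver changes.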
The other thing it would be good to have is a common public copy of the
various datasets so exactly the same data can be used in different
places. There is a time and place for randomized data creation ... and
a time and place for fixed data.
As these are quite big, SVN (or the like) is not the place for them,
nor are they the focus of the Apache CMS.
For now, I'll rsync what I have to appear in
http://people.apache.org/andy/RDF_Data. (This will have to be done
incrementally during "off peak" hours because it does rather use up all
the upstream bandwidth and my colleagues or family, depending on
location, might like to use some of it as well.)
What are good benchmarks? The usual suspects are LUBM, BSBM, SP2B. We
are interested in one that is centred on the queries and query patterns
we see in linked data applications.
Any other resources to draw on?
JUnitPerf (and things of that ilk) assume you are writing tests in code.
Such frameworks seem to provide little here because the bulk of the work
is writing the SPARQL-specific code. Useful learnings though.
The SPARQL-WG tests are scripted (in RDF) and that has worked very well.
Andy