On 15/10/11 18:46, Paolo Castagna wrote:
Andy Seaborne wrote:
On 11/10/11 08:12, Paolo Castagna wrote:
Hi Andy,
are you planning to put a few datasets in SVN together with the
queries in JenaPerf?
I saw a data directory for LUBM but not data in it:
https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk/Benchmarks/LUBM/Data/
From a user perspective it would be great to just do:
svn co
https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk
JenaPerf
cd JenaPerf
./run
Installing any of LUBM, BSBM or SP2B (although not incredibly
complicated) isn't trivial.
LUBM: The generator and test driver code is GPL. The queries I have are
taken from the published paper, translated by me to SPARQL, so they can
be distributed. Data can be generated.
Could we generate some datasets using LUBM and make them available somewhere
in SVN to check out together with JenaPerf so that users wanting to run LUBM
via JenaPerf do not need to generate data using LUBM themselves?
It's more complicated: some of the benchmarks are pointless when small.
svn is not the place for gigabyte binaries. We could put data up
somewhere in file space.
Is people.apache.org the place? http://people.apache.org/~andy/RDF_Data/
Is there another filespace for projects in Apache?
It takes hours to upload the files. And don't forget about files >4G.
Or, if adding datasets to JenaPerf causes problems from a licensing point of
view, could we generate a few LUBM datasets, make them available to download
somewhere else, and make JenaPerf download them when you need to run LUBM
queries?
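A minimal sketch of the "download when needed" idea, in Python purely for
illustration. The function names, the cache directory and the base URL are
all hypothetical; nothing like this exists in JenaPerf today. It fetches a
dataset file only if it is not already cached locally, and streams it in
chunks so that files larger than 4G are never held in memory:

```python
import os
import shutil
import urllib.request


def ensure_dataset(file_name, base_url, cache_dir, fetch=None):
    """Return a local path for file_name, downloading it only if missing.

    All names here are illustrative assumptions, not JenaPerf code.
    """
    if fetch is None:
        fetch = _stream_to_file
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, file_name)
    if os.path.exists(target):
        return target  # already cached, no network access needed
    # Download to a temporary name first, so an interrupted transfer
    # never leaves a truncated file that looks complete.
    partial = target + ".part"
    fetch(base_url.rstrip("/") + "/" + file_name, partial)
    os.replace(partial, target)
    return target


def _stream_to_file(url, path):
    # Copy in 1 MiB chunks so multi-gigabyte files stay out of memory.
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out, length=1024 * 1024)
```

A run script could call ensure_dataset() for each file a benchmark needs
before starting the queries, so only the first run pays the download cost.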
No licensing issues I can see. It's the generator java code that
explicitly states it's GPL.
You are running on a GPL OS. Does that make the output of "echo foo"
GPL? What about the output of gcc(1)?
The reason why I'd like to add datasets in addition to the queries is to make
life easier for users. It would be good to just check out/download JenaPerf and
run it without the need to install LUBM, BSBM and/or SP2B and generate datasets
using those.
Checking out from svn is not the way to deliver for convenient use.
Much better to release precompiled binaries, because users may be running
tests on a machine without javacc (or Scala).
Andy