On 11/10/11 08:12, Paolo Castagna wrote:
Hi Andy,
are you planning to put a few datasets in SVN together with the queries in 
JenaPerf?

I saw a data directory for LUBM but no data in it:
https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk/Benchmarks/LUBM/Data/

From a user perspective it would be great to just do:

   svn co https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk JenaPerf
   cd JenaPerf
   ./run

Installing any of LUBM, BSBM or SP2B (although not incredibly complicated) isn't 
trivial.

LUBM: The generator and test driver code is GPL. The queries I have are taken from the published paper, translated by me to SPARQL, so they can be distributed. Data can be generated.

BSBM: The queries are actually templates and instantiated at runtime using a configuration file which is generated when the data is generated. Generating data isn't just creating RDF triples.

The query templates exist in the code base (bsbmtools on SF). I have been talking to the creators and the license has changed from GPL to AL (thanks guys). So it will be possible to include queries from the codebase, though the templating will have to be written. (The license change affects JenaPerf because it is redistributing, unlike downloading and running.)
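
A minimal sketch of what such a templating step could look like, for illustration only. It is not the bsbmtools implementation: the `%Name%` placeholder syntax, the `instantiate` function, the example query and the `example.org` URI are all assumptions, and in practice the parameter values would come from the configuration file produced during data generation.

```python
import re

# Assumed placeholder syntax for this sketch: %Name%
PLACEHOLDER = re.compile(r"%(\w+)%")


def instantiate(template, params):
    """Return the template with every %Name% replaced by params[Name]."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            # Fail loudly rather than emit a query with a leftover placeholder.
            raise KeyError("no value for placeholder %" + name + "%")
        return params[name]

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical query template; not taken from any benchmark.
template = """SELECT ?product ?label
WHERE {
  ?product a %ProductType% ;
           <http://www.w3.org/2000/01/rdf-schema#label> ?label .
}
LIMIT 10"""

# Hypothetical parameter value standing in for the generated configuration.
query = instantiate(template, {"ProductType": "<http://example.org/ProductType1>"})
print(query)
```

Raising on a missing parameter means a mismatch between the templates and the generated configuration shows up immediately, rather than as a malformed query at benchmark time.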

SP2B is published under BSD.

        Andy

From a community and project perspective, it's quite good and helpful
to have a standard set of datasets. Although, I realize that if datasets
are not small, it might take a while to download them.

Can we use .gz datasets with JenaPerf?

We could also include small-to-medium-sized datasets together with JenaPerf
and have a separate checkout/download for larger ones.

What do you think?

Paolo
