Re: JenaPerf and datasets...

Andy Seaborne Sat, 15 Oct 2011 09:26:42 -0700

On 11/10/11 08:12, Paolo Castagna wrote:

Hi Andy,
are you planning to put a few datasets in SVN together with the queries in JenaPerf?


I saw a data directory for LUBM but no data in it:
https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk/Benchmarks/LUBM/Data/

From a user perspective it would be great to just do:

   svn co https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk JenaPerf
   cd JenaPerf
   ./run

Installing any of LUBM, BSBM or SP2B (although not incredibly complicated) isn't trivial.

LUBM: The generator and test driver code is GPL. The queries I have are taken from the published paper, translated by me to SPARQL, so they can be distributed. Data can be generated.

BSBM: The queries are actually templates, instantiated at runtime using a configuration file which is generated when the data is generated. Generating data isn't just creating RDF triples.

The query templates exist in the code base (bsbmtools on SF). I have been talking to the creators and the license has changed from GPL to AL (thanks guys). So it will be possible to include queries from the codebase; the templating will have to be written. (The license change affects JenaPerf because it is redistributing, unlike downloading and running.)
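
As a rough illustration of the kind of templating that would have to be written, here is a minimal sketch in Python. It is not bsbmtools code: the %name% placeholder syntax, the instantiate() helper, the example query and the example.org value are all assumptions made for illustration. In a real run the parameter values would come from the configuration file generated alongside the data.

```python
import re

# Assumed placeholder syntax: %name% (hypothetical, not confirmed in this thread).
PLACEHOLDER = re.compile(r"%(\w+)%")


def instantiate(template, params):
    """Replace every %name% in the template with params[name].

    Raises KeyError if the template uses a placeholder with no value,
    so a half-instantiated query is never sent to the store.
    """
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value for placeholder %{name}%")
        return params[name]

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical query template, for illustration only.
template = """SELECT ?product ?label
WHERE {
  ?product <http://www.w3.org/2000/01/rdf-schema#label> ?label .
  ?product a %ProductType% .
}"""

# Hypothetical parameter value; real values would be read from the
# configuration produced by the data generator.
query = instantiate(template, {"ProductType": "<http://example.org/ProductType1>"})
print(query)
```

Failing on an unknown placeholder, rather than leaving it in place, seems the safer choice for a benchmark, since a silently malformed query would distort the timings.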


SP2B is published under BSD.

        Andy

From a community and project perspective, it's quite good and helpful
to have a standard set of datasets. Although, I realize that if datasets
are not small, it might take a while to download them.

Can we use .gz datasets with JenaPerf?

We could also include small to medium size datasets together with JenaPerf
and have a separate checkout/download for larger ones.

What do you think?

Paolo
