Hi all, 

We are currently benchmarking several triple stores that support inference 
through forward chaining against a system that performs a particular form of 
query rewriting.

The benchmark we are using is simple: an extended version of LUBM with large 
datasets (LUBM 1000, 8000, 15000 and 250000). For Jena we would like to 
measure loading time, inference time and query answering time, using both TDB 
and SDB. Inference should be done with a limited amount of memory, the less 
the better. However, we are having difficulties understanding what the fair 
way to do this is. Also, the machine used for these benchmarks should be a 
simple one, not a cluster or a server with large resources. We would like to 
ask the community for help on how to approach this in the best way possible. 
Hence this email :). Here are some questions and ideas.

Is it the case that Jena's default inference engine requires all triples to be 
in memory? Is it not possible to run inference directly over a disk-backed 
(TDB/SDB) model? If so, what would be a fair way to benchmark the system? 
Right now we are thinking of a workflow as follows:

1. Start a TDB or SDB store.
2. Load 10 LUBM universities into memory at a time and compute the closure 
using

Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
// monto holds the LUBM ontology, m the current data batch
InfModel inf = ModelFactory.createInfModel(reasoner, monto, m);

storing the result in SDB or TDB.
3. When all batches are done, query the store directly.
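To make the workflow concrete, here is a minimal sketch of what we have in 
mind for one batch, assuming Jena (with TDB) on the classpath; the directory 
and file paths are placeholders, not our actual setup:

```java
import com.hp.hpl.jena.rdf.model.*;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.ReasonerRegistry;
import com.hp.hpl.jena.tdb.TDBFactory;
import com.hp.hpl.jena.util.FileManager;

public class MaterialiseBatch {
    public static void main(String[] args) {
        // Persistent TDB store that will hold the materialised closure
        Model store = TDBFactory.createModel("/tmp/lubm-tdb");

        // The LUBM ontology and one batch of generated university data
        // (paths are hypothetical)
        Model monto = FileManager.get().loadModel("file:univ-bench.owl");
        Model m = FileManager.get().loadModel("file:University0_0.owl");

        // Compute the closure in memory over this batch only
        Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
        InfModel inf = ModelFactory.createInfModel(reasoner, monto, m);

        // Adding the InfModel iterates all entailed statements,
        // materialising them into the TDB-backed model
        store.add(inf);

        inf.close();
        store.close();
    }
}
```

After all batches are processed this way, queries would go straight against 
the TDB store with no reasoner attached.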

Is this the most efficient way to do it? Are there important parameters 
(besides the number of universities used in the computation of the closure) 
that we should tune to guarantee a fair evaluation? Are there any documents 
that we could use to guide ourselves while tuning Jena?

Thank you very much in advance everybody,

Best regards,
Mariano



Mariano Rodriguez Muro                
http://www.inf.unibz.it/~rodriguez/   
KRDB Research Center                  
Faculty of Computer Science           
Free University of Bozen-Bolzano (FUB)
Piazza Domenicani 3,                  
I-39100 Bozen-Bolzano BZ, Italy       