On 07/01/2020 08:31, Luis Enrique Ramos García wrote:
Dear friends,

I am currently working on an application in which I have to implement a
reasoner, something I have some experience with. The difference is that this
time I have to implement it in a big data environment, where I have to deal
with a data set of some gigabytes.

About that, my questions are the following:

1. Is there a benchmark or performance evaluation of Jena with some
reasoners that considers memory, number of triples, and
execution time?

Depends what sort of inference you are talking about.

Apart from the OWL benchmarks you mention, some of the SPARQL benchmarks do require small amounts of reasoning, loosely around RDFS++. For example, I seem to remember LUBM requires this, but I've never worked with it.
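To make "small amounts of reasoning" concrete, here is a minimal sketch of the kind of RDFS-style inference involved: subclass transitivity and type inheritance, computed by forward chaining to a fixpoint. It is an illustration only, not how Jena or any benchmark implements it. The function name `rdfs_closure`, the string-tuple triple representation and the `ex:` names are all assumptions made up for the example.

```python
# Illustrative only: a tiny in-memory forward-chaining reasoner covering
# two RDFS rules (subClassOf transitivity and rdf:type inheritance).
# Triples are plain (subject, predicate, object) string tuples; this
# representation is an assumption for the sketch, not a Jena API.

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Return the input triples plus everything the two rules entail."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        subclass_pairs = [(s, o) for (s, p, o) in inferred if p == SUBCLASS]

        # Rule: (a subClassOf b), (b subClassOf d) => (a subClassOf d)
        for (a, b) in subclass_pairs:
            for (c, d) in subclass_pairs:
                if b == c:
                    new.add((a, SUBCLASS, d))

        # Rule: (x type a), (a subClassOf b) => (x type b)
        for (x, p, cls) in inferred:
            if p == RDF_TYPE:
                for (a, b) in subclass_pairs:
                    if a == cls:
                        new.add((x, RDF_TYPE, b))

        # Keep iterating until a pass adds nothing new (the fixpoint).
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


if __name__ == "__main__":
    data = {
        ("ex:GradStudent", SUBCLASS, "ex:Student"),
        ("ex:Student", SUBCLASS, "ex:Person"),
        ("ex:alice", RDF_TYPE, "ex:GradStudent"),
    }
    for triple in sorted(rdfs_closure(data) - data):
        print(triple)
```

Note that the whole closure is held in a single in-memory set, and the nested loops are deliberately naive. That is fine for a toy data set but is exactly the sort of approach that stops working as the triple count grows.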

Jena's inference is not designed to scale to billions of triples; it's a memory-only solution (though "gigabytes" might mean just millions of triples, which might fit in memory). So reasoning-at-scale benchmarks on Jena are not going to be much use to you. Look at the results for commercial stores that do claim inference at scale.
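As a rough way to check whether "gigabytes" means millions or billions of triples, you can divide the file size by an average serialized triple size. The sketch below does that. The 150 bytes per triple figure is purely an assumption for illustration (it depends heavily on the serialization and on how long your URIs and literals are), and `estimate_triples` is a hypothetical helper, not part of any library. Measure a sample of your own data before relying on the number.

```python
# Back-of-the-envelope estimate of triple count from data set size.
# ASSUMPTION: an average of 150 bytes per serialized triple. This is a
# placeholder, not a measured value; replace it with a figure sampled
# from your own data.

def estimate_triples(dataset_bytes, avg_bytes_per_triple=150):
    """Estimate how many triples a data set of the given size holds."""
    if avg_bytes_per_triple <= 0:
        raise ValueError("avg_bytes_per_triple must be positive")
    return dataset_bytes // avg_bytes_per_triple


if __name__ == "__main__":
    for gigabytes in (1, 5, 10):
        count = estimate_triples(gigabytes * 10**9)
        print(f"{gigabytes} GB -> roughly {count / 1e6:.0f} million triples")
```

Under that assumption a few gigabytes comes out in the tens of millions of triples rather than billions. Whether that fits in memory once loaded, with the inferred triples added on top, is a separate question that only a test with your data and rules will answer.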

2. Is Elephas, with a MapReduce approach, a good alternative for dealing with a
big data environment?

Depends what sort of inference you are talking about, and whether you care about latency or just overall throughput at scale. MapReduce is not good for low-latency interactive queries.

3. Is a triple store necessary to use with a reasoner and rule engine? In
that case, what do you recommend?

Don't understand the question. Triple stores and reasoners are different things. You can have reasoners that have nothing to do with RDF/triple stores, and you can have triple stores with no reasoner. There are a fair number of commercial and open source tools in both categories and in the overlap.

Dave
