I agree with Dave. We should start with the most important questions:

i) what is the use case
ii) what kind of inference is needed here

There is an obvious difference between a fully OWL 2 compliant DL
reasoner, which usually uses a tableau algorithm, and a reasoner based
on rules.

The most common benchmarks I have come across are LUBM and UOBM, which
are used to evaluate the performance of large-scale reasoners, usually
ones that are to some extent related to triple stores (or even
integrated into them).

I wouldn't go the map-reduce way. There are already approaches based
on Spark and Flink for some (sub)set of the OWL/RDFS inference rules.
Those tend to be faster due to benefits like in-memory processing,
especially when iterative algorithms such as fix-point computation come
into play.
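
To illustrate what I mean by an iterative fix-point computation, here is
a minimal single-machine sketch in Python. It is not how Jena, Spark or
Flink implement it; the function name and the example data are made up
for illustration, and it only covers two RDFS rules (rdfs11, subClassOf
transitivity, and rdfs9, type inheritance). The rules are re-applied
until a pass derives nothing new, which is why keeping the working set
in memory between passes helps.

```python
# Naive forward chaining to a fix-point over two RDFS rules.
# Triples are plain (subject, predicate, object) tuples of strings.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_closure(triples):
    """Return the input triples plus everything derivable via rdfs11 and rdfs9.

    Hypothetical helper for illustration only, not a Jena API.
    """
    inferred = set(triples)
    while True:
        new = set()
        sub = [(s, o) for (s, p, o) in inferred if p == SUBCLASS]
        for (a, b) in sub:
            # rdfs11: (a subClassOf b), (b subClassOf d) => (a subClassOf d)
            for (c, d) in sub:
                if b == c:
                    new.add((a, SUBCLASS, d))
            # rdfs9: (x type a), (a subClassOf b) => (x type b)
            for (s, p, o) in inferred:
                if p == TYPE and o == a:
                    new.add((s, TYPE, b))
        # Fix-point reached: this pass produced no triple we did not have.
        if new <= inferred:
            return inferred
        inferred |= new


if __name__ == "__main__":
    data = {
        ("ex:GradStudent", SUBCLASS, "ex:Student"),
        ("ex:Student", SUBCLASS, "ex:Person"),
        ("ex:alice", TYPE, "ex:GradStudent"),
    }
    for triple in sorted(rdfs_closure(data)):
        print(triple)
```

Each pass rescans all triples, so this is quadratic per pass and only
meant to show the loop structure; a distributed engine would express the
same iteration as repeated joins over partitioned data.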

Anyways, we should start with i) and ii) here.

On 07.01.20 09:59, Dave Reynolds wrote:
> On 07/01/2020 08:31, Luis Enrique Ramos García wrote:
>> Dear friends,
>>
>> I am currently working on an application in which I have to implement a
>> reasoner, something I have had some experience with. The difference is
>> that this time I have to implement it in a big data environment, where
>> I have to deal with a data set of some giga bytes.
>>
>> About that, my questions are the following:
>>
>> 1. Is there a benchmark or performance evaluation of Jena with some
>> reasoners that considers memory, number of triples, and
>> execution time?
>
> Depends what sort of inference you are talking about.
>
> Apart from the OWL benchmarks you mention, some of the SPARQL
> benchmarks do require small amounts of reasoning loosely around
> RDFS++. For example, I seem to remember LUBM requires this but I've
> never worked with it.
>
> Jena's inference is not designed to scale to billions of triples; it's
> a memory-only solution (though "giga bytes" might mean just millions of
> triples and might fit in memory). So reasoning-at-scale benchmarks on
> Jena are not going to be much use to you. Look at the results for
> commercial stores that do claim inference at scale.
>
>> 2. Is Elephas, with a map-reduce approach, a good alternative for
>> dealing with a big data environment?
>
> Depends what sort of inference you are talking about and whether you
> care about latency or just overall throughput at scale. Map reduce is
> not good for low latency interactive queries.
>
>> 3. Is a triple store necessary to use with a reasoner and rule engine?
>> In that case, what do you recommend?
>
> Don't understand the question. Triple stores and reasoners are
> different things. You can have reasoners that have nothing to do with
> RDF/triple-stores and you can have triple stores with no reasoner.
> There are a fair number of commercial and open source tools in both
> categories and in the overlap.
>
> Dave
