At a high level, it seems just as easy (if not a good bit easier) to do this in 
SPARQL (which can be understood as supporting some simple inferencing, if you 
like to think of it that way). Is it absolutely necessary to do this using 
inferencing? Are you trying to use that because your recent experience has been 
with OWL?
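For instance, the "friendly person" selection described below could be a plain aggregation query. This is only a sketch against an invented vocabulary (ex:Friendship, ex:hasParticipant, ex:valid, and a threshold of 3 standing in for "X" — none of these names come from your schema):

```sparql
PREFIX ex: <http://example.org/>

# People with more than 3 valid friendships (threshold is illustrative)
SELECT ?person (COUNT(?f) AS ?validFriendships)
WHERE {
  ?f a ex:Friendship ;
     ex:hasParticipant ?person ;
     ex:valid true .
}
GROUP BY ?person
HAVING (COUNT(?f) > 3)
```

If the validity flag itself still has to be derived, that derivation could likewise be done with a SPARQL UPDATE (an INSERT over the start/end dates) rather than a rule.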

ajs6f

> On Jan 8, 2020, at 5:14 AM, Luis Enrique Ramos García 
> <luisenriqueramos1...@googlemail.com.INVALID> wrote:
> 
> Dear friends,
> 
> Thanks so much for your quick answer,
> 
> At first, about our use case: I estimate that we will be working with around
> 100 million triples at the beginning, so according to Dave's answer this size
> should be manageable by Jena, or am I wrong? Of course we will surely grow
> quickly, and then I think we should turn our eyes to other stores, as you
> recommend. I think this benchmark could be a good starting point [1].
> 
> Second, about the reasoning, our task is as follows:
> 
> Let us say we have a knowledge base of people (p1, p2, ..., pn) and
> friendships (f1, f2, ..., fn), where p1, p2, ..., pn and f1, f2, ..., fn are
> individuals of the respective concepts (person and friendship). People are
> related by friendships; every friendship occurs between two different people
> and has a start date, an end date (if any), and a validity, which is a
> Boolean. In our reasoning we want to get friendly people, and for us a
> friendly person has more than "X" valid friendships.
> 
> For this, I think I have to follow this workflow:
> 
> 1. Run a rule to evaluate each friendship's validity, setting it to true or
> false.
> 
> 2. Perform inference on the result to get the valid friendships, if any.
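> For step 1, here is a sketch of what such a rule might look like in Jena's
> generic rule syntax. The names (ex:Friendship, ex:endDate, ex:valid) and the
> "no end date means still valid" criterion are made up for illustration, not
> taken from our actual schema:
> 
> ```
> @prefix ex: <http://example.org/>.
> 
> # A friendship with no recorded end date is marked valid (illustrative rule)
> [validIfOpen:
>   (?f rdf:type ex:Friendship)
>   noValue(?f ex:endDate)
>   ->
>   (?f ex:valid 'true'^^xsd:boolean)]
> ```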
> 
> 
> About my third question:
> 
>> 3. Is a triple store necessary to use with a reasoner and rule engine? In
>> that case, what do you recommend?
> 
> Most of my experience has been with Protégé and the OWL API, and I understood
> that they recommended a back-end repository for dealing with large datasets.
> Perhaps I misunderstood that, and I know that stores and reasoners are
> different things.
> 
> Well, I thank you in advance for all the recommendations you could give me.
> 
> best regards
> 
> 
> Luis Ramos
> 
> 
> [1] https://www.w3.org/wiki/LargeTripleStores
> 
> 
> On Tue, Jan 7, 2020 at 11:01, Lorenz Buehmann (<
> buehm...@informatik.uni-leipzig.de>) wrote:
> 
>> I agree with Dave, we should start with the most important things:
>> 
>> i) what is the use case
>> ii) what kind of inference is needed here
>> 
>> There is an obvious difference between a fully OWL 2 compliant DL
>> reasoner, usually using a tableau algorithm, and a reasoner based on rules.
>> 
>> The most common benchmarks I have touched are LUBM and UOBM, used to
>> evaluate the performance of large-scale reasoners, usually to some extent
>> related to triple stores (or even integrated with them).
>> 
>> I would not go the map-reduce way; there are already approaches based
>> on Spark and Flink for some (sub)sets of the OWL/RDFS inference rules. Those
>> tend to be faster due to benefits like in-memory processing, especially
>> when iterative algorithms like fixpoint computation come into play.
>> 
>> Anyways, we should start with i) and ii) here.
>> 
>> On 07.01.20 09:59, Dave Reynolds wrote:
>>> On 07/01/2020 08:31, Luis Enrique Ramos García wrote:
>>>> Dear friends,
>>>> 
>>>> I am currently working on an application in which I have to implement a
>>>> reasoner, something I have some experience with; the difference is that
>>>> this time I have to implement it in a big data environment, where I have
>>>> to deal with a data set of some gigabytes.
>>>> 
>>>> About that, my questions are the following:
>>>> 
>>>> 1. Is there a benchmark or evaluation of the performance of Jena with
>>>> some reasoners that considers memory, quantity of triples, and
>>>> execution time?
>>> 
>>> Depends what sort of inference you are talking about.
>>> 
>>> Apart from the OWL benchmarks you mention, some of the SPARQL
>>> benchmarks do require small amounts of reasoning, loosely around
>>> RDFS++. For example, I seem to remember LUBM requires this, but I've
>>> never worked with it.
>>> 
>>> Jena's inference is not designed to scale to billions of triples; it's
>>> a memory-only solution (though "gigabytes" might mean just millions of
>>> triples and might fit in memory). So reasoning-at-scale benchmarks on
>>> Jena are not going to be much use to you. Look at the results for
>>> commercial stores that do claim inference at scale.
>>> 
>>>> 2. Is Elephas, with a map-reduce approach, a good alternative for
>>>> dealing with a big data environment?
>>> 
>>> Depends what sort of inference you are talking about and whether you
>>> care about latency or just overall throughput at scale. Map reduce is
>>> not good for low latency interactive queries.
>>> 
>>>> 3. Is a triple store necessary to use with a reasoner and rule engine?
>>>> In that case, what do you recommend?
>>> 
>>> I don't understand the question. Triple stores and reasoners are
>>> different things. You can have reasoners that have nothing to do with
>>> RDF/triple stores, and you can have triple stores with no reasoner.
>>> There are a fair number of commercial and open-source tools in both
>>> categories and in the overlap.
>>> 
>>> Dave
>> 
>> 
