Re: reasoner performance

ajs6f Fri, 10 Jan 2020 13:49:38 -0800

Good luck! Please do let us know how you get on.

ajs6f

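For reference, the workflow discussed in the quoted messages below (first decide whether each friendship is valid, then find people with more than "X" valid friendships) can be sketched in a few lines of plain Python. This is only an illustration, not Jena code. The `Friendship` class, the `is_valid` rule (started on or before a reference date and not yet ended), the sample data, and the `ex:` vocabulary in the SPARQL string are all assumptions made up for the example; the thread does not say how validity is actually decided.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Set

# Hypothetical SPARQL 1.1 equivalent of the counting step (not executed here).
# The ex: vocabulary is invented for this sketch; the threshold "X" is 1.
FRIENDLY_PEOPLE_QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?person (COUNT(DISTINCT ?f) AS ?validFriendships)
WHERE {
  ?f a ex:Friendship ;
     ex:participant ?person ;
     ex:valid true .
}
GROUP BY ?person
HAVING (COUNT(DISTINCT ?f) > 1)
"""


@dataclass(frozen=True)
class Friendship:
    person_a: str
    person_b: str
    start: date
    end: Optional[date] = None  # None means the friendship has not ended


def is_valid(f: Friendship, on: date) -> bool:
    """Step 1 (assumed rule): two different people, started on or before
    `on`, and either no end date or an end date after `on`."""
    return (
        f.person_a != f.person_b
        and f.start <= on
        and (f.end is None or f.end > on)
    )


def friendly_people(friendships: Iterable[Friendship], on: date,
                    threshold: int) -> Set[str]:
    """Step 2: people with more than `threshold` valid friendships."""
    counts: Counter = Counter()
    for f in friendships:
        if is_valid(f, on):
            # A friendship counts once for each of its two participants.
            counts[f.person_a] += 1
            counts[f.person_b] += 1
    return {person for person, n in counts.items() if n > threshold}


# Made-up sample data.
SAMPLE = [
    Friendship("p1", "p2", date(2018, 1, 1)),
    Friendship("p1", "p3", date(2019, 1, 1)),
    Friendship("p1", "p4", date(2017, 1, 1), date(2019, 6, 1)),  # ended
    Friendship("p2", "p3", date(2019, 5, 1)),
]

if __name__ == "__main__":
    print(sorted(friendly_people(SAMPLE, date(2020, 1, 10), 1)))
    # ['p1', 'p2', 'p3']
```

In a SPARQL 1.1 store the counting step maps onto `GROUP BY` with a `HAVING` filter, as in the unexecuted query string above. The validity check could be a `FILTER` on the start and end dates instead of a stored Boolean.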

> On Jan 9, 2020, at 2:16 AM, Luis Enrique Ramos García 
> <luisenriqueramos1...@googlemail.com.INVALID> wrote:
> 
> Dear friends,
> 
> Given that most of my experience is with OWL, SWRL, and SQWRL, that was my
> first option. My challenge is to get the job done; the approach I follow
> is not the most important thing. So I am going to write the queries in
> SPARQL and execute them in my workflow. If I run into any constraints in
> the process, I will keep you informed.
> 
> best regards
> 
> 
> Luis Ramos
> 
> On Wed, Jan 8, 2020 at 22:01, ajs6f (<aj...@apache.org>) wrote:
> 
>> At a high level, it seems just as easy (if not a good bit easier) to do
>> this in SPARQL (which can be understood as supporting some simple
>> inferencing, if you like to think of it that way). Is it absolutely
>> necessary to do this using inferencing? Are you trying to use that because
>> your recent experience has been with OWL?
>> 
>> ajs6f
>> 
>>> On Jan 8, 2020, at 5:14 AM, Luis Enrique Ramos García
>>> <luisenriqueramos1...@googlemail.com.INVALID> wrote:
>>> 
>>> Dear friends,
>>> 
>>> Thanks so much for your quick answer,
>>> 
>>> First, about our use case: I estimate that we will be working with
>>> around 100 million triples at the beginning. According to Dave's answer,
>>> this size should be manageable by Jena, or am I wrong? Of course, we will
>>> surely grow quickly, and then I think we should look at other stores, as
>>> you recommend. I think that this benchmark could be a good starting
>>> point [1].
>>> 
>>> Second, about the reasoning, our task is as follows:
>>> 
>>> Let us say we have a knowledge base of people (p1, p2, pn) and
>>> friendships (f1, f2, fn), where p1, p2, pn and f1, f2, fn are individuals
>>> of the respective concepts (people and friendship). People are related by
>>> friendships. Every friendship occurs between two different people and has
>>> a start date, an end date (if any), and a validity, which is a Boolean.
>>> In our reasoning we want to find friendly people, and for us a friendly
>>> person is one with more than "X" valid friendships.
>>> 
>>> For this, I think I have to follow this workflow:
>>> 
>>> 1. Run a rule to evaluate the validity of each friendship, setting it to
>>> true or false.
>>> 
>>> 2. Perform inference on the result to get the valid friendships, if any.
>>> 
>>> 
>>> About my third question:
>>> 
>>>> 3. Is a triple store necessary to use with a reasoner and rule engine?
>>>> In that case, what do you recommend?
>>> 
>>> Most of my experience has been with Protégé and the OWL API, and I
>>> understood that they recommended a back-end repository for dealing with
>>> large datasets. Perhaps I misunderstood, and I know that stores and
>>> reasoners are different things.
>>> 
>>> Thank you in advance for any recommendations you can give me.
>>> 
>>> best regards
>>> 
>>> 
>>> Luis Ramos
>>> 
>>> [1] https://www.w3.org/wiki/LargeTripleStores
>>> 
>>> 
>>> On Tue, Jan 7, 2020 at 11:01, Lorenz Buehmann
>>> (<buehm...@informatik.uni-leipzig.de>) wrote:
>>> 
>>>> I agree with Dave, we should start with the most important things:
>>>> 
>>>> i) what is the use case
>>>> ii) what kind of inference is needed here
>>>> 
>>>> There is an obvious difference between a fully OWL 2 compliant DL
>>>> reasoner, usually using a tableau algorithm, and a rule-based reasoner.
>>>> 
>>>> The most common benchmarks I have worked with are LUBM and UOBM, used to
>>>> evaluate the performance of large-scale reasoners, usually to some
>>>> extent related to triple stores (or even integrated into them).
>>>> 
>>>> I wouldn't go the map-reduce way; there are already approaches based on
>>>> Spark and Flink for some (sub)sets of the OWL/RDFS inference rules. Those
>>>> tend to be faster thanks to benefits like in-memory processing,
>>>> especially when iterative algorithms such as fixpoint computation come
>>>> into play.
>>>> 
>>>> Anyways, we should start with i) and ii) here.
>>>> 
>>>> On 07.01.20 09:59, Dave Reynolds wrote:
>>>>> On 07/01/2020 08:31, Luis Enrique Ramos García wrote:
>>>>>> Dear friends,
>>>>>> 
>>>>>> I am currently working on an application in which I have to implement
>>>>>> a reasoner, something I have some experience with. The difference is
>>>>>> that this time I have to implement it in a big data environment, where
>>>>>> I have to deal with a data set of some gigabytes.
>>>>>> 
>>>>>> About that, my questions are the following:
>>>>>> 
>>>>>> 1. Is there a benchmark or performance evaluation of Jena with some
>>>>>> reasoners that considers memory or number of triples, and
>>>>>> execution time?
>>>>> 
>>>>> Depends what sort of inference you are talking about.
>>>>> 
>>>>> Apart from the OWL benchmarks you mention, some of the SPARQL
>>>>> benchmarks do require small amounts of reasoning loosely around
>>>>> RDFS++. For example, I seem to remember LUBM requires this but I've
>>>>> never worked with it.
>>>>> 
>>>>> Jena's inference is not designed to scale to billions of triples; it's
>>>>> a memory-only solution (though "gigabytes" might mean just millions of
>>>>> triples and might fit in memory). So reasoning-at-scale benchmarks on
>>>>> Jena are not going to be much use to you. Look at the results for
>>>>> commercial stores that do claim inference at scale.
>>>>> 
>>>>>> 2. Are Elephas and a map-reduce approach a good alternative for
>>>>>> dealing with a big data environment?
>>>>> 
>>>>> Depends what sort of inference you are talking about and whether you
>>>>> care about latency or just overall throughput at scale. Map reduce is
>>>>> not good for low latency interactive queries.
>>>>> 
>>>>>> 3. Is a triple store necessary to use with a reasoner and rule engine?
>>>>>> In that case, what do you recommend?
>>>>> 
>>>>> Don't understand the question. Triple stores and reasoners are
>>>>> different things. You can have reasoners that have nothing to do with
>>>>> RDF/triple-stores and you can have triple stores with no reasoner.
>>>>> There are a fair number of commercial and open source tools in both
>>>>> categories and in the overlap.
>>>>> 
>>>>> Dave
>>>> 
>>>> 
>> 
>>
