My intention is not to judge the complexity of counting triples, but to
understand (in this case) how a publisher can estimate the size of the
database when different rules (part of the ontology) are applied. So yes,
it's a real use case for the EU Publications Office (PO): they need to know
how complex it can be to use inference on their dataset.
And yes, I am evaluating MarkLogic and 6 other triple stores in order to
make a recommendation to the PO based on many requirements, one of which is
the capacity to do inference.
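
To make the estimate concrete, here is a minimal sketch in Python of what I mean by "size with rules applied". It is not MarkLogic code and it does not reproduce MarkLogic's rulesets: it assumes only the two standard RDFS subclass rules (type propagation and subClassOf transitivity), and it materializes the inferred triples for a small in-memory sample, which is forward chaining rather than the query-time inference MarkLogic performs. The function names and the sample data are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def superclass_closure(triples):
    """Map each class to the set of all its transitive superclasses."""
    direct = defaultdict(set)
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            direct[s].add(o)

    closure = {}
    for cls in list(direct):
        seen, stack = set(), list(direct[cls])
        while stack:
            parent = stack.pop()
            if parent not in seen:  # the seen set also guards against cycles
                seen.add(parent)
                stack.extend(direct.get(parent, ()))
        closure[cls] = seen
    return closure


def count_with_subclass_inference(triples):
    """Return (asserted_count, inferred_count) under the two subclass rules."""
    asserted = set(triples)
    closure = superclass_closure(asserted)
    inferred = set()

    # Type propagation: an instance of a class is an instance of its superclasses.
    for s, p, o in asserted:
        if p == RDF_TYPE:
            for parent in closure.get(o, ()):
                inferred.add((s, RDF_TYPE, parent))

    # Transitivity: a class is a subclass of every transitive superclass.
    for cls, parents in closure.items():
        for parent in parents:
            inferred.add((cls, SUBCLASS_OF, parent))

    inferred -= asserted  # count only triples that were not already asserted
    return len(asserted), len(inferred)


# Hypothetical sample data, not taken from the PO dataset.
sample = [
    ("ex:Book", SUBCLASS_OF, "ex:Publication"),
    ("ex:Publication", SUBCLASS_OF, "ex:Work"),
    ("ex:b1", RDF_TYPE, "ex:Book"),
    ("ex:b2", RDF_TYPE, "ex:Book"),
    ("ex:p1", RDF_TYPE, "ex:Publication"),
]

print(count_with_subclass_inference(sample))  # (5, 6)
```

On this sample the 5 asserted triples yield 6 additional inferred ones. Running the same count on a representative extract of the real data would give a rough growth ratio per rule, with the caveat that the ratio depends heavily on the depth of the class hierarchy and may not extrapolate to the full dataset.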

On Wed, 19 Jul 2017 at 12:04, John Snelson <[email protected]> wrote:

> In my experience people most often want to know how many triples they have
> as some kind of measure of complexity of the system. It isn't a very good
> measure of this complexity, and in MarkLogic it's complex to calculate.
>
>
> John
>
>
> On 19/07/17 10:58, Ghislain Atemezing-Pro wrote:
>
> John,
> Thanks for your answer.
> However, I don't understand this remark: "Given this, I would suggest
> that this probably isn't a valuable question to answer." I know my
> questions are not always "valuable"; I suppose there will be a filter for
> such "non-valuable questions" here to stop me from sending them.
>
> But yes, thanks for your time and sincerity.
>
> Ghislain
>
> On Wed, 19 Jul 2017 at 11:47, John Snelson <[email protected]>
> wrote:
>
>> On 13/07/17 10:08, Ghislain Atemezing-Pro wrote:
>> > Hi list,
>> > I am trying to combine the function cts:triple-value-statistics with
>> > the predefined rulesets of MarkLogic.
>> > I have some 727M triples, with an ontology containing many subclasses
>> > and subproperties.
>> > I've added 4 predefined rulesets (rdfs-plus-full, subClassOf, sameAs,
>> > inverseOf) to see whether I can get some inferred data.
>>
>> Don't use rdfs-plus-full or any of the "*-full" rulesets.
>>
>> > When I use cts:triple-value-statistics, I can't get the total number
>> > of triples including inferred data. Am I doing something wrong?
>> > Is it possible to get that information somehow?
>>
>> In MarkLogic, inference happens at query time (backward chaining), so we
>> don't have database statistics about the total number of inferred triples.
>>
>> You can find this information using the count() aggregate in a SPARQL
>> query. It will probably take a really long time, and may fail to
>> complete if it runs out of scratch space. Given this, I would suggest
>> that this probably isn't a valuable question to answer.
>>
>> John
>>
>> --
>> John Snelson, Principal Engineer              http://twitter.com/jpcs
>> MarkLogic Corporation                         http://www.marklogic.com
>> _______________________________________________
>> General mailing list
>> [email protected]
>> Manage your subscription at:
>> http://developer.marklogic.com/mailman/listinfo/general
>>
> --
> --------------------------------------------
> Ghislain A. Atemezing, Ph.D
> R&D Engineer SemWeb
> @ Mondeca, Paris, France
> Labs: http://labs.mondeca.com
> Tel: +33 (0)1 4111 3034
> Web: www.mondeca.com
> Twitter: @gatemezing
> About Me: http://atemezing.org
>
-- 
--------------------------------------------
Ghislain A. Atemezing, Ph.D
R&D Engineer SemWeb
@ Mondeca, Paris, France
Labs: http://labs.mondeca.com
Tel: +33 (0)1 4111 3034
Web: www.mondeca.com
Twitter: @gatemezing
About Me: http://atemezing.org