Re: [Discuss] Apache Portable Uniform RDF Runtime (PURR)

Reto Bachmann-Gmür Fri, 23 Nov 2012 02:50:24 -0800

Andy,

The idea in Clerezza is to have a minimal core API that is close to what the
SPI is in Jena. There we stick to the spec on issues like literals as
subjects, but I think there are good reasons for generalizing. This API is
designed to be very easy to implement. Applications would typically use
rdf.utils or rdf.utils.scala, which offer a richer (and resource-oriented)
API. So I think the goals of PURR and rdf.core are very similar.


The security stuff is just standard Java permissions. Just as the java.io
package provides FilePermission, the "java.rdf" package should provide
GraphPermission. This has no other consequences for the interfaces, and it's
up to the implementations whether they check permissions or not.
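
To make the FilePermission analogy concrete, here is a minimal sketch of
what such a permission class could look like. The class name follows the
paragraph above, but the action names ("read", "readwrite"), the "*"
wildcard and the example graph name are assumptions made for illustration;
this is not the actual Clerezza class.

```java
import java.security.Permission;

// Hypothetical sketch: a permission guarding access to a named graph,
// modelled on java.io.FilePermission. Actions are an assumption.
final class GraphPermission extends Permission {
    private final boolean write;

    GraphPermission(String graphName, String actions) {
        super(graphName);
        if ("read".equals(actions)) {
            write = false;
        } else if ("readwrite".equals(actions)) {
            write = true;
        } else {
            throw new IllegalArgumentException("unknown actions: " + actions);
        }
    }

    // A permission implies another if it covers the same graph (or is the
    // "*" wildcard) and grants at least the same actions.
    @Override
    public boolean implies(Permission p) {
        if (!(p instanceof GraphPermission)) {
            return false;
        }
        GraphPermission other = (GraphPermission) p;
        boolean nameMatches =
                getName().equals("*") || getName().equals(other.getName());
        return nameMatches && (write || !other.write);
    }

    @Override
    public String getActions() {
        return write ? "readwrite" : "read";
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof GraphPermission)) {
            return false;
        }
        GraphPermission other = (GraphPermission) o;
        return getName().equals(other.getName()) && write == other.write;
    }

    @Override
    public int hashCode() {
        return 31 * getName().hashCode() + (write ? 1 : 0);
    }
}
```

An implementation that wants to enforce access would check such a
permission before touching the graph; one that does not care simply never
checks it, so the graph interfaces themselves stay unchanged.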

In Clerezza we put a strong emphasis on identity criteria. This is also a
reason why no factories are part of the API. Two nodes are identical and
can be used interchangeably iff they are equal according to the relevant
specs, so it shall not matter whether you got your instance from a factory
or implemented the interface yourself: the two instances behave the same in
all contexts. Identity was also the reason for having the distinction
between immutable and mutable graphs. The RDF specifications define when two
graphs are equal (they are if they are isomorphic), but this criterion can
only be mapped to Object.equals if the graphs aren't mutable (as you
otherwise run into big problems).
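
A minimal sketch of the identity point, under stated assumptions: the
interface name Iri, its getUnicodeString() accessor and both implementing
classes are hypothetical and only illustrate the idea. Equality and hash
code are defined by the interface contract (the IRI string), so two
unrelated implementations are interchangeable, including in hash-based
collections.

```java
// Hypothetical node interface. Contract: two Iri instances are equal iff
// their unicode strings are equal, and hashCode() is that string's hash.
interface Iri {
    String getUnicodeString();
}

// One implementation: stores the full IRI string.
final class SimpleIri implements Iri {
    private final String value;

    SimpleIri(String value) {
        this.value = value;
    }

    @Override
    public String getUnicodeString() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Iri
                && value.equals(((Iri) o).getUnicodeString());
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

// An unrelated implementation: stores namespace and local name separately,
// but honours the same equality contract.
final class SplitIri implements Iri {
    private final String namespace;
    private final String localName;

    SplitIri(String namespace, String localName) {
        this.namespace = namespace;
        this.localName = localName;
    }

    @Override
    public String getUnicodeString() {
        return namespace + localName;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Iri
                && getUnicodeString().equals(((Iri) o).getUnicodeString());
    }

    @Override
    public int hashCode() {
        return getUnicodeString().hashCode();
    }
}
```

Because neither class checks the concrete type of the other, equals is
symmetric across implementations, and no factory is needed to guarantee
that nodes from different sources compare correctly.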

Reto

On Thu, Nov 22, 2012 at 9:55 PM, Andy Seaborne <[email protected]> wrote:

> Reto,
>
> There's lots to be learnt from Clerezza.  Clerezza is a "presentation API"
> - it is aimed at giving applications a programming model.  That it also
> claims to encapsulate other systems is, to the application code, secondary
> - users adopt the Clerezza API in their applications and all its decisions,
> e.g. two forms of graph, mutable and immutable, as part of the type system.
> Clerezza is stateful, has its own access permission management model,
> provides OSGi, and tries to map both ways - it uses Jena as an
> implementation of Clerezza and can also expose Clerezza as a Jena facade.
>
> I think multiple presentation APIs is healthy.
>
> PURR is lower level. PURR is trying to be simple.  Too complicated (=
> large) and it will not make progress, as too many decisions need to be made.
> Instead, focus on the one task of being able to have a narrow interface for
> systems like parsers and be a target for presentation APIs.  PURR can be as
> simple as a name mapping layer, if implemented natively; if done
> non-natively, it is a single-copy, no-state layer.
>
> I'd expect that (in theory) Clerezza could be written over PURR and not
> need to manage the multiple backends itself.  PURR does not provide the
> variation of graphs that Clerezza does and, as I hope is clear from the
> sketch, instead of deciding whether, say, literals-as-subjects are in
> or out, it takes a neutral/general approach.
>
> (I say "in theory" because (1) there would have to be real value to
> switching and it's not clear to me there is and (2) PURR is small so there
> might be other things not covered that Clerezza would want to expose.)
>
>         Andy
>
>
> On 22/11/12 15:20, Reto Bachmann-Gmür wrote:
>
>> Very glad you've started such a unification discussion on the Jena
>> mailing list. I think it would be good to have such an API adopted by Jena
>> as well. I think it would be important for such an API to be based on
>> standards and not specifically on triple-store design, so as to allow
>> exposing other object structures as well as RDF through this API.
>>
>> A couple of thoughts:
>>
>> - To keep the API simple I think it should either be quads or datasets.
>> I'd go for Datasets, as these are part of the standards (SPARQL), and see
>> quads as a way this can be implemented.
>> - Given a Dataset, why not allow SPARQL queries against it? (With an
>> abstract implementation that locates a query engine and, if no such engine
>> is found, throws a NoQueryEngineFoundException.)
>>
>> Apart from the above differences, is there any part of the Clerezza RDF
>> API [1] you would implement fundamentally differently (I agree the naming
>> should be revisited)?
>>
>> Cheers,
>> Reto
>>
>> [1] http://incubator.apache.org/clerezza/mvn-site/org.apache.clerezza.rdf.core/apidocs/index.html
>>
>>
>> On Thu, Nov 22, 2012 at 4:04 PM, Andy Seaborne <[email protected]> wrote:
>>
>>> Rob's comments on inverting the reader process [2] suggest to me pulling
>>> out an API and I wonder if we can identify a portability layer that
>>> enables some (not all) interoperability and mix-n-match.
>>>
>>> The term "API" is creating some confusion in the discussions triggered by
>>> the Clerezza incubator project being noted [1][3] as "low activity".
>>>
>>> To some, it's what the application sees -- a presentation API.  To others
>>> it is some kind of abstraction between machinery like storage, inference,
>>> parsing and writing.  They don't have to be the same.
>>>
>>> Even if the only outcome is parser and stream processing modularity, I
>>> think it is worth doing. Just being able to add an external "parser" to
>>> Jena in a cleaner way than is currently possible is useful.
>>>
>>> ** Apache Portable Uniform RDF Runtime (PURR) **
>>>
>>> (OK - the "U" is a bit forced :-)
>>>
>>> To me, what we need is an abstraction that allows multiple
>>> implementations by swapping the jars (or OSGi bundles).  So PURR is a set
>>> of interfaces.  No state.  c.f. SLF4J.
>>>
>>> There would be many presentation APIs: Model-like, RDF-ORM, Ontology, and
>>> also for natural use in other JVM-based languages - Scala, Clojure,
>>> whatever is the next JVM language du jour.
>>>
>>> It's not a full application library. It's rather low level.  Writing much
>>> code directly at the interface may not be pretty.
>>>
>>> This is not the Jena Graph SPI, although that was trying to perform that
>>> purpose but has wider coverage of functionality.  I think we can go more
>>> minimal yet.
>>>
>>> The Jena Graph SPI has a number of handlers - events, stats, transactions
>>> - which seem to make the problem too large.  These would be part of
>>> another subsystem ("extends PURR") and be different in different
>>> providers.  One of those would be Jena Graph.
>>>
>>> PURR would provide the basic concepts from RDF:
>>>
>>> Terms: IRIs, Literals, bNodes
>>> Triples and Quads
>>> Graph, Dataset
>>> Factories for each.
>>>
>>> and for each be quite vanilla.
>>>
>>> e.g. a literal is a lexical form, a datatype and an optional language
>>> tag.  Immutable with getters.  Structural equality.  No value, XSD or
>>> otherwise.
>>>
>>> (I'll do a quick sketch in another message - but don't read it as fixed,
>>> just a concrete discussion point)
>>>
>>> Parsers:
>>>
>>> Parsers, in the general sense of anything that produces RDF from
>>> whatever input, be it an RDF syntax or a mapping from another data format
>>> (a conversion process), need an input stream and a factory, and emit
>>> Triples or Quads composed of terms.  They don't need a full "graph" -
>>> they need a destination to send Triples/Quads to (or be pull parsers).
>>>
>>> Writers:
>>>
>>> Writing is not the reverse of parsing - parsers produce a stream, writers
>>> for Turtle etc need to poke around the graph to decide what will "look
>>> nice".  Even N-triples written clustered by subject can be useful.
>>>
>>> Negatives:
>>>
>>> 1/ It's wildly ambitious and impractical to even consider portability and
>>> abstraction.  Too much time has passed.  Waste of effort.
>>>
>>> 2/ The portability layer is so narrow that it is not helpful.
>>>
>>> 3/ No SPARQL.
>>> (counter: (1) SPARQL is a remote protocol - this is same-JVM).
>>> (counter: (2) develop a SPARQL API using PURR basic terms)
>>>
>>>
>>> Opinions?
>>>
>>>          Andy
>>>
>>>
>>> [1] http://wiki.apache.org/incubator/November2012
>>>
>>> [2] http://s.apache.org/KCv
>>> -->
>>> http://mail-archives.apache.org/mod_mbox/jena-dev/201210.mbox/%3CC0B6979A3CA668458B697E4EA907CA940A01AF5C%40CFWEX01.americas.cray.com%3E
>>>
>>> [3] http://s.apache.org/lK
>>> -->
>>> http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/201211.mbox/%3CCAEWfVJ%3DcKATgo32u-AZDQKq%2BmsaVM_CWRnLo_OLdTYP1jFVzAw%40mail.gmail.com%3E
>>>
>>>
>>
>
