On Mon, Nov 12, 2012 at 10:40 PM, Andy Seaborne <[email protected]> wrote:
> On 12/11/12 19:42, Reto Bachmann-Gmür wrote:
>> On Mon, Nov 12, 2012 at 5:46 PM, Andy Seaborne <[email protected]> wrote:
>>> On 09/11/12 09:56, Rupert Westenthaler wrote:
>>>> RDF libs:
>>>> ====
>>>>
>>>> From the viewpoint of Apache Stanbol one needs to ask the question
>>>> whether it makes sense to maintain our own RDF API. I expect the
>>>> Semantic Web standards to evolve quite a bit in the coming years, and I
>>>> have concerns about whether the Clerezza RDF modules will be
>>>> updated/extended to provide implementations of those. One example of
>>>> such a situation is SPARQL 1.1, which has been around for quite some
>>>> time and is still not supported by Clerezza. While I do like the small
>>>> API, the flexibility to use different triple stores, and that Clerezza
>>>> comes with OSGi support, I think given the current situation we would
>>>> need to discuss all options, and those also include a switch to Apache
>>>> Jena or Sesame. Especially Sesame would be an attractive option, as its
>>>> RDF Graph API [1] is very similar to what Clerezza uses. Apache Jena's
>>>> counterparts (Model [2] and Graph [3]) are considerably different and
>>>> more complex interfaces. In addition, Jena will only change to
>>>> org.apache packages with the next major release, so a switch before
>>>> that release would mean two incompatible API changes.
>>>
>>> Jena isn't changing the packaging as such -- what we've discussed is
>>> providing a package for the current API and then a new, org.apache API.
>>> The new API may be much the same as the existing one or it may be
>>> different - that depends on contributions made!
>>
>> I didn't know about Jena planning to introduce such a common API.
>>
>>> I'd like to hear more about your experiences esp. with the Graph API, as
>>> that is supposed to be quite simple - it's targeted at storage
>>> extensions as well as supporting the richer Model API.
>>> Personally, aside from the fact that Clerezza enforces slot constraints
>>> (no literals as subjects), the Jena Graph API and Clerezza RDF core API
>>> seem reasonably aligned.
>>
>> Yes, the slot constraints come from the RDF abstract syntax. In my
>> opinion it's something one could decide to relax: by adding appropriate
>> owl:sameAs bnodes, any graph could be transformed into an
>> rdf-abstract-syntax-compliant one. So maybe have a
>> GenericTripleCollection that can be converted to an RDFTripleCollection -
>> not sure. Just sticking to the spec and waiting till this is allowed by
>> the abstract syntax might be the easiest.
>
> At the core, unconstrained slots have worked best for us.

The question is whether this shall be part of a common API. For machinery
doing inference and dealing with the meaning of RDF graphs, resources
should also be associated with a set of IRIs (that serialize into
owl:sameAs).

> Then either:
>
> 1/ have a test like:
>    Triple.isValidRDF
>
> 2/ Layer an app API to impose the constraints (but it's easy to run out
> of good names).

The Clerezza API would be such a layer.

> The Graph/Node/Triple level in Jena is an API, but its primary role is
> the other side, to storage and inference, not apps.
>
> Generality gives
> A/ Future proofing (not perfect)
> B/ Arises in inference and query naturally.
> C/ Using RDF structures for processing RDF
>
> Nodes in triples can be variables, and I would have found it useful to
> have marker nodes to be able to build structures e.g. "known to be bound
> at this point in a query". As it was, I ended up creating parallel
> structures.
>
>> Where I see advantages of the Clerezza API:
>> - Based on the collections framework, so standard tools can be used for
>>   graphs
>
> Given a core system API, a Scala and a Clojure and even different Java
> APIs for different styles are all possible.

Right.
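To make option 1/ concrete, here is a toy sketch in Python (not Clerezza or Jena code; the term classes and the `is_valid_rdf` helper are hypothetical) of a validity test that distinguishes generalised triples from triples conforming to the RDF abstract syntax slot constraints:

```python
# Toy model of RDF terms and triples (hypothetical names, for illustration
# only): the RDF abstract syntax forbids literal subjects and restricts
# predicates to IRIs, while generalised triples (as used in rule engines
# and SPARQL machinery) allow any term in any slot.
from collections import namedtuple

IRI = namedtuple("IRI", "value")
BNode = namedtuple("BNode", "label")
Literal = namedtuple("Literal", "lexical")
Triple = namedtuple("Triple", "s p o")

def is_valid_rdf(t):
    """True iff the triple conforms to the RDF abstract syntax slots:
    subject is IRI or bnode, predicate is an IRI, object is any term."""
    return (isinstance(t.s, (IRI, BNode))
            and isinstance(t.p, IRI)
            and isinstance(t.o, (IRI, BNode, Literal)))

# A generalised triple with a literal subject fails the test:
g = Triple(Literal("42"), IRI("http://example.org/p"), IRI("http://example.org/o"))
assert not is_valid_rdf(g)

# A conforming triple passes:
ok = Triple(BNode("b1"), IRI("http://example.org/p"), Literal("42"))
assert is_valid_rdf(ok)
```

Option 2/ would instead hide the check behind a constrained application-level type, which is essentially what the Clerezza layer does.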
That's why I propose having a minimum API and decorators to provide the
Scala interfacing or the resource API for Java (which corresponds more or
less to the W3C RDF API draft).

> A universal API across systems is about plugging in machinery (parsers,
> query engines, storage, inference). It's good to separate that from
> application APIs, otherwise there is a design tension.

I'm wondering whether there need to be special hooks for inference or if
this cannot just as well be done by simply wrapping the graphs.

>> - Immutable graphs follow the identity criterion of RDF semantics; this
>>   allows graph components to be added to sets and diff and patch
>>   algorithms to be implemented more straightforwardly.
>> - BNodes have no ids: apart from promoting the usage of URIs where this
>>   is appropriate, it allows behind-the-scenes leanification and saves
>>   memory where the backend doesn't have such ids.
>
> We have argued about this before.
>
> + As you have objects, there is a concept of identity (you can tell two
> bNodes apart).

No, two bnodes might be indistinguishable, as in

  _:a :knows _:b .
  _:b :knows _:a .

You cannot tell them apart even though neither of them can be leanified
away.

> + For persistence, an internal id is necessary to reconstruct
> consistently with caches.

Here we are talking about implementation details that imho should be kept
separate from the API discussion. Do you accept my toy-usecase challenge
[1]? If we leave the classical dedicated triple store usecase scenario, the
id quickly becomes something that makes things harder rather than easier.

> + Leaning isn't a core feature of RDF. In fact, IIRC, mention is going to
> be removed. It's information reduction, not data reduction.

It simply arises from bnodes being existential variables.
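The indistinguishability point can be shown mechanically. A toy sketch (hypothetical code, not from any of the libraries under discussion; the foaf:knows IRI is just a convenient predicate) that relabels the two bnodes in the symmetric graph and finds the result unchanged:

```python
# In the graph _:a :knows _:b . _:b :knows _:a the two bnodes are
# indistinguishable: swapping their labels yields the very same set of
# triples, yet neither triple subsumes the other, so nothing can be
# leaned away.
KNOWS = "http://xmlns.com/foaf/0.1/knows"

graph = frozenset({("_:a", KNOWS, "_:b"),
                   ("_:b", KNOWS, "_:a")})

def relabel(g, mapping):
    """Apply a bnode relabelling to the subject and object of each triple."""
    return frozenset((mapping.get(s, s), p, mapping.get(o, o))
                     for s, p, o in g)

swapped = relabel(graph, {"_:a": "_:b", "_:b": "_:a"})
assert swapped == graph   # the swap is undetectable
assert len(graph) == 2    # and the graph is already lean: both triples remain
```

So object identity in an API gives bnodes a distinguishability that the RDF semantics itself does not grant them.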
If they are redefined to be something else, then I have difficulties seeing
what advantages they would still offer over named nodes (maybe in some
skolem: uri scheme).

> + There will be a skolemization Note from the RDF-WG to deal with the
> practical matters of dealing with bNodes.
>
> RDF as data model for linked data.
>
> It's a data structure with good properties for combining. And it has
> links.
>
>>> (for generalised systems such as rules engines - and for SPARQL -
>>> triples can arise with extras like literals as subjects; they get
>>> removed later)
>>
>> If this shall be an API for interoperability based on the RDF standard,
>> I wonder if it shall be possible to expose such intermediate constructs.
>
> My suggestion is that the API for interoperability is designed to support
> RDF standards.
>
> The key elements are IRIs, literals, Triples, Quads, Graphs, Datasets.

Datasets are an element of the relevant SPARQL spec; I don't see Quads.

> But also storage, SPARQL (Query and Update), and web access (e.g.
> conneg).

Clerezza is very strong on conneg, but I don't think this would be part of
the RDF core API, but rather of the parts that could be part of Stanbol and
provide a Linked Data Platform Container (LDPC).

Reto

1. http://mail-archives.apache.org/mod_mbox/stanbol-dev/201211.mbox/%3CCALvhUEUfOd-mLBh-%3DXkwbLAJHBcboE963hDxv6g0jHNPj6cxPQ%40mail.gmail.com%3E
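P.S. As an aside on the skolemization Note mentioned above, here is a toy sketch (assuming the ".well-known/genid" IRI pattern from the RDF-WG skolemization work; the `skolemize` helper and the example.org authority are hypothetical) of replacing bnodes with fresh, globally unique IRIs so a backend needs no bnode ids at all:

```python
# Replace every bnode (modelled here as a string starting with "_:")
# with a fresh skolem IRI, consistently across the whole graph, so the
# result contains only named nodes.
import uuid

def skolemize(graph, authority="http://example.org"):
    fresh = {}  # bnode label -> minted skolem IRI, one per bnode

    def skolem(term):
        if isinstance(term, str) and term.startswith("_:"):
            if term not in fresh:
                fresh[term] = (authority + "/.well-known/genid/"
                               + uuid.uuid4().hex)
            return fresh[term]
        return term

    return {(skolem(s), p, skolem(o)) for s, p, o in graph}

g = {("_:x", "http://xmlns.com/foaf/0.1/knows", "_:y"),
     ("_:y", "http://xmlns.com/foaf/0.1/knows", "_:x")}
sk = skolemize(g)
assert all(not t[0].startswith("_:") and not t[2].startswith("_:")
           for t in sk)
```

The trade-off is exactly the one debated here: skolem IRIs make nodes persistently identifiable, at the cost of giving up the existential reading (and the leaning opportunities) that bnodes provide.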
