On 22 Nov 2012 15:04, "Andy Seaborne" <a...@apache.org> wrote:
>
> Rob's comments on inverting the reader process [2] suggest to me pulling out an API, and I wonder if we can identify a portability layer that enables some (not all) interoperability and mix-n-match.
>
> The term "API" is creating some confusion in the discussions triggered by the Clerezza incubator project being noted [1][3] as "low activity".
>
> To some, it's what the application sees -- a presentation API. To others it is some kind of abstraction between machinery like storage, inference, parsing and writing. They don't have to be the same.
>
> Even if the only outcome is parser and stream processing modularity, I think it is worth doing. Just being able to add an external "parser" to Jena in a cleaner way than is currently possible is useful.
>
> ** Apache Portable Uniform RDF Runtime (PURR) **
>
> (OK - the "U" is a bit forced :-)
>
> To me, what we need is an abstraction that allows multiple implementations by swapping the jars (or OSGi bundles). So PURR is a set of interfaces. No state. c.f. SLF4J.
>
> There would be many presentation APIs: Model-like, RDF-ORM, Ontology, and also for natural use in other JVM-based languages - Scala, Clojure, whatever is the next JVM language du jour.
>
> It's not a full application library. It's rather low level. Writing much code directly at the interface may not be pretty.
>
> This is not the Jena graph SPI, although that was trying to perform that purpose but has wider coverage of functionality. I think we can go more minimal yet.
>
> The Jena Graph SPI has a number of handlers - events, stats, transactions - which seem to make the problem too large. These would be part of another subsystem ("extends PURR") and be different in different providers. One of those would be Jena Graph.
>
> PURR would provide the basic concepts from RDF:
>
> Terms: IRIs, Literals, bNodes
> Triples and Quads
> Graph, Dataset
> Factories for each.
>
> and for each be quite vanilla.
>
> e.g. a literal is a lexical form, a datatype and an optional language tag. Immutable with getters. Structural equality. No value, XSD or otherwise.
>
> (I'll do a quick sketch in another message - but don't read it as fixed, just a concrete discussion point)
>
> Parsers:
>
> Parsers, in the general sense of anything that produces RDF from whatever input, be it an RDF syntax or a mapping from another data format (a conversion process), need an input stream and a factory, and emit Triples and Quads composed of terms. They don't need a full "graph" - they need a destination to send Triples/Quads to (or to be pull parsers).
>
> Writers:
>
> Writing is not the reverse of parsing - parsers produce a stream, writers for Turtle etc need to poke around the graph to decide what will "look nice". Even N-triples written clustered by subject can be useful.
>
> Negatives:
>
> 1/ It's wildly ambitious and impractical to even consider portability and abstraction. Too much time has passed. Waste of effort.
>
> 2/ The portability layer is so narrow that it is not helpful.
>
> 3/ No SPARQL.
> (counter: (1) SPARQL is a remote protocol - this is same-JVM).
> (counter: (2) develop a SPARQL API using PURR basic terms)
>
>
> Opinions?

Considering projects such as Any23 (currently not using Jena) and Marmotta (about to enter incubation and not using Jena), it's a good thing to try doing.

I suppose this will be a module within Jena.

Would Any23 or Marmotta use it or contribute to it?

Paolo

>
> Andy
>
>
> [1] http://wiki.apache.org/incubator/November2012
>
> [2] http://s.apache.org/KCv
> -->
> http://mail-archives.apache.org/mod_mbox/jena-dev/201210.mbox/%3CC0B6979A3CA668458B697E4EA907CA940A01AF5C%40CFWEX01.americas.cray.com%3E
>
> [3] http://s.apache.org/lK
> -->
> http://mail-archives.apache.org/mod_mbox/incubator-clerezza-dev/201211.mbox/%3CCAEWfVJ%3DcKATgo32u-AZDQKq%2BmsaVM_CWRnLo_OLdTYP1jFVzAw%40mail.gmail.com%3E
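
To make the quoted proposal a little more concrete, here is one possible reading of it in Java: immutable terms with structural equality, a factory, and a parser that pushes triples to a destination rather than filling a graph. This is not the sketch promised in the quoted message, and every name in it (Term, Iri, BlankNode, Literal, Triple, TermFactory, SimpleTermFactory, ToyParser, PurrSketch) is made up for illustration. It assumes Java 16 or later for records, and uses records where a real portability layer would presumably define interfaces. The line format accepted by the toy parser is also an assumption and is not N-Triples.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Consumer;

// All names below are hypothetical; none come from Jena or any released library.

/** One of the three kinds of RDF term: IRI, literal or blank node. */
interface Term {}

record Iri(String value) implements Term {
    Iri { Objects.requireNonNull(value, "value"); }
}

record BlankNode(String label) implements Term {
    BlankNode { Objects.requireNonNull(label, "label"); }
}

/** Lexical form, datatype, optional language tag. No mapping to a value. */
record Literal(String lexicalForm, Iri datatype, Optional<String> languageTag) implements Term {
    Literal {
        Objects.requireNonNull(lexicalForm, "lexicalForm");
        Objects.requireNonNull(datatype, "datatype");
        Objects.requireNonNull(languageTag, "languageTag");
    }
}

record Triple(Term subject, Iri predicate, Term object) {}

/** The factory a parser is handed, so that a provider chooses the implementation. */
interface TermFactory {
    Iri iri(String value);
    Literal literal(String lexicalForm, Iri datatype);
    Triple triple(Term subject, Iri predicate, Term object);
}

/** Trivial provider that just builds the records above. */
final class SimpleTermFactory implements TermFactory {
    public Iri iri(String value) { return new Iri(value); }
    public Literal literal(String lexicalForm, Iri datatype) {
        return new Literal(lexicalForm, datatype, Optional.empty());
    }
    public Triple triple(Term s, Iri p, Term o) { return new Triple(s, p, o); }
}

/** Toy push parser: one "<s> <p> <o> ." per line, IRIs only. Not N-Triples. */
final class ToyParser {
    static void parse(BufferedReader in, TermFactory factory, Consumer<Triple> sink) {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                String[] t = line.trim().split("\\s+");
                if (t.length != 4 || !t[3].equals(".")) continue;   // not in the toy format
                if (!isIri(t[0]) || !isIri(t[1]) || !isIri(t[2])) continue;
                sink.accept(factory.triple(iri(factory, t[0]), iri(factory, t[1]), iri(factory, t[2])));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean isIri(String token) { return token.matches("<[^<>\\s]+>"); }

    private static Iri iri(TermFactory factory, String token) {
        return factory.iri(token.substring(1, token.length() - 1)); // drop the angle brackets
    }
}

final class PurrSketch {
    public static void main(String[] args) {
        List<Triple> received = new ArrayList<>();
        String data = "<http://example/s> <http://example/p> <http://example/o> .\n";
        // The destination is a plain list here; it could equally be a graph or a writer.
        ToyParser.parse(new BufferedReader(new StringReader(data)), new SimpleTermFactory(), received::add);
        received.forEach(System.out::println);
    }
}
```

The parser only ever sees the factory and a Consumer<Triple>, which is the point made in the quoted text: it needs somewhere to send triples, not a full graph.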