I've attempted to document this thread in preparation for the London committers' meeting:
http://fedora-commons.org/confluence/display/DEV/Supporting+the+Semantic+Web+and+Linked+Data

> -----Original Message-----
> From: Steve Bayliss [mailto:[email protected]]
> Sent: 06 November 2009 21:36
> To: [email protected]
> Subject: [Fedora-commons-developers] The REST API,The Resource Index
> and the Semantic Web
>
> Thinking over the current debates over the REST API, particularly
> manipulating relationships, and how the resource index fits in with
> this, I wonder if there is some unified approach that could be used to
> relate all of these together in a semantic web-friendly, REST-friendly,
> Web 2.0-friendly model.
>
> Asger's work on Enhanced Content Models, and particularly the ideas
> around a "reference counting" mechanism for triples to get around some
> of the limitations with the current single-graph resource index, plus
> our own work on having arbitrary RDF datastreams propagated to the
> resource index (and the inherent problems with this) also feeds into
> this thinking, along with Carsten Friedrich's recent post expressing a
> desire for a relationships API that is not tied to needing to
> manipulate individual RELS-EXT, RELS-INT and DC datastreams. Ben
> Armintor's comments on the wiki on a (sub-)graph-centric approach to
> manipulating relationships are also relevant.
>
> This is early-stage thinking, but I thought it might be useful to get
> these ideas out there, albeit in a bit of a raw state. And spending too
> long trying to define a vision of where you want to get to can get in
> the way of actually getting there...
>
> And what follows is pretty dependent on Fedora's Resource Index being
> enabled; it is also Mulgara-centric, which is not exactly in line with
> current thinking. So completely ignoring the
> "triplestore-is-only-a-cache-and-might-not-even-exist" issue...
>
> So:
>
> Fundamentally two "kinds" of APIs:
>
> 1) an API much as the current SOAP API, with a Fedora-object-centric
> view of the world, for manipulating objects, datastreams,
> disseminators etc
>
> 2) a "semweb" API, with an RDF graph expression(s) of the Fedora
> repository, where resource URIs in the graph (objects, datastreams,
> disseminators etc) are resolvable, and are REST endpoints both for
> disseminating the contents of the repository (bitstreams, resource
> metadata, RDF graphs describing resources etc), and making changes to
> the repository, using REST semantics. So you could navigate the
> resource index to discover resources, then use the resource
> identifiers as REST endpoints.
>
> So essentially the "semweb" API would represent a coming-together of
> the REST API and the resource index. I think Asger's current proposal
> for an alternative REST API would fit in very well with this in terms
> of exposing the kind of REST endpoints that would be needed - and
> would provide the resolvable resource URIs for the RDF
> representation(s).
>
> The Resource Index and graphs (models)
> ======================================
> Currently the Fedora Resource Index is a single graph, <#ri> (or
> <rmi://someserver/fedora#ri>).
>
> Mulgara supports creation of multiple models (or graphs) and querying
> across multiple graphs. (Fedora does make use of additional graphs, a
> datatyping graph, and a full text model if full text indexing is
> enabled).
>
> Mulgara also supports creation of "View" models which do not hold
> triples, but are a view over multiple models, for instance the union
> of several graphs: http://docs.mulgara.org/itqloperations/views.html
>
> It should therefore be possible to express a Fedora repository as a
> set of individual graphs whilst still presenting an overall single
> graph view of the repository; with sub-graphs being individually
> identifiable.
>
> Essentially some kind of hierarchy of graphs and views, for example
> (please ignore the actual model/graph identifiers used below, I've not
> thought those through... this is just for conceptual illustration!).
> (And note that these are not Fedora resource identifiers - they are
> identifiers for graphs and sub-graphs describing Fedora resources,
> with triples containing URIs that resolve to Fedora resources.)
>
> <#ri> - a view containing:
>   <#some:pid> - object graph for some:pid, a view containing:
>     <#some:pid/properties> - graph containing object properties
>     <#some:pid/datastreams> - a view containing:
>       <#some:pid/datastreams/rels-ext> - graph containing triples
>         from rels-ext
>       <#some:pid/datastreams/rels-int> - graph containing triples
>         from rels-int
>       <#some:pid/datastreams/dc> - graph containing triples from DC
>       <#some:pid/datastreams/{rdf datastream}> - graph containing
>         triples from some other rdf datastream
>       <#some:pid/datastreams/{dsid}/properties> - graph containing
>         properties of datastream {dsid} (state, last modified, etc)
>   <#some:otherpid> - object graph for some:otherpid, a view containing:
>     <#some:otherpid/properties> - etc
>     <#some:otherpid/datastreams> - etc
>   ...
>
> There's undoubtedly stuff I haven't thought about that should be
> included above (notably disseminators). And there's probably a better
> design of this hierarchy. But as a principle...
>
> The top-level <#ri> graph would still look like it does today.
>
> This top level view could be (disseminated from) a "special" Fedora
> object representing the repository itself (an idea I know has been
> floating around).
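
To make the graph/view hierarchy above a little more concrete, here is a minimal Python sketch. It does not use Mulgara at all: graphs are modelled as plain sets of triples and a view as the union of its members. The class names, the `my-rdf` datastream graph and the example.org predicate are all hypothetical, chosen only for illustration. It also shows one consequence of keeping one graph per datastream: a triple held in two graphs stays visible in the top-level view after it is removed from one of them.

```python
from typing import FrozenSet, List, Set, Tuple, Union

Triple = Tuple[str, str, str]


class Graph:
    """A named graph that physically holds triples."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._triples: Set[Triple] = set()

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)

    def remove(self, triple: Triple) -> None:
        self._triples.discard(triple)

    def triples(self) -> FrozenSet[Triple]:
        return frozenset(self._triples)


class View:
    """A named view: holds no triples, exposes the union of its members."""

    def __init__(self, name: str, members: List[Union[Graph, "View"]]) -> None:
        self.name = name
        self.members = members

    def triples(self) -> FrozenSet[Triple]:
        result: Set[Triple] = set()
        for member in self.members:
            result |= member.triples()
        return frozenset(result)


# Identifiers follow the illustrative hierarchy; "my-rdf" is a made-up
# example of "some other rdf datastream".
properties = Graph("#some:pid/properties")
rels_ext = Graph("#some:pid/datastreams/rels-ext")
my_rdf = Graph("#some:pid/datastreams/my-rdf")
datastreams = View("#some:pid/datastreams", [rels_ext, my_rdf])
obj = View("#some:pid", [properties, datastreams])
ri = View("#ri", [obj])

# Hypothetical triple asserted by both datastreams.
triple = (
    "info:fedora/some:pid",
    "http://example.org/rel#isPartOf",
    "info:fedora/some:collection",
)
rels_ext.add(triple)
my_rdf.add(triple)

count_with_both = len(ri.triples())   # the view shows the triple once

rels_ext.remove(triple)               # one datastream stops asserting it
still_asserted = triple in ri.triples()

my_rdf.remove(triple)                 # now nothing asserts it
gone = triple not in ri.triples()
```

Whether a real Mulgara view behaves the same way for every query is something that would need checking against Mulgara itself; the sketch only captures the union semantics described in the views documentation linked above.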
>
> This could get around the situation where if one allowed arbitrary RDF
> datastreams to be propagated to the resource index, and two
> datastreams assert the same triple, deletion of one of the datastreams
> results in deletion of the triple in the resource index although the
> triple is still being asserted by the second datastream.
>
> In the above example, if a triple was asserted by two different
> datastreams then the triple would be present in two different graphs
> (one graph for each datastream). The top level <#ri> view would show a
> single triple, however deletion of the triple from one rdf datastream
> would result in it being removed from one graph whilst still leaving
> it in the graph for the other datastream, and therefore it would still
> be asserted in the resource index.
>
> Resolvable RI URIs - being more Semantic Web- and Web 2.0-friendly
> ==================================================================
> The resource index uses the "fedora" namespace in the info uri scheme
> to identify objects, datastreams, disseminators etc, eg
> <info:fedora/some:pid>.
>
> It could also be useful to expose resolvable URIs in the resource
> index, as an alternative view. For instance, something akin to a
> URL-rewriting mechanism could be used to transform
> <info:fedora/some:pid> into http://server:port/fedora/objects/some:pid
> (using the proposed alternative REST API syntax).
>
> On the way in, queries (updates, etc) would have resolvable http
> identifiers translated back to the info:fedora scheme. (So RELS-EXT,
> RELS-INT etc would continue to use the info:fedora scheme.)
>
> Essentially this would be an "external" view of the resource index
> containing resolvable URIs for Fedora resources that are also REST
> endpoints.
>
> It should also be possible to disseminate sub-graphs with resolvable
> URIs as (for example) OAI-ORE resource maps.
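
The two-way rewriting described above could look roughly like the following Python sketch. The function names and the base URL are hypothetical. Only the object-level mapping given in the message (info:fedora/some:pid to .../fedora/objects/some:pid) is taken from the thread; anything after the PID is copied across verbatim, which is an assumption and may well not match how datastream or disseminator URIs end up looking in the REST API.

```python
INFO_PREFIX = "info:fedora/"

# Hypothetical base URL standing in for http://server:port/fedora.
BASE_URL = "http://example.org:8080/fedora"


def to_resolvable(uri: str, base_url: str = BASE_URL) -> str:
    """Rewrite an info:fedora URI to an http URI on the way out.

    URIs outside the info:fedora namespace are returned unchanged.
    """
    if not uri.startswith(INFO_PREFIX):
        return uri
    return base_url.rstrip("/") + "/objects/" + uri[len(INFO_PREFIX):]


def to_info(uri: str, base_url: str = BASE_URL) -> str:
    """Rewrite a resolvable http URI back to info:fedora on the way in.

    URIs that do not point at this repository are returned unchanged.
    """
    prefix = base_url.rstrip("/") + "/objects/"
    if not uri.startswith(prefix):
        return uri
    return INFO_PREFIX + uri[len(prefix):]


external = to_resolvable("info:fedora/some:pid")
internal = to_info(external)
```

Because both functions leave foreign URIs alone, they can be applied blindly to every subject, predicate and object in a query or result set, which is roughly what an "external" view of the resource index would need.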
>
> Mapping between Fedora objects and the resource index
> =====================================================
> Currently the specification of what triples get created for Fedora
> objects, datastreams and properties is embodied in imperative Java
> code.
>
> It could be possible to move this to a declarative specification,
> perhaps as part of the CMA.
>
> For instance the base content model that every object belongs to could
> specify:
> - an XSLT for generating the "system" triples for Fedora object and
>   datastream properties, relationships between objects, datastreams
>   and disseminators; and which graph the triples should be added to
> - an XSLT for generating triples from RELS-EXT; and which graph the
>   triples should be added to
> - an XSLT for generating triples from RELS-INT; and which graph the
>   triples should be added to
>
> "User" content models could for instance specify that XML metadata
> datastream xyz should be converted using an XSLT into RDF, and the
> content model would also indicate what graph the triples should be
> created in.
>
> (XSLT is just used as an example, there may be better/alternative
> approaches, such as GRDDL, and a combination of methods may be best)
>
> Validation criteria (rdf schema, ontology, xml schema etc) could also
> be defined in a similar manner.
>
> Unified relationships API
> =========================
> Having declarative specifications of the relationship between graphs
> in the resource index and the Fedora object model would help in
> implementing a unified relationships API - ie a method of specifying
> modifications to triples at the repository level, with the API
> resolving this to what it represents in terms of Fedora
> objects/datastreams and performing the necessary modifications on
> these.
>
> Persistence is fundamental - all relationships should be stored in the
> filesystem - adding triples to Mulgara without persisting them in the
> Fedora object model should not be allowed.
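
As a very rough illustration of the "resolving" step, the Python sketch below takes one of the illustrative graph identifiers from earlier in the message and works out which object and datastream a triple change would have to be persisted in. Everything here is an assumption made for the sake of the example: the function name, the identifier pattern, and the small table mapping graph names to datastream IDs. A real implementation would presumably read that mapping from the declarative specification rather than hard-code it.

```python
import re
from typing import Tuple

# Illustrative graph identifiers of the form "#<pid>/datastreams/<name>".
_GRAPH_ID = re.compile(r"^#(?P<pid>[^/]+)/datastreams/(?P<name>[^/]+)$")

# Hypothetical mapping from graph name to datastream ID; any other name
# is assumed to be the datastream ID itself.
_KNOWN_DATASTREAMS = {"rels-ext": "RELS-EXT", "rels-int": "RELS-INT", "dc": "DC"}


def datastream_for_graph(graph_id: str) -> Tuple[str, str]:
    """Return (pid, dsid) of the datastream backing a datastream graph.

    Raises ValueError for graphs that are not backed by a datastream,
    such as the property graphs, since a triple written there would have
    nowhere to be persisted in the object model.
    """
    match = _GRAPH_ID.match(graph_id)
    if match is None:
        raise ValueError("not a datastream graph: %r" % graph_id)
    name = match.group("name")
    return match.group("pid"), _KNOWN_DATASTREAMS.get(name, name)


target = datastream_for_graph("#some:pid/datastreams/rels-ext")

try:
    datastream_for_graph("#some:pid/properties")
    rejected = False
except ValueError:
    rejected = True
```

Refusing graphs with no backing datastream is one way of enforcing the persistence rule above: the update is rejected before anything reaches the triplestore.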
>
> This needs thinking about more, for instance if an arbitrary triple is
> to be added, what object should it be stored in (that is a triple that
> does not make an assertion about a Fedora object or datastream for
> example)? Should it be possible to add a triple(s) that assert a new
> datastream or Fedora object? (ie having a completely RDF-centric API).
>
> Regards
> Steve
>
> _______________________________________________
> Fedora-commons-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
