Re: [Fedora-commons-developers] The REST API, The Resource Index and the Semantic Web

Steve Bayliss Sun, 21 Feb 2010 05:05:31 -0800

I've attempted to document this thread in preparation for the London
committers' meeting:


http://fedora-commons.org/confluence/display/DEV/Supporting+the+Semantic+Web+and+Linked+Data



> -----Original Message-----
> From: Steve Bayliss [mailto:[email protected]] 
> Sent: 06 November 2009 21:36
> To: [email protected]
> Subject: [Fedora-commons-developers] The REST API, The Resource Index and the Semantic Web
> 
> 
> Thinking over the current debates over the REST API, particularly
> manipulating relationships, and how the resource index fits in with this, I
> wonder if there is some unified approach that could be used to relate all of
> these together in a semantic web-friendly, REST-friendly, Web 2.0-friendly
> model.
> 
> Asger's work on Enhanced Content Models, and particularly the ideas around a
> "reference counting" mechanism for triples to get around some of the
> limitations with the current single-graph resource index, plus our own work
> on having arbitrary RDF datastreams propagated to the resource index (and
> the inherent problems with this) also feed into this thinking, along with
> Carsten Friedrich's recent post expressing a desire for a relationships API
> that is not tied to needing to manipulate individual RELS-EXT, RELS-INT and
> DC datastreams.  Ben Armintor's comments on the wiki on a
> (sub-)graph-centric approach to manipulating relationships are also relevant.
> 
> This is early-stage thinking, but I thought it might be useful to get these
> ideas out there, albeit in a bit of a raw state.  And spending too long
> trying to define a vision of where you want to get to can get in the way of
> actually getting there...
> 
> What follows is pretty dependent on Fedora's Resource Index being enabled;
> it is also Mulgara-centric, which is not exactly in line with current
> thinking.  So, completely ignoring the
> "triplestore-is-only-a-cache-and-might-not-even-exist" issue...
> 
> So:
> 
> Fundamentally two "kinds" of APIs:
> 
> 1) an API much as the current SOAP API, with a Fedora-object-centric view of
> the world, for manipulating objects, datastreams, disseminators etc
> 
> 2) a "semweb" API, with an RDF graph expression(s) of the Fedora repository,
> where resource URIs in the graph (objects, datastreams, disseminators etc)
> are resolvable, and are REST endpoints both for disseminating the contents
> of the repository (bitstreams, resource metadata, RDF graphs describing
> resources etc), and making changes to the repository, using REST semantics.
> So you could navigate the resource index to discover resources, then use the
> resource identifiers as REST endpoints.
> 
> So essentially the "semweb" API would represent a coming-together of the
> REST API and the resource index.  I think Asger's current proposal for an
> alternative REST API would fit in very well with this in terms of exposing
> the kind of REST endpoints that would be needed - and would provide the
> resolvable resource URIs for the RDF representation(s).
> 
> The Resource Index and graphs (models)
> ======================================
> Currently the Fedora Resource Index is a single graph, <#ri> (or
> <rmi://someserver/fedora#ri>).
> 
> Mulgara supports creation of multiple models (or graphs) and querying across
> multiple graphs.  (Fedora does make use of additional graphs: a datatyping
> graph, and a full text model if full text indexing is enabled.)
> 
> Mulgara also supports creation of "View" models which do not hold triples,
> but are a view over multiple models, for instance the union of several
> graphs: http://docs.mulgara.org/itqloperations/views.html
> 
> It should therefore be possible to express a Fedora repository as a set of
> individual graphs whilst still presenting an overall single graph view of
> the repository; with sub-graphs being individually identifiable.
> 
> Essentially some kind of hierarchy of graphs and views, for example (please
> ignore the actual model/graph identifiers used below, I've not thought those
> through... this is just for conceptual illustration!).  (And note that these
> are not Fedora resource identifiers - they are identifiers for graphs and
> sub-graphs describing Fedora resources, with triples containing URIs that
> resolve to Fedora resources.)
> 
> <#ri> - a view containing:
>   <#some:pid> - object graph for some:pid, a view containing:
>     <#some:pid/properties> - graph containing object properties
>     <#some:pid/datastreams> - a view containing:
>       <#some:pid/datastreams/rels-ext> - graph containing triples from rels-ext
>       <#some:pid/datastreams/rels-int> - graph containing triples from rels-int
>       <#some:pid/datastreams/dc> - graph containing triples from DC
>       <#some:pid/datastreams/{rdf datastream}> - graph containing triples
>         from some other rdf datastream
>       <#some:pid/datastreams/{dsid}/properties> - graph containing
>         properties of datastream {dsid} (state, last modified, etc)
>   <#some:otherpid> - object graph for some:otherpid, a view containing:
>     <#some:otherpid/properties> - etc
>     <#some:otherpid/datastreams> - etc
>       ...
> 
> There's undoubtedly stuff I haven't thought about that should be included
> above (notably disseminators).  And there's probably a better design of this
> hierarchy.  But as a principle...
> 
> The top-level <#ri> graph would still look like it does today.
> 
> This top level view could be (disseminated from) a "special" Fedora object
> representing the repository itself (an idea I know has been floating
> around).
> 
> This could get around the situation where, if one allowed arbitrary RDF
> datastreams to be propagated to the resource index and two datastreams
> asserted the same triple, deletion of one of the datastreams results in
> deletion of the triple in the resource index although the triple is still
> being asserted by the second datastream.
> 
> In the above example, if a triple was asserted by two different datastreams
> then the triple would be present in two different graphs (one graph for each
> datastream).  The top level <#ri> view would show a single triple; however,
> deletion of the triple from one rdf datastream would result in it being
> removed from one graph whilst still leaving it in the graph for the other
> datastream, and therefore it would still be asserted in the resource index.
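
The behaviour described above can be sketched with a toy in-memory store.
This is only an illustration under stated assumptions: GraphStore, its
methods and the "extra-rdf" datastream are hypothetical names, a view is
modelled as a union over graphs sharing a name prefix, and nothing here is
Mulgara's or Fedora's actual API.

```python
from collections import defaultdict


class GraphStore:
    """Toy stand-in for a triplestore with named graphs (hypothetical)."""

    def __init__(self):
        # graph name -> set of (subject, predicate, object) triples
        self.graphs = defaultdict(set)

    def add(self, graph, triple):
        self.graphs[graph].add(triple)

    def remove_graph(self, graph):
        # Deleting a datastream drops only the graph for that datastream.
        self.graphs.pop(graph, None)

    def view(self, prefix=""):
        # A view holds no triples of its own; it is computed as the union
        # of every graph whose name starts with the given prefix.
        result = set()
        for name, triples in self.graphs.items():
            if name.startswith(prefix):
                result |= triples
        return result


store = GraphStore()
triple = (
    "info:fedora/some:pid",
    "info:fedora/fedora-system:def/relations-external#isMemberOf",
    "info:fedora/some:collection",
)

# The same triple is asserted by two datastreams of the same object.
store.add("#some:pid/datastreams/rels-ext", triple)
store.add("#some:pid/datastreams/extra-rdf", triple)

# The top-level view (empty prefix, i.e. everything) shows it once.
before = store.view()

# Deleting one datastream removes one graph; the other still asserts it.
store.remove_graph("#some:pid/datastreams/extra-rdf")
after = store.view()
```

Because the view is a set union, no reference counting is needed: a triple
disappears from the top-level view only when no remaining graph asserts it.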
> 
> Resolvable RI URIs - being more Semantic Web- and Web 2.0-friendly
> ==================================================================
> The resource index uses the "fedora" namespace in the info URI scheme to
> identify objects, datastreams, disseminators etc, eg <info:fedora/some:pid>.
> 
> It could also be useful to expose resolvable URIs in the resource index, as
> an alternative view.  For instance, something akin to a URL-rewriting
> mechanism could be used to transform <info:fedora/some:pid> into
> http://server:port/fedora/objects/some:pid (using the proposed alternative
> REST API syntax).
> 
> On the way in, queries (updates, etc) would have resolvable http identifiers
> translated back to the info:fedora scheme.  (So RELS-EXT, RELS-INT etc would
> continue to use the info:fedora scheme.)
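
A minimal sketch of such a two-way rewrite, assuming terms are plain strings
and only the object-level mapping given above applies.  The function names
and BASE_URL are hypothetical; how datastream or disseminator URIs would map
onto REST paths is not specified here and would need additional rules.

```python
INFO_PREFIX = "info:fedora/"
# Placeholder taken from the example above; not a real endpoint.
BASE_URL = "http://server:port/fedora/objects/"


def to_external(uri, base_url=BASE_URL):
    """Rewrite an info:fedora URI to its resolvable http form (way out)."""
    if uri.startswith(INFO_PREFIX):
        return base_url + uri[len(INFO_PREFIX):]
    return uri


def to_internal(uri, base_url=BASE_URL):
    """Rewrite a resolvable http URI back to info:fedora (way in)."""
    if uri.startswith(base_url):
        return INFO_PREFIX + uri[len(base_url):]
    return uri


def rewrite_triple(triple, rewrite):
    """Apply a rewrite to subject and object only.

    Predicates are left alone: Fedora's own predicates also start with
    info:fedora/ but identify vocabulary terms, not repository resources.
    """
    subject, predicate, obj = triple
    return (rewrite(subject), predicate, rewrite(obj))


internal = (
    "info:fedora/some:pid",
    "info:fedora/fedora-system:def/relations-external#isMemberOf",
    "info:fedora/some:collection",
)
external = rewrite_triple(internal, to_external)
round_trip = rewrite_triple(external, to_internal)
```

Keeping the stored form in the info:fedora scheme and rewriting only at the
boundary means RELS-EXT and RELS-INT do not change if the server's host name
or port does.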
> 
> Essentially this would be an "external" view of the resource index
> containing resolvable URIs for Fedora resources that are also REST
> endpoints.
> 
> It should also be possible to disseminate sub-graphs with resolvable URIs as
> (for example) OAI-ORE resource maps.
> 
> Mapping between Fedora objects and the resource index
> =====================================================
> Currently the specification of what triples get created for Fedora objects,
> datastreams and properties is embodied in imperative Java code.
> 
> It could be possible to move this to a declarative specification, perhaps as
> part of the CMA.
> 
> For instance the base content model that every object belongs to could
> specify:
> - an XSLT for generating the "system" triples for Fedora object and
> datastream properties, relationships between objects, datastreams and
> disseminators; and which graph the triples should be added to
> - an XSLT for generating triples from RELS-EXT; and which graph the triples
> should be added to
> - an XSLT for generating triples from RELS-INT; and which graph the triples
> should be added to
> 
> "User" content models could for instance specify that XML metadata
> datastream xyz should be converted using an XSLT into RDF, and the content
> model would also indicate what graph the triples should be created in.
> 
> (XSLT is just used as an example, there may be better/alternative
> approaches, such as GRDDL, and a combination of methods may be best)
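
One way to picture the declarative mapping is a table that pairs each
datastream with a transform and a target graph.  The sketch below is an
assumption-laden illustration: MAPPING, index_datastream and
flat_rdfxml_triples are hypothetical names, and a small Python function
stands in for the XSLT because the Python standard library has no XSLT
processor.  The transform only understands flat rdf:Description elements,
not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def flat_rdfxml_triples(xml_text):
    """Extract triples from flat RDF/XML (rdf:Description children only)."""
    triples = []
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}localname".
            namespace, local = prop.tag[1:].split("}", 1)
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                obj = (prop.text or "").strip()  # literal value
            triples.append((subject, namespace + local, obj))
    return triples


# Hypothetical declarative spec: datastream ID -> transform and graph name.
MAPPING = {
    "RELS-EXT": {"transform": flat_rdfxml_triples,
                 "graph": "#{pid}/datastreams/rels-ext"},
    "RELS-INT": {"transform": flat_rdfxml_triples,
                 "graph": "#{pid}/datastreams/rels-int"},
}


def index_datastream(pid, dsid, content, mapping=MAPPING):
    """Return (graph name, triples) for one datastream, per the mapping."""
    rule = mapping[dsid]
    return rule["graph"].format(pid=pid), rule["transform"](content)


RELS_EXT = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rel="info:fedora/fedora-system:def/relations-external#">
  <rdf:Description rdf:about="info:fedora/some:pid">
    <rel:isMemberOf rdf:resource="info:fedora/some:collection"/>
  </rdf:Description>
</rdf:RDF>"""

graph, triples = index_datastream("some:pid", "RELS-EXT", RELS_EXT)
```

A "user" content model would then only need to contribute extra entries to
the table, rather than changes to the indexing code.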
> 
> Validation criteria (rdf schema, ontology, xml schema etc) could also be
> defined in a similar manner.
> 
> Unified relationships API
> =========================
> Having declarative specifications of the relationship between graphs in the
> resource index and the Fedora object model would help in implementing a
> unified relationships API - ie a method of specifying modifications to
> triples at the repository level, with the API resolving this to what it
> represents in terms of Fedora objects/datastreams and performing the
> necessary modifications on these.
> 
> Persistence is fundamental - all relationships should be stored in the
> filesystem - adding triples to Mulgara without persisting them in the Fedora
> object model should not be allowed.
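
As a rough sketch of the "resolving" step, a triple's subject can be used to
pick the datastream that must persist it.  The function name is
hypothetical, and the routing rule is an assumption based on existing
practice (object-level relationships in RELS-EXT, datastream-level ones in
RELS-INT); triples about anything else are simply rejected here.

```python
INFO_PREFIX = "info:fedora/"


def target_datastream(triple):
    """Return (pid, dsid) of the datastream that should hold the triple.

    Returns None when the subject is not a Fedora object or datastream,
    since there is then no obvious object to persist the triple in.
    """
    subject = triple[0]
    if not subject.startswith(INFO_PREFIX):
        return None
    parts = subject[len(INFO_PREFIX):].split("/")
    pid = parts[0]
    if not pid:
        return None
    if len(parts) == 1:
        return pid, "RELS-EXT"  # the subject is the object itself
    return pid, "RELS-INT"      # the subject is a datastream of the object


object_level = target_datastream(
    ("info:fedora/some:pid", "ex:relatedTo", "info:fedora/some:otherpid"))
datastream_level = target_datastream(
    ("info:fedora/some:pid/DS1", "ex:derivedFrom", "info:fedora/some:pid/DS2"))
unroutable = target_datastream(
    ("http://example.org/thing", "ex:relatedTo", "info:fedora/some:pid"))
```

An API built this way would update the chosen datastream first and let the
normal indexing path add the triple to the resource index, so nothing
reaches the triplestore without being persisted in an object.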
> 
> This needs thinking about more, for instance if an arbitrary triple is to be
> added, what object should it be stored in (that is a triple that does not
> make an assertion about a Fedora object or datastream for example)?  Should
> it be possible to add a triple(s) that assert a new datastream or Fedora
> object?  (ie having a completely RDF-centric API).
> 
> 
> 
> Regards
> Steve
> 
> 
> _______________________________________________
> Fedora-commons-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
> 


