+1 in praise of this idea

On Nov 7, 2009, at 4:48 AM, Asger Askov Blekinge wrote:
> I very much like what you are thinking here.
>
>
> On Fri, 2009-11-06 at 22:36 +0100, Steve Bayliss wrote:
>> Thinking over the current debates over the REST API, particularly
>> manipulating relationships, and how the resource index fits in with
>> this, I wonder if there is some unified approach that could be used
>> to relate all of these together in a semantic web-friendly,
>> REST-friendly, Web 2.0-friendly model.
>>
>> Asger's work on Enhanced Content Models, and particularly the ideas
>> around a "reference counting" mechanism for triples to get around
>> some of the limitations with the current single-graph resource
>> index, plus our own work on having arbitrary RDF datastreams
>> propagated to the resource index (and the inherent problems with
>> this) also feeds into this thinking, along with Carsten Friedrich's
>> recent post expressing a desire for a relationships API that is not
>> tied to needing to manipulate individual RELS-EXT, RELS-INT and DC
>> datastreams. Ben Armintor's comments on the wiki on a
>> (sub-)graph-centric approach to manipulating relationships are also
>> relevant.
>>
>> This is early-stage thinking, but I thought it might be useful to
>> get these ideas out there, albeit in a bit of a raw state. And
>> spending too long trying to define a vision of where you want to get
>> to can get in the way of actually getting there...
>>
>> And what follows is pretty dependent on Fedora's Resource Index
>> being enabled; it is also Mulgara-centric, which is not exactly in
>> line with current thinking. So completely ignoring the
>> "triplestore-is-only-a-cache-and-might-not-even-exist" issue...
>>
>
> As far as I can see, you actually assume that the triple store is
> only a cache, but you do require that it exist.
> "Triple store is only a cache" means somewhat the same as "every
> triple in the triplestore should be expressed in one of the objects"
>
>
>
>> So:
>>
>> Fundamentally two "kinds" of APIs:
>>
>> 1) an API much as the current SOAP API, with a Fedora-object-centric
>> view of the world, for manipulating objects, datastreams,
>> disseminators etc
>>
>> 2) a "semweb" API, with an RDF graph expression(s) of the Fedora
>> repository, where resource URIs in the graph (objects, datastreams,
>> disseminators etc) are resolvable, and are REST endpoints both for
>> disseminating the contents of the repository (bitstreams, resource
>> metadata, RDF graphs describing resources etc), and making changes
>> to the repository, using REST semantics. So you could navigate the
>> resource index to discover resources, then use the resource
>> identifiers as REST endpoints.
>>
>> So essentially the "semweb" API would represent a coming-together of
>> the REST API and the resource index. I think Asger's current
>> proposal for an alternative REST API would fit in very well with
>> this in terms of exposing the kind of REST endpoints that would be
>> needed - and would provide the resolvable resource URIs for the RDF
>> representation(s).
>>
>> The Resource Index and graphs (models)
>> ======================================
>> Currently the Fedora Resource Index is a single graph, <#ri> (or
>> <rmi://someserver/fedora#ri>).
>>
>> Mulgara supports creation of multiple models (or graphs) and
>> querying across multiple graphs. (Fedora does make use of additional
>> graphs, a datatyping graph, and a full text model if full text
>> indexing is enabled).
>>
>> Mulgara also supports creation of "View" models which do not hold
>> triples, but are a view over multiple models, for instance the union
>> of several graphs: http://docs.mulgara.org/itqloperations/views.html
>>
>> It should therefore be possible to express a Fedora repository as a
>> set of individual graphs whilst still presenting an overall single
>> graph view of the repository; with sub-graphs being individually
>> identifiable.
>>
>> Essentially some kind of hierarchy of graphs and views, for example
>> (please ignore the actual model/graph identifiers used below, I've
>> not thought those through... this is just for conceptual
>> illustration!). (and note that these are not Fedora resource
>> identifiers - they are identifiers for graphs and sub-graphs
>> describing Fedora resources, with triples containing URIs that
>> resolve to Fedora resources.)
>>
>> <#ri> - a view containing:
>>   <#some:pid> - object graph for some:pid, a view containing:
>>     <#some:pid/properties> - graph containing object properties
>>     <#some:pid/datastreams> - a view containing:
>>       <#some:pid/datastreams/rels-ext> - graph containing triples
>>         from rels-ext
>>       <#some:pid/datastreams/rels-int> - graph containing triples
>>         from rels-int
>>       <#some:pid/datastreams/dc> - graph containing triples from DC
>>       <#some:pid/datastreams/{rdf datastream}> - graph containing
>>         triples from some other rdf datastream
>>       <#some:pid/datastreams/{dsid}/properties> - graph containing
>>         properties of datastream {dsid} (state, last modified, etc)
>>   <#some:otherpid> - object graph for some:otherpid, a view
>>     containing:
>>     <#some:otherpid/properties> - etc
>>     <#some:otherpid/datastreams> - etc
>>   ...
>>
>> There's undoubtedly stuff I haven't thought about that should be
>> included above (notably disseminators). And there's probably a
>> better design of this hierarchy. But as a principle...
>>
>> The top-level <#ri> graph would still look like it does today.
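
To make the graph/view hierarchy above concrete, here is a minimal in-memory sketch of the idea. It is not Mulgara's API: the `Graph` and `View` classes, the example predicate, and the example triple are all hypothetical names made up for illustration. The only point is that a view holds no triples of its own and is computed as the union of its members, so a triple asserted in two per-datastream graphs stays visible in the top-level view when one of them drops it.

```python
from typing import List, Set, Tuple, Union

Triple = Tuple[str, str, str]


class Graph:
    """A leaf graph that actually stores triples (e.g. one per RDF datastream)."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)

    def remove(self, triple: Triple) -> None:
        # discard() does nothing if the triple is not present
        self._triples.discard(triple)

    def triples(self) -> Set[Triple]:
        return set(self._triples)


class View:
    """Holds no triples itself; presents the union of its member graphs/views."""

    def __init__(self, *members: Union[Graph, "View"]) -> None:
        self.members: List[Union[Graph, "View"]] = list(members)

    def triples(self) -> Set[Triple]:
        result: Set[Triple] = set()
        for member in self.members:
            result |= member.triples()
        return result


if __name__ == "__main__":
    # Hypothetical example triple; the predicate is not a real Fedora one.
    shared = ("info:fedora/some:pid",
              "http://example.org/rel#isPartOf",
              "info:fedora/some:collection")

    rels_ext = Graph()   # stands in for <#some:pid/datastreams/rels-ext>
    other_rdf = Graph()  # stands in for <#some:pid/datastreams/{rdf datastream}>
    rels_ext.add(shared)
    other_rdf.add(shared)

    datastreams = View(rels_ext, other_rdf)  # <#some:pid/datastreams>
    some_pid = View(datastreams)             # <#some:pid>
    ri = View(some_pid)                      # <#ri>

    rels_ext.remove(shared)                  # one datastream stops asserting it
    print(shared in ri.triples())            # still asserted via other_rdf
```

Because the union is a set, the top-level view shows the shared triple once, however many member graphs assert it.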
>>
>> This top level view could be (disseminated from) a "special" Fedora
>> object representing the repository itself (an idea I know has been
>> floating around).
>>
>> This could get around the situation where if one allowed arbitrary
>> RDF datastreams to be propagated to the resource index, and two
>> datastreams assert the same triple, deletion of one of the
>> datastreams results in deletion of the triple in the resource index
>> although the triple is still being asserted by the second
>> datastream.
>>
>> In the above example, if a triple was asserted by two different
>> datastreams then the triple would be present in two different graphs
>> (one graph for each datastream). The top level <#ri> view would show
>> a single triple; however, deletion of the triple from one RDF
>> datastream would result in it being removed from one graph whilst
>> still leaving it in the graph for the other datastream, and
>> therefore it would still be asserted in the resource index.
>
> And you have thus beautifully solved an old Fedora problem!
>
>>
>> Resolvable RI URIs - being more Semantic Web- and Web 2.0-friendly
>> ==================================================================
>> The resource index uses the "fedora" namespace in the info uri
>> scheme to identify objects, datastreams, disseminators etc, eg
>> <info:fedora/some:pid>.
>>
>> It could also be useful to expose resolvable URIs in the resource
>> index, as an alternative view. For instance, something akin to a
>> URL-rewriting mechanism could be used to transform
>> <info:fedora/some:pid> into
>> http://server:port/fedora/objects/some:pid (using the proposed
>> alternative REST API syntax).
>>
>> On the way in, queries (updates, etc) would have resolvable http
>> identifiers translated back to the info:fedora scheme. (So RELS-EXT,
>> RELS-INT etc would continue to use the info:fedora scheme.)
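
A rough sketch of the two-way rewriting described above, assuming a plain prefix swap. The function names `to_external` and `to_internal` are invented here, and the base URL is simply the placeholder from the example in the text. Only object-level URIs are handled, since that is the one case the thread spells out; datastream and disseminator URIs would need their own rules, which are not defined here, so they are passed through unchanged.

```python
INFO_PREFIX = "info:fedora/"
# Placeholder base URL taken from the example in the thread; a real
# deployment would substitute its own host and port.
BASE_URL = "http://server:port/fedora/objects/"


def to_external(uri: str, base_url: str = BASE_URL) -> str:
    """Rewrite an object-level info:fedora URI to a resolvable http URI."""
    if uri.startswith(INFO_PREFIX):
        rest = uri[len(INFO_PREFIX):]
        # Only the bare object case (no further path segments) is mapped.
        if rest and "/" not in rest:
            return base_url + rest
    return uri


def to_internal(uri: str, base_url: str = BASE_URL) -> str:
    """Rewrite a resolvable object URI back to the info:fedora scheme."""
    if uri.startswith(base_url):
        rest = uri[len(base_url):]
        if rest and "/" not in rest:
            return INFO_PREFIX + rest
    return uri


def rewrite_triple(triple, rewrite):
    """Apply a rewriting function to subject, predicate and object alike."""
    return tuple(rewrite(term) for term in triple)
```

On the way out, `rewrite_triple(t, to_external)` would be applied to results; on the way in, `rewrite_triple(t, to_internal)` would be applied to queries and updates, so the stored RELS-EXT and RELS-INT content never sees the http form.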
>>
>> Essentially this would be an "external" view of the resource index
>> containing resolvable URIs for Fedora resources that are also REST
>> endpoints.
>>
>> It should also be possible to disseminate sub-graphs with resolvable
>> URIs as (for example) OAI-ORE resource maps.
>>
>> Mapping between Fedora objects and the resource index
>> =====================================================
>> Currently the specification of what triples get created for Fedora
>> objects, datastreams and properties is embodied in imperative Java
>> code.
>>
>> It could be possible to move this to a declarative specification,
>> perhaps as part of the CMA.
>>
>> For instance the base content model that every object belongs to
>> could specify:
>> - an XSLT for generating the "system" triples for Fedora object and
>>   datastream properties, relationships between objects, datastreams
>>   and disseminators; and which graph the triples should be added to
>> - an XSLT for generating triples from RELS-EXT; and which graph the
>>   triples should be added to
>> - an XSLT for generating triples from RELS-INT; and which graph the
>>   triples should be added to
>>
>> "User" content models could for instance specify that XML metadata
>> datastream xyz should be converted using an XSLT into RDF, and the
>> content model would also indicate what graph the triples should be
>> created in.
>>
>> (XSLT is just used as an example, there may be better/alternative
>> approaches, such as GRDDL, and a combination of methods may be best)
> I was actually thinking that this could be expressed as
> disseminators. Then the content model would only have to express
> which disseminator to call.
>
>
>>
>> Validation criteria (rdf schema, ontology, xml schema etc) could
>> also be defined in a similar manner.
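
One way to picture the declarative mapping discussed above is as a small rule table owned by the content model. This is only a sketch under assumptions: the `MappingRule` type, the `resolve` helper, the "SYSTEM" source label and the stylesheet file names are all hypothetical, and the graph identifiers simply reuse the illustrative ones from earlier in the thread. The transform could just as well name a disseminator, as Asger suggests.

```python
from typing import List, NamedTuple, Optional, Tuple


class MappingRule(NamedTuple):
    source: str          # datastream ID, or "SYSTEM" for object/datastream properties
    transform: str       # hypothetical stylesheet (or disseminator) name
    graph_template: str  # target graph, using the thread's illustrative identifiers


# Rules the base content model might carry (names are made up).
BASE_RULES: List[MappingRule] = [
    MappingRule("SYSTEM", "system-properties.xsl", "<#{pid}/properties>"),
    MappingRule("RELS-EXT", "rels-ext.xsl", "<#{pid}/datastreams/rels-ext>"),
    MappingRule("RELS-INT", "rels-int.xsl", "<#{pid}/datastreams/rels-int>"),
]


def resolve(pid: str, source: str,
            rules: List[MappingRule]) -> Optional[Tuple[str, str]]:
    """Return (transform, target graph) for a source, or None if unmapped."""
    for rule in rules:
        if rule.source == source:
            return rule.transform, rule.graph_template.format(pid=pid)
    return None


# A "user" content model adds a rule for its own XML metadata datastream xyz.
USER_RULES = BASE_RULES + [
    MappingRule("xyz", "xyz-to-rdf.xsl", "<#{pid}/datastreams/xyz>"),
]
```

With rules in this form, the code that updates the resource index only needs to look up the rule for a changed datastream, run the named transform, and replace the contents of the one graph the rule points at.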
>>
>> Unified relationships API
>> =========================
>> Having declarative specifications of the relationship between graphs
>> in the resource index and the Fedora object model would help in
>> implementing a unified relationships API - ie a method of specifying
>> modifications to triples at the repository level, with the API
>> resolving this to what it represents in terms of Fedora
>> objects/datastreams and performing the necessary modifications on
>> these.
>>
>> Persistence is fundamental - all relationships should be stored in
>> the filesystem - adding triples to Mulgara without persisting them
>> in the Fedora object model should not be allowed.
> And thus the triple store IS only a cache ;) But this is required.
>
>
>>
>> This needs thinking about more, for instance if an arbitrary triple
>> is to be added, what object should it be stored in (that is a triple
>> that does not make an assertion about a Fedora object or datastream
>> for example)? Should it be possible to add a triple(s) that assert a
>> new datastream or Fedora object? (ie having a completely RDF-centric
>> API).
> I feel a useful distinction could be statements to create new graphs,
> and statements to add triples to a graph. Graphs should only be
> created through the "traditional" API, as they create new objects and
> datastreams.
>
> About modifying the content of say the DC datastream through the
> triple store, for that to work we need a way to map RDF statements
> back into Dublin Core XML. This could be done by having two XSLTs and
> marking those graphs that cannot map back as write-protected.
>
> Regards
>
>
>>
>>
>>
>> Regards
>> Steve
>>
>>
>> ------------------------------------------------------------------------------
>> Let Crystal Reports handle the reporting - Free Crystal Reports 2008
>> 30-Day trial. Simplify your report design, integration and
>> deployment - and focus on what you do best, core application coding.
>> Discover what's new with Crystal Reports now.
>> http://p.sf.net/sfu/bobj-july
>> _______________________________________________
>> Fedora-commons-developers mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
