Dear Steve,
Thank you for this great insight into your integration plan, and apologies
for the late reply.
I believe the plan is very sensible, so I'd like to keep you posted on
the ongoing overhauls on the Stanbol Ontology Network, Reasoner and Rule
managers (formerly KReS). Please feel free to inquire and discuss the
potential effect on integration with Fedora.
* The Java API now uses plain alphanumeric strings to identify scopes,
but you will still be able to access them through full URIs when using
the RESTful API.
* There is an ontology registry manager in place which allows you to
model the structure of ontology libraries in RDF and bootstrap your
network from the required ontologies in one take.
* Internally we are improving ontology network construction (i.e. the
combination of scopes, spaces, sessions etc.) to natively use Clerezza,
which should give a major efficiency boost. Export as OWL API objects
(esp. for axiom-wise polymorphism) will still be available.
* The philosophy behind Ontology Sessions will change a bit: we no longer
plan to create session spaces for each scope and each session. Instead,
a Session will be a single object that you can export as RDF, into which
you can load ABoxes and to which you can attach as many scopes as you need.
* The Reasoner API is undergoing a major overhaul to accommodate
multiple service types and implementations. Background reasoning task
support is also in the works.
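To make the first and fourth points above more concrete, here is a minimal
sketch in Python. It is not the Stanbol API: the base URI, the Session
class and all method names are hypothetical, and the exact character set
allowed in scope IDs is an assumption. It only illustrates the idea of
short alphanumeric scope IDs that map to full URIs, and of a session as a
single object with any number of attached scopes.

```python
import re

# Hypothetical base URI; the real one depends on your Stanbol deployment.
BASE = "http://localhost:8080/ontonet/ontology/"

# Assumption: a scope ID consists of ASCII letters and digits only.
_SCOPE_ID = re.compile(r"[A-Za-z0-9]+")


def scope_uri(scope_id, base=BASE):
    """Map a short scope ID to the full URI a REST client would use."""
    if not _SCOPE_ID.fullmatch(scope_id):
        raise ValueError("not a plain alphanumeric scope ID: %r" % scope_id)
    return base + scope_id


class Session:
    """Toy stand-in for a session: one object, many attached scopes."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.scope_ids = []  # attached scopes, in attachment order
        self.aboxes = []     # names of the ABox graphs loaded so far

    def attach_scope(self, scope_id):
        scope_uri(scope_id)  # raises ValueError on a malformed ID
        if scope_id not in self.scope_ids:
            self.scope_ids.append(scope_id)

    def load_abox(self, graph_name):
        self.aboxes.append(graph_name)
```

With this model, attaching the same scope twice is a no-op, and no
per-scope session space is ever created.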
Rest assured that no such updates will take place without being
documented in the Stanbol issue tracker [1].
All the best,
Alessandro
[1] http://issues.apache.org/jira/browse/STANBOL
On 9/30/11 4:00 PM, Stephen Bayliss wrote:
Hello Stanbol developers.
We (Martin Dow and Stephen Bayliss) are working on the Stanbol early adopter
integration with the Fedora Commons digital object repository[1].
This post is to give you a heads-up on what we are seeking to do and some
background, and we do welcome your input and comments.
From a functional perspective, the central use case we are tackling can be
described as applying ontologies and rules as a task-oriented lens over
content repository data, making use of Stanbol's KReS components in
particular.
First a note on the integration between Fedora and Stanbol. We are basing
this on Fedora's JMS messaging capabilities, which we have enhanced; the
messages are then used to "synchronise" Stanbol with Fedora. (Fedora also
provides a REST API for access to content.) We enhanced the existing JMS
messaging because its messages were tightly coupled to the Fedora API,
which means that message consumers need detailed knowledge of that API.
Also, in some cases information that would be useful in interpreting
changes made to Fedora content is not readily available in these
messages. We're hoping that this new messaging component will be
re-usable outside of the Stanbol integration and will bring value to the
Fedora community.
Conceptually the integration is similar to the CMS Adapter [2]. However,
Fedora's central index of content, the "Resource Index" (or "RI" [3]), is
implemented as an RDF graph of URIs representing Fedora's notion of digital
objects and datastreams. As this is not the same as the JCR or CMIS view of
an object, we're not planning on using the CMS Adapter component directly.
In particular we haven't identified a need to "bridge" to the Fedora object
model as it's already expressible -- partly at least -- in RDFS. Currently
there's no formal schema for Fedora's "internal" object and datastream
relationships, but implementation relationships between objects expressed
in Fedora's RDF datastreams (RELS-EXT and RELS-INT) do have an RDFS schema.
(For more information on Fedora relationships see [4].)
Fedora uses the Mulgara quad-store [5] as its RDF database, and it is
possible to expose this directly with a SPARQL endpoint. Furthermore
Clerezza can be configured to use Mulgara as its source. This therefore may
be one nice way of accessing Fedora's Resource Index, particularly as the
intent of the Resource Index is to provide access to the RDF view of
Fedora's data.
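As a rough illustration of the SPARQL route, the sketch below builds a
query for all triples about one Fedora object and encodes it as a
SPARQL-protocol GET request URL. The endpoint address and the PID
"demo:1" are placeholders, and the helper names are made up for this
example; only the "info:fedora/<pid>" URI form and the "query" request
parameter are taken as given.

```python
from urllib.parse import urlencode


def object_uri(pid):
    """Return the URI under which a Fedora object appears in the RI."""
    return "info:fedora/" + pid


def triples_query(pid, limit=50):
    """Build a SPARQL query for every predicate/object pair of one object."""
    return "SELECT ?p ?o WHERE { <%s> ?p ?o } LIMIT %d" % (object_uri(pid), limit)


def sparql_get_url(endpoint, query):
    """Encode a query as a GET request URL using the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Hypothetical endpoint; substitute wherever Mulgara is exposed.
    print(sparql_get_url("http://localhost:8080/sparql", triples_query("demo:1")))
```

The same query text could presumably be sent through Clerezza instead if
it is configured on top of Mulgara; only the transport would change.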
The repository content itself we are working with is a collection of images
and their metadata, catalogued using the VRA (Visual Resources Association)
XML metadata schema [6]. Some elements in the metadata records are
populated with items from controlled vocabularies and thesauri; in
particular artists are identified with persistent identifiers from the Getty
ULAN thesaurus [7], which we have converted to SKOS (and SKOS-XL) [8].
So in terms of semantics, there are three kinds we need to work with in
order to give the end-user a consistent experience during browse and
discovery - the repository structures as expressed within the Fedora
Resource Index, image metadata records (lifted from VRA into OWL), and Getty
thesaurus concepts and interrelations expressed as SKOS. These schemata are
loaded into KReS scopes, and alignment is needed. The Fedora schema is
static, as one would expect given Fedora's key role as a stable repository
for digital content, so it is considered immutable in KReS. We are still
evolving the VRA schema and our mapping to the SKOS/SKOS-XL thesaurus, so
we consider them shared and load them at run-time.
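The split between a fixed schema and schemata loaded at run-time can be
pictured with the toy model below. It is only a sketch: the Scope class,
its method names and the ontology labels are all hypothetical and do not
reflect the actual KReS API.

```python
class Scope:
    """Toy scope: a fixed set of ontologies plus a set loaded at run-time."""

    def __init__(self, scope_id, fixed):
        self.scope_id = scope_id
        self._fixed = frozenset(fixed)  # cannot change after construction
        self._runtime = set()           # may change while the scope is live

    @property
    def fixed(self):
        return self._fixed

    def load(self, ontology):
        """Add an evolving ontology at run-time."""
        self._runtime.add(ontology)

    def unload(self, ontology):
        """Drop a run-time ontology; fixed ones are never affected."""
        self._runtime.discard(ontology)

    def ontologies(self):
        """Return every ontology currently visible in the scope, sorted."""
        return sorted(self._fixed | self._runtime)


if __name__ == "__main__":
    # Hypothetical labels for the three kinds of semantics described above.
    scope = Scope("images", fixed=["fedora-relations"])
    scope.load("vra-core")
    scope.load("ulan-skos-mapping")
    print(scope.ontologies())
```

Keeping the fixed part in a frozenset means an attempt to modify it fails
immediately, which mirrors treating the Fedora schema as immutable.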
Scopes are intended for task-oriented partitioning of the semantic data.
This comes into play in two scenarios. Firstly, when content is updated,
the additional objects must be added. Secondly, additional constraints
are driven by user search or browse, e.g. during faceted browsing. So
the final step in our use case demonstration will be to use per-user
session scopes to handle this case. The intention is that data matching the
user's view must be synchronised in cached graphs in a dedicated session
scope; the results will be materialised by the reasoner. The DL reasoner
could also be used to check integrity constraints at the point of access.
Regards
Steve
[1] http://fedora-commons.org/
[2] http://wiki.iks-project.eu/index.php/CMS_Adaptor
[3] https://wiki.duraspace.org/display/FEDORA35/Resource+Index
[4] https://wiki.duraspace.org/display/FEDORA35/Triples+in+the+Resource+Index
[5] http://www.mulgara.org/
[6] http://www.vraweb.org/projects/vracore4/index.html
[7] http://www.getty.edu/research/tools/vocabularies/ulan/
[8] http://www.w3.org/TR/skos-reference/
--
M.Sc. Alessandro Adamou
Alma Mater Studiorum - Università di Bologna
Department of Computer Science
Mura Anteo Zamboni 7, 40127 Bologna - Italy
Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council (CNR)
Via Nomentana 56, 00161 Rome - Italy
"As for the charges against me, I am unconcerned. I am beyond their timid, lying
morality, and so I am beyond caring."
(Col. Walter E. Kurtz)