Re: [Fedora-commons-developers] How RELS-INT breaks the Fedora paradigm and opens the door for new and innovative solutions to old problems

Daniel Davis Mon, 06 Jul 2009 09:40:08 -0700

I am concerned that the Fedora persistence/service model and the global information model are being conflated. A public API is needed to facilitate persistence operations, but it is easy in Fedora to overlap the persistence/service model with the public information model, which is intended to be suitable for use in graphs and in the Web architecture. Extending Fedora with writable service methods may make the separation clearer, but for now it is easy to conflate the two models. It is not helpful that there are two ways to access content in Fedora: datastream disseminations and DO Service Methods (aka Operations, formerly known as disseminators). Unfortunately, datastream disseminations are easy to understand and use, while service methods are much harder and most applications avoid them. Fedora must provide a clear means to encapsulate the content or Fedora will have problems evolving. Technically, datastream disseminations don't break encapsulation, but having them facilitates thinking in terms of concrete datastreams instead of abstract stable resources. If you make statements about the datastream itself you may step over the line: the internal representation will change given enough time, and if internal information is exposed, that change may break the truthfulness of statements made in global models.

Internal use of RDF is a wonderful generic approach to support encapsulation. But statements in the external (global) model must only be about facts which should be known externally (globally) and which, for a number of reasons, particularly preservation, should be relatively stable. Externally, datastreams should likely not be objects, and it would be best if external models avoid statements about concrete datastreams (and likely datastream disseminations) going forward. Statements about the internal structure of the DO should not show up in external models. External models should ignore the existence of datastreams and assert statements about the DOs, services and methods; you can get the same functionality without making statements about internal persistence artifacts that must change over time if Fedora is to evolve. If mapping to the Web architecture, the DO-service-method-parameters should preferably be a stable URI, which can often also be used in the Fedora architecture to cause the streaming of a serialized representation. It would be desirable that external URIs refer to some essential characteristic of the DO and its content that will always be true even if the concrete implementation of the object changes.

Internal models should be free to make statements about the internal structure of the Fedora object and, I think but am not sure, can use statements which are derived from external views of objects, since such statements are supposed to be stable. However, there needs to be a separation of concerns between the external (global) and internal views. Care should be taken about the visibility of internal statements. They cannot be held to be globally true. Internal information can be presented to the external model through the use of URIs established through the CMA using DO-service-method-parameters, exercising care that there is an abstraction placed between the internal model and the external model. This enables statements made about global URIs to be globally true.

There still need to be methods that permit access to the internal structures of the DO so that "privileged" applications can create the concrete persisted internal content. These operations need to know about datastreams. I am not sure we know enough to create APIs which absolutely and clearly separate the external and internal models. I think that we need to keep the separation of the models clear in our minds and exercise care when we use them and in the evolution of Fedora's design. In particular, I think we need to be careful when we extract this information into a triple-store, where it is easy to combine/conflate the two models and inadvertently mix statements whose truthfulness is long term with transient implementation-specific information.
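
As a rough illustration of the triple-store caution above, here is a minimal sketch of keeping the two kinds of statements apart before they are loaded. It assumes the `info:fedora/<pid>/<dsid>` URI form for datastreams that appears elsewhere in this thread; the function names and the two-bucket split are hypothetical and are not part of any Fedora API.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

FEDORA_PREFIX = "info:fedora/"


def is_datastream_uri(uri: str) -> bool:
    """True if the URI looks like info:fedora/<pid>/<dsid> (assumed form)."""
    if not uri.startswith(FEDORA_PREFIX):
        return False
    parts = uri[len(FEDORA_PREFIX):].split("/")
    # Exactly two non-empty segments: the PID and the datastream ID.
    return len(parts) == 2 and all(parts)


def partition_statements(
    triples: Iterable[Triple],
) -> Tuple[List[Triple], List[Triple]]:
    """Split triples into (external, internal).

    A statement is treated as internal if its subject or object is a
    concrete datastream URI; everything else is treated as external.
    This rule is an illustrative assumption, not a Fedora convention.
    """
    external: List[Triple] = []
    internal: List[Triple] = []
    for triple in triples:
        subject, _predicate, obj = triple
        if is_datastream_uri(subject) or is_datastream_uri(obj):
            internal.append(triple)
        else:
            external.append(triple)
    return external, internal
```

The two lists could then go to separate graphs or stores, so that a query against the global model never sees statements about persistence artifacts by accident.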


Daniel W. Davis
http://fedora-commons.org
[email protected]
(607) 255-6090 (Office)




Asger Blekinge-Rasmussen wrote:

Hi Frank

Thanks for the reply.

Yes, you definitely nailed down the missing points.

RELS-INT and RELS-EXT are misnamed, for the very reason you wrote. No
contest there.

About the RELS-EXT relations to datastreams in the object, that was a
hack.
A fedora object has some relation, fedora-view:disseminates I think, to
each datastream belonging to this object. Since this is the same
relation to every datastream, it is not useful to define an OWL
allValuesFrom restriction on it. It is possible, but it has the effect
of demanding that all datastreams in the object are of the specified
class. Similarly, cardinality on that relation can only specify the
total number of datastreams.
I got around this by making my own relation (in RELS-EXT) to the
datastreams in the same object, but as you point out, these relations
could go to datastreams in another object.

Anyway, the introduction of RELS-INT does bring the current object
serialisation (foxml) into question. A datastream object conceptually
contains
  An ID
  Object properties (in RELS-INT)
  Content (In the datastream proper)
  Versioning (In the datastream proper)
  Audit trail (in the AUDIT datastream in the Object)

So a datastream object is serialised into three datastreams: itself,
RELS-INT and AUDIT. And the fedora object then gets a relation to this
object.
To accommodate the new conceptual structure, it would probably be simpler
to serialise each datastream to its own xml file, and make links from
the fedora object to each datastream it "contains".
The problem with this approach is that the traditional Fedora objects
will just become a collection of datastreams, and properties about this
collection, and not data in itself. This could easily be modelled with a
datastream object, and thus we have come full circle. Objects will in
effect be reduced to having just one datastream.
This idea is starting to scare me somewhat....

Regards



On Mon, 2009-07-06 at 12:14 +0200, Schwichtenberg, Frank wrote:

Hi Asger,
I absolutely agree with you. That seems to be the logical enhancement of
Enhanced Content Models. :-)

I just wonder if the idea of RELS-EXT and RELS-INT holds. You are right
to point out that datastreams are entities (or objects) now. They have URIs, and it
is possible to make statements in RELS-INT with datastreams of other Fedora
objects as the object of the statement. So much for your idea to enhance Enhanced
Content Models, which is great I think.

My criticism of "the idea of RELS-EXT and RELS-INT" would be: one can refer to datastreams external to
the Fedora object the RELS-INT belongs to, and it is possible to refer to entire Fedora objects from RELS-INT.
Obviously it is also possible to refer to datastreams from RELS-EXT, including those of the Fedora object the RELS-EXT
belongs to. So "EXT" and "INT" seem to be outdated. The difference between RELS-EXT and
RELS-INT has nothing to do with relations to external or internal entities, but with making statements about
the object (RELS-EXT) or about parts of the object (RELS-INT). So datastream URIs in statements (both in
RELS-EXT and RELS-INT) bring in possible complexity.
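
The distinction drawn above can be sketched in a few lines: which datastream a statement lives in follows from its subject, while whether it points inside or outside the object follows from its object, and the two are independent. This is only an illustration. It assumes the `info:fedora/<pid>` and `info:fedora/<pid>/<dsid>` URI forms used in this thread, and `owning_pid` and `classify` are hypothetical helper names.

```python
from typing import Tuple

FEDORA_PREFIX = "info:fedora/"


def owning_pid(uri: str) -> str:
    """Return the PID part of info:fedora/<pid>[/<dsid>] (assumed form)."""
    if not uri.startswith(FEDORA_PREFIX):
        raise ValueError(f"not a Fedora URI: {uri}")
    return uri[len(FEDORA_PREFIX):].split("/", 1)[0]


def classify(host_pid: str, subject: str, obj: str) -> Tuple[str, str]:
    """Classify one statement stored in the object `host_pid`.

    Returns (home, target):
      home   - "RELS-EXT" if the subject is the object itself,
               "RELS-INT" if the subject is a part (datastream) of it
      target - "internal" if the statement's object lies within the same
               Fedora object, otherwise "external"
    """
    if owning_pid(subject) != host_pid:
        raise ValueError("subject must be the host object or one of its parts")
    home = "RELS-EXT" if subject == FEDORA_PREFIX + host_pid else "RELS-INT"
    inside = obj.startswith(FEDORA_PREFIX) and owning_pid(obj) == host_pid
    return home, "internal" if inside else "external"
```

All four combinations of home and target can occur, which is the sense in which "EXT" and "INT" no longer describe where a relation points.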

I don't want to say that's bad; just thoughts. Maybe that is something people
are waiting for. And the possibility to specify datastream cardinality (maybe
min and max) is great.

Maybe that just brings us back to the question: why not just allow datastreams
of RDF/XML content which automatically get propagated to the resource index?

Regards, Frank

-----Original Message-----
From: Asger Blekinge-Rasmussen [mailto:[email protected]]
Sent: Friday, 3 July 2009 20:39
To: [email protected]
Subject: [Fedora-commons-developers] How RELS-INT breaks the Fedora
paradigm and opens the door for new and innovative solutions to old
problems

Hi

Steve Bayliss has just finished adding the RELS-INT datastream to
Fedora, as announced on this list. I have been in some discussion with
him, as also shown on this list. This discussion has given me a
chance to fully understand the conceptual change that RELS-INT brings.

In the semantic web paradigm, everything with a URI is a thing, which
can have properties and so on. But in Fedora, so far only Objects could
have properties (relations).

This all changed with the introduction of RELS-INT. Steve Bayliss has
made a system for specifying, in a fedora object, object properties
with a datastream id as subject. No more, no less.

So datastreams are now objects, so to speak. They have a URI, and they
can have properties themselves. Formerly, there was the Fedora Object,
which had datastreams (blobs of data) and properties. Now there is the
Datastream, which has ONE blob of data, and properties. Fedora objects
now have a list of Datastreams, and properties for the object itself.
So we have two levels of objects. This is the way the Fedora paradigm
is broken.

Big deal? Yes. Because if the datastreams can have relations, they can
have the hasModel/rdf:type relation. So, suddenly we have a framework
for talking about the classes of datastreams. Now, like the content
models, there is the possibility to specify restrictions and demands on
the datastream, both its relations and its content.

Some might remember the old problem with the DS-COMPOSITE-MODEL
datastream. There is no way to specify datastreams that might be there,
only datastreams that have to be there, and there is no way to specify
cardinality for datastreams.
With the use of RELS-INT and enhanced content models, we can now
specify something close to a solution to this problem.
Enhanced Content Models give the ability to define an ontology for
subscribing objects. This could include relations from the object to
the object's datastreams. On such relations, Enhanced Content Models give
the ability to make cardinality demands, and specify the class/content
model of the range.
So, in the RELS-EXT for an object you could make this blob:

<rdf:Description rdf:about="info:fedora/demo:object1">
  <fedora-system:hasModel rdf:resource="info:fedora/demo:cm1"/>
  <demo:hasDCdatastream rdf:resource="info:fedora/demo:object1/DC1"/>
  <demo:hasDCdatastream rdf:resource="info:fedora/demo:object1/DC2"/>
  <demo:hasDCdatastream rdf:resource="info:fedora/demo:object1/DC3"/>
</rdf:Description>

Then in the ontology we would specify something like
<owl:Class rdf:about="info:fedora/doms:ContentModel_DOMS">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasDCdatastream"/>
      <owl:minCardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">3</owl:minCardinality>
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasDCdatastream"/>
      <owl:allValuesFrom rdf:resource="info:fedora/demo:DCdatastreamcontentModel"/>
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>
This basically says that demo:object1 must have at least three
hasDCdatastream relations, all of them to things of the type
demo:DCdatastreamcontentModel.

Then this goes in the RELS-INT of demo:object1:
<rdf:Description rdf:about="info:fedora/demo:object1/DC1">
  <fedora-system:hasModel rdf:resource="info:fedora/demo:DCdatastreamcontentModel"/>
</rdf:Description>
<rdf:Description rdf:about="info:fedora/demo:object1/DC2">
  <fedora-system:hasModel rdf:resource="info:fedora/demo:DCdatastreamcontentModel"/>
</rdf:Description>
<rdf:Description rdf:about="info:fedora/demo:object1/DC3">
  <fedora-system:hasModel rdf:resource="info:fedora/demo:DCdatastreamcontentModel"/>
</rdf:Description>


And voila, you have specified that objects of demo:cm1 must have at
least three datastreams, which all have a specific content model.
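
To make the intended check concrete, here is a minimal sketch of how the two restrictions could be validated against the RELS-EXT and RELS-INT statements above. It treats them as closed-world validation rules, which is an assumption about how a validator would enforce them; standard OWL reasoning works under the open-world assumption. The function name `check_object` and the plain tuple representation of triples are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

HAS_MODEL = "fedora-system:hasModel"


def check_object(
    obj_uri: str,
    triples: Iterable[Triple],
    prop: str,
    min_count: int,
    required_model: str,
) -> List[str]:
    """Return a list of problems; an empty list means the object passes.

    Checks two things, mirroring the restrictions in the ontology:
      1. obj_uri has at least `min_count` `prop` relations (minCardinality)
      2. every target of `prop` has `required_model` (allValuesFrom)
    """
    all_triples = list(triples)

    # Targets of the datastream relation, taken from RELS-EXT statements.
    targets = [o for s, p, o in all_triples if s == obj_uri and p == prop]

    # Models declared per subject, taken from RELS-INT statements.
    models: Dict[str, Set[str]] = {}
    for s, p, o in all_triples:
        if p == HAS_MODEL:
            models.setdefault(s, set()).add(o)

    problems: List[str] = []
    if len(targets) < min_count:
        problems.append(
            f"{obj_uri} has {len(targets)} {prop} relations, "
            f"needs at least {min_count}"
        )
    for target in targets:
        if required_model not in models.get(target, set()):
            problems.append(f"{target} does not have model {required_model}")
    return problems
```

Called with the statements for demo:object1 shown above, `demo:hasDCdatastream` as the property, 3 as the minimum and the DC datastream content model as the required model, this returns an empty list. Dropping one hasDCdatastream relation or one hasModel statement produces a corresponding problem message.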


I have not fully thought everything above through, but I hope you get
the gist of it. I would like to hear other people's thoughts on this.
Think of this as a preliminary note on how RELS-INT can be used in enhanced
content models.

Regards
Asger

Enhanced content models can be found at ecm.sourceforge.net



------------------------------------------------------------------------------
_______________________________________________
Fedora-commons-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers



