Re: [CODE4LIB] Expressing negatives and similar in RDF

Karen Coyle Wed, 18 Sep 2013 08:06:20 -0700

On 9/18/13 6:25 AM, [email protected] wrote:



and without disagreeing with you, I would point out that if you say that a given type of resource 
can have at most one dct:title (which is easy to declare using OWL), and then apply that ontology 
to an instance that features a resource of that type with two dct:titles, you're going to get back 
useful information from the operation of your reasoner. An inconsistency in your claims about the 
world will become apparent. I now realize I should have been using the word "consistency" 
and not "validity".

I suppose what I really want to know, if you're willing to keep "playing 
reporter" on the workshop you attended, is whether there was an understanding 
present that people are using OWL in this way, and that it's useful in this way (far more 
useful than writing and maintaining lots and lots and lots of SPARQL) and that this is a 
use case for ontology languages.

The workshop was expressly on validation of data. No one reported using "reasoners" to do validation, and one speaker talked about relying on OWL for their validation rules (but admitted that it was all in their closed world and was a bit apologetic about it). I don't have experience with reasoners, but one of the issues for validation using SPARQL is getting back specific information about what precise part of the query returned "false". I suspect that reasoners aren't good at returning such information, since that is not their purpose. I don't believe that they operate on a T/F basis, but now I'll start looking into them.

One thing to remember about OWL is that it affects the semantics of your classes and properties in the open world. OWL intends to describe truths about a world of your design. It should affect not only your use of your data, but EVERYONE's use of your data in the cloud. Yet even you may have more than one application operating on the data, and those applications may have different requirements. Also, remember that the graph grows, so something that may be true at the moment of cataloging, for example, may not be true when your graph combines with other graphs. So you may say that there is one and only one main author to a work title, but that means one and only one URI. If your data combines with data from another source, and that source has used a different author URI, then what should happen? Each OWL rule makes a statement about a supposed reality, yet you may not have much control over that reality. Fewer rules ("least ontological commitment") means more possibilities for re-use and re-combining of your data; more rules makes it very hard for your data to play well in the world graph.
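To make the "different author URI" question concrete, here is a rough sketch in plain Python (no RDF library) of what an OWL-style reading of a "one and only one" rule tends to do when graphs merge. OWL does not assume that two different URIs name two different things, so a maximum-cardinality-of-one rule is usually satisfied by concluding that the two URIs identify the same author; a contradiction only appears if something else states that they are different. The property name ex:mainAuthor, the URIs, and the function merge_by_max_one are all made up for illustration, and a real reasoner does far more than this.

```python
from itertools import combinations

# Hypothetical merged graph: two sources name the main author of the same
# work with different URIs. Triples are plain (subject, predicate, object).
triples = {
    ("ex:work1", "ex:mainAuthor", "http://source-a.example/author/42"),
    ("ex:work1", "ex:mainAuthor", "http://source-b.example/person/abc"),
}

# Hypothetical rule: ex:mainAuthor has a maximum cardinality of 1.
MAX_ONE = {"ex:mainAuthor"}


def merge_by_max_one(triples, max_one, different=frozenset()):
    """Return (same_as, contradictions) for properties limited to one value.

    `different` holds frozenset pairs of URIs explicitly stated to be
    different individuals. Without such a statement, two values for a
    max-one property are read as two names for one thing.
    """
    values = {}
    for s, p, o in triples:
        if p in max_one:
            values.setdefault((s, p), set()).add(o)

    same_as, contradictions = set(), set()
    for objs in values.values():
        for a, b in combinations(sorted(objs), 2):
            if frozenset((a, b)) in different:
                contradictions.add((a, b))  # the rule and the data disagree
            else:
                same_as.add((a, b))  # inferred: a and b are the same author
    return same_as, contradictions


same_as, contradictions = merge_by_max_one(triples, MAX_ONE)
print("inferred sameAs:", same_as)
print("contradictions:", contradictions)
```

Whether "these two URIs are the same person" is the conclusion you wanted is exactly the kind of thing you no longer control once your data meets someone else's.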

There are cases where OWL *increases* the utility of your properties and classes, in particular declaring sub-class/sub-property relations. If we say that RDA:titleProper is a subproperty of dct:title then anyone who "knows" dct:title can make use of RDA:titleProper. But OWL as a way to *restrict* the definition of the world should be used with caution.
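As an illustration of the sub-property case, the following is a minimal sketch in plain Python of the entailment involved: every triple that uses the sub-property also counts as a triple using the super-property. The sample resource ex:book1, its title value, and the function name infer_superproperties are assumptions for the example; the only thing taken from the discussion above is the idea that RDA:titleProper could be declared a sub-property of dct:title.

```python
# Assumed declaration: rda:titleProper is a sub-property of dct:title.
SUBPROPERTY_OF = {"rda:titleProper": "dct:title"}


def infer_superproperties(triples, subproperty_of):
    """Add (s, super, o) for every (s, sub, o), following chains of
    sub-property declarations until nothing new can be added."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            parent = subproperty_of.get(p)
            if parent is not None and (s, parent, o) not in inferred:
                inferred.add((s, parent, o))
                changed = True
    return inferred


# Hypothetical library data that only uses the RDA property.
data = {("ex:book1", "rda:titleProper", "Moby Dick")}

for triple in sorted(infer_superproperties(data, SUBPROPERTY_OF)):
    print(triple)
```

An application that only understands dct:title can now find the title, and nothing in the original data has been restricted or rejected.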

I would like to see a discussion of what kinds of inferences we would like to make (or see made) of our data in the open world, and then those inferences should inform how we would use OWL. Do we want to infer that every resource has a title? Obviously not, from how this discussion started. How about that every resource has a known creator? (Not) Do we want to limit the number of titles or creators of a resource in the world graph? The number of identities they can have? Does it make sense to say that a FRBR:Work can have an "adaptationOf" relationship *only* with another FRBR:Work (when no one except libraries is defining their resources in terms of FRBR:Work)?

On the other hand, if two resources have the same title and the same date, are they the same resource? (maybe, maybe not).

Oops. gotta run. I'm going to try to pull all of this together into something more coherent.


Thanks,
kc


--
A. Soroka
The University of Virginia Library

On Sep 17, 2013, at 11:00 PM, CODE4LIB automatic digest system wrote:

From: Karen Coyle <[email protected]>
Date: September 17, 2013 12:54:33 PM EDT
Subject: Re: Expressing negatives and similar in RDF

Agreed that SPARQL is ugly, and there was discussion at the RDF validation 
workshop about the need for friendly interfaces that then create the 
appropriate SPARQL queries in the background. This shouldn't be surprising, 
since most business systems do not require users to write raw SQL or even 
anything resembling code - often users fill in a form with data that is turned 
into code.

But it really is a mistake to see OWL as a constraint language in the sense of 
validation. An ontology cannot constrain; OWL is solely *descriptive* not 
*prescriptive.* [1]

Inferencing is very different from validation, and this is an area where the 
initial RDF documentation was (IMO) quite unclear. The OWL 2 documents are 
better, but everyone admits that it's still an area of confusion. (In a major 
act of confession at the DC2013 meeting, Ivan Herman, head of the W3C semantic 
web work, said that this was a mistake that he himself made for many years. 
Fortunately, he now helps write the documentation, and it's good that he has 
that perspective.) In effect, inferencing is the *opposite* of constraining. 
Inferencing is:

"All men are liars. Socrates is a man. Therefore Socrates is a liar."
"Every child has a parent. Johnny is a child. Therefore, Johnny has a parent." 
(whether you can find one or not is irrelevant)
"Every child has two parents. Johnny is a child. Therefore Johnny has two parents. 
Mary is Johnny's parent." (no contradiction here, we just don't know who the other 
parent is)
"Every child has two parents. Johnny is a child. Therefore Johnny has two parents. 
Mary is Johnny's parent. Jane is Johnny's parent. Fred is Johnny's parent." Here the 
reasoner detects a contradiction.
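
The "two parents" examples can be sketched as a small Python function. This is only a toy model of the open-world reading, not how any particular reasoner is implemented, and the function name check_exactly_two_parents is invented for the example. It also makes one assumption explicit: the three-parent case is a contradiction only when Mary, Jane, and Fred are known to be different people. OWL does not assume that different names mean different individuals, so without that knowledge a reasoner could instead conclude that two of the names refer to the same person.

```python
def check_exactly_two_parents(known_parents, all_different=True):
    """Toy open-world reading of "every child has exactly two parents".

    known_parents: names asserted as parents of one child.
    all_different: assumption that distinct names are distinct people.
    """
    n = len(set(known_parents))
    if n < 2:
        # Missing data is not an error: the other parents exist, unnamed.
        return f"consistent: {2 - n} parent(s) exist but are unknown"
    if n == 2:
        return "consistent: both parents known"
    if all_different:
        return "contradiction: more than two distinct parents"
    # Without the distinctness assumption, some names must co-refer.
    return "consistent: some names must denote the same person"


print(check_exactly_two_parents([]))
print(check_exactly_two_parents(["Mary"]))
print(check_exactly_two_parents(["Mary", "Jane", "Fred"]))
print(check_exactly_two_parents(["Mary", "Jane", "Fred"], all_different=False))
```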

The issue of dct:titles is an interesting example. dct:title takes a literal 
value. If you create a dct:title with:

X dct:title http://example.com/junk

with OWL rules that is NOT wrong. It simply provides the inference that 
"http://example.com/junk" is a string - but it can't prevent you from creating 
that triple, because it only operates on existing data.

If you say that every resource MUST have a dct:title, then if you come across a 
resource without a dct:title that is NOT wrong. The reasoner would conclude 
that there is a dct:title somewhere because that's the rule.  (This is where 
the Open World comes in) When data contradicts reasoners, they can't work 
correctly, but they act on existing data, they do not modify or correct data.
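
The difference between the two readings of "every resource MUST have a dct:title" can be sketched side by side in plain Python. Both functions and the sample resources are hypothetical, and the blank-node labels (_:b0, _:b1, ...) are just placeholders for "some title we do not know". The first function is what a validator does in a closed world; the second is roughly what the open-world reading amounts to.

```python
def validate_closed_world(triples, subjects, required):
    """Closed-world check: report each subject lacking the property."""
    have = {s for s, p, _ in triples if p == required}
    return [f"{s}: missing {required}" for s in sorted(subjects) if s not in have]


def infer_open_world(triples, subjects, required):
    """Open-world reading: a missing value exists but is unknown, so
    stand in a placeholder node for it instead of reporting an error."""
    have = {s for s, p, _ in triples if p == required}
    missing = sorted(s for s in subjects if s not in have)
    return {(s, required, f"_:b{i}") for i, s in enumerate(missing)}


# Hypothetical data: ex:book2 has no title.
data = {("ex:book1", "dct:title", "Moby Dick")}
resources = {"ex:book1", "ex:book2"}

print(validate_closed_world(data, resources, "dct:title"))
print(infer_open_world(data, resources, "dct:title"))
```

Note that the validator can say exactly which resource failed which rule, which is the kind of specific feedback that came up as a need at the workshop.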

I'm thinking that OWL and constraints would be an ideal training webinar, and I 
think I know who could do it!

kc

[1] http://www.w3.org/TR/2012/REC-owl2-primer-20121211/
"OWL 2 is not a schema language for syntax conformance. Unlike XML, OWL 2 does not 
provide elaborate means to prescribe how a document should be structured syntactically. 
In particular, there is no way to enforce that a certain piece of information (like the 
social security number of a person) has to be syntactically present. This should be kept 
in mind as OWL has some features that a user might misinterpret this way."



--
Karen Coyle
[email protected] http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet
