Jeremy Carroll wrote:
- resource descriptions and monotonicity
I got a bad non-monotonic feeling while reading the powder-grouping
WD; interestingly it was while reading bits that had clearly been
written with the issue in mind :(
This is getting harder ...
At first blush, the POWDER grouping document seems to have been written
with a view to the additive nature of RDF, and hence to respect
monotonicity.
The formal semantic definition I gave on Friday for includeHosts follows
that pattern.
But .... it doesn't do what is wanted.
So here goes: I'll try to describe what's written, and why it isn't what
is required. I'm also trying to get together in my head a positive
solution, which hides some of the complexity ... but that'll have to be
in a later message. So while this is a negative message, please don't
take it pessimistically.
So, as well as includeHosts, POWDER also allows matching on other
aspects of the URI, e.g. includeSchemes.
<wdr:ResourceSet rdf:ID="A">
<wdr:includeSchemes>http https</wdr:includeSchemes>
<wdr:includeHosts>example.org</wdr:includeHosts>
</wdr:ResourceSet>
<wdr:ResourceSet rdf:ID="B">
<wdr:includeSchemes>http https</wdr:includeSchemes>
</wdr:ResourceSet>
<wdr:ResourceSet rdf:ID="C">
<wdr:includeHosts>example.org</wdr:includeHosts>
</wdr:ResourceSet>
gives three resource sets.
The formal semantics of includeHosts I gave on Friday suggests that the
subject is a class, all of whose members relate to the relevant host.
Thus the interpretation I(#A) will have a class extension whose members
all come from example.org, and so could also be the same as the class
extension of I(#C).
This is to reflect monotonicity, in that the addition of the first
includeSchemes triple prohibits certain interpretations, but doesn't
license any interpretation that was not licensed in the first place.
But this contrasts directly with the explicit objective from the
grouping WD:
http://www.w3.org/TR/2007/WD-powder-grouping-20071031/#design
2 It must be possible to determine with certainty whether a given
resource is or is not an element of the Resource Set
Thus a resource identified by
ftp://www.example.org/pub/foo.txt
is necessarily in #C, but not in #A or #B.
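To make goal 2 concrete, here is a rough Python sketch of the "determine
with certainty" reading, in which a resource is in a set exactly when it
satisfies every constraint stated for that set. The dictionary encoding,
the function names and the subdomain rule for hosts are assumptions made
for illustration, not definitions taken from the grouping WD.

```python
from urllib.parse import urlsplit

# Hypothetical encoding of the three resource sets above: each maps a
# constraint name to its white-space separated list of allowed values.
RESOURCE_SETS = {
    "A": {"includeSchemes": "http https", "includeHosts": "example.org"},
    "B": {"includeSchemes": "http https"},
    "C": {"includeHosts": "example.org"},
}


def host_matches(host, allowed):
    # Assumption: a host matches if it equals a listed host or is a
    # subdomain of it (so www.example.org matches example.org).
    return host == allowed or host.endswith("." + allowed)


def is_member(uri, constraints):
    """Closed reading: member iff every stated constraint is satisfied."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if "includeSchemes" in constraints:
        if scheme not in constraints["includeSchemes"].lower().split():
            return False
    if "includeHosts" in constraints:
        allowed_hosts = constraints["includeHosts"].lower().split()
        if not any(host_matches(host, h) for h in allowed_hosts):
            return False
    return True


membership = {
    name: is_member("ftp://www.example.org/pub/foo.txt", constraints)
    for name, constraints in RESOURCE_SETS.items()
}
print(membership)  # {'A': False, 'B': False, 'C': True}
```

Note that this function treats the listed constraints as the complete
definition of the set, which is exactly the step that the subclass-style
semantics does not license.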
In terms of OWL 1.1:
we could imagine a magic property hasURI which is given the obvious
semantics via a semantic extension (this would be slightly easier to
specify than the includeHosts property).
Then each of the properties in the groupings document can be seen as
restrictions on the hasURI property, with appropriate user defined
datatype to define the match.
e.g.
#A wdr:includeSchemes "http https" .
<==>
#A rdfs:subClassOf _:r .
_:r rdf:type owl:Restriction .
_:r owl:onProperty wdr:hasURI .
_:r owl:someValuesFrom _:d .
_:d rdf:type owl:DataRange .
_:d owl11:derivedFrom xsd:anyURI .
_:d owl11:onFacet xsd:pattern .
_:d owl11:constraint "^(http|https):" .
i.e. we consider the class of all things that have a hasURI property
with a value which conforms with the (anonymous) datatype derived from
xsd:anyURI, matching the given pattern (which actually needs to be a bit
more complicated, since schemes are case insensitive).
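As a rough illustration of the case-insensitivity point, the sketch
below uses Python's re module rather than an XSD pattern facet. The
CASE_FREE pattern is hypothetical: it spells out both cases per letter
on the assumption that no case-insensitive flag is available, and it is
applied with fullmatch on the assumption that a pattern facet has to
match the whole lexical form rather than a prefix.

```python
import re

# The pattern as written in the triples above, read as a Python regex.
NAIVE = re.compile(r"^(http|https):")

# Hypothetical case-insensitive variant, written without regex flags.
# The trailing ".*" lets it cover the whole URI under fullmatch.
CASE_FREE = re.compile(r"[Hh][Tt][Tt][Pp][Ss]?:.*")

uris = ["http://example.org/", "HTTPS://example.org/", "ftp://example.org/"]

naive_hits = [bool(NAIVE.match(u)) for u in uris]
case_free_hits = [bool(CASE_FREE.fullmatch(u)) for u in uris]

print(naive_hits)      # [True, False, False]
print(case_free_hits)  # [True, True, False]
```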
The key thing to note is that this is a subClassOf triple, whereas the
quoted goal #2 actually wants a fixed class definition, as the
intersection of all the restrictions given.
So that, for #B which is defined using only the includeSchemes we would get:
#B owl:equivalentClass _:r .
_:r rdf:type owl:Restriction .
_:r owl:onProperty wdr:hasURI .
_:r owl:someValuesFrom _:d .
_:d rdf:type owl:DataRange .
_:d owl11:derivedFrom xsd:anyURI .
_:d owl11:onFacet xsd:pattern .
_:d owl11:constraint "^(http|https):" .
(only the first triple is different)
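The difference that one triple makes can be sketched by brute force over
a tiny universe. The code below counts how many class extensions for #B
are admissible when #B is only a subclass of the restriction, and how
many when #B is the same class as the restriction. The three URIs, the
prefix test standing in for the datatype restriction, and all names are
assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical universe of resources, identified by their URIs.
UNIVERSE = [
    "http://example.org/",
    "https://example.com/",
    "ftp://www.example.org/pub/foo.txt",
]


def satisfies_restriction(uri):
    # Stand-in for the hasURI restriction: the scheme is http or https.
    return uri.lower().startswith(("http:", "https:"))


restriction_ext = frozenset(u for u in UNIVERSE if satisfies_restriction(u))


def all_subsets(items):
    items = list(items)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


candidates = all_subsets(UNIVERSE)

# Subclass reading: any subset of the restriction's extension will do.
subclass_models = [s for s in candidates if s <= restriction_ext]

# Equivalence reading: the extension is pinned down exactly.
equivalent_models = [s for s in candidates if s == restriction_ext]

print(len(subclass_models), len(equivalent_models))  # 4 1
```

Only the second reading lets you say with certainty that a given
resource is, or is not, in the set.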
And for #A we would define it as being the intersection of the two
restrictions, using an owl:intersectionOf corresponding to the equation
given in
http://www.w3.org/TR/2007/WD-powder-grouping-20071031/#methOutline
[roughly:
RS = D_RS^I = D_1^I ∩ D_2^I ∩ … ∩ D_n^I = (D_1 ∧ D_2 ∧ … ∧ D_n)^I.
]
As a bullet-proof example of why the example in the grouping document
does not respect the RDF semantics, look at:
<wdr:ResourceSet rdf:ID="A">
<wdr:includeSchemes>http https</wdr:includeSchemes>
<wdr:includeHosts>example.org</wdr:includeHosts>
</wdr:ResourceSet>
<wdr:ResourceSet rdf:ID="A">
<wdr:includeSchemes>http https</wdr:includeSchemes>
</wdr:ResourceSet>
By RDF Concepts and RDF Semantics, the repeated includeSchemes triple
counts only once, so that this is equivalent to:
<wdr:ResourceSet rdf:ID="A">
<wdr:includeSchemes>http https</wdr:includeSchemes>
<wdr:includeHosts>example.org</wdr:includeHosts>
</wdr:ResourceSet>
i.e. http://example.com/ is not part of the resource set.
But ... by looking at
<wdr:ResourceSet rdf:ID="A">
<wdr:includeSchemes>http https</wdr:includeSchemes>
</wdr:ResourceSet>
we see that it is ...
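The clash can be replayed mechanically. The sketch below treats each
description of #A as a set of triples, merges them as RDF does, and
applies a closed "all stated constraints must hold" membership test.
The tuple encoding of triples, the function name and the subdomain rule
for hosts are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# The two descriptions of #A, as sets of (subject, property, value).
first = {
    ("A", "includeSchemes", "http https"),
    ("A", "includeHosts", "example.org"),
}
second = {
    ("A", "includeSchemes", "http https"),
}

# An RDF graph is a set of triples, so the repeated triple counts once.
merged = first | second


def closed_member(uri, subject, graph):
    """Closed reading: every constraint stated in the graph must hold."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    for s, p, o in graph:
        if s != subject:
            continue
        values = o.lower().split()
        if p == "includeSchemes" and scheme not in values:
            return False
        if p == "includeHosts" and not any(
            host == v or host.endswith("." + v) for v in values
        ):
            return False
    return True


before = closed_member("http://example.com/", "A", second)
after = closed_member("http://example.com/", "A", merged)

print(merged == first, before, after)  # True True False
```

A conclusion that holds for the smaller graph is withdrawn once more
triples are added, which is what a monotonic semantics does not allow.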
So in some way the monotonic discipline of RDF appears to be too severe.
Jeremy