Re: Inference for error checking [was Re: How to avoid that collections break relationships]

2014-04-07 Thread Gannon Dick
Hi Peter,

Data Sets all age at the same rate, (1460 Days + 1 Leap Day per 16 Calendar 
Quarters) or any scalar multiple of that single frequency.  The frequency is 
man-made.  Certainly error checking is good, but cross-domain data transfers 
are only a transportation service via a dumb pipe.  I am wary of added value 
in-transit claims.  They are a delusion that some may find in watches but are 
nowhere to be found in calendars.  

http://www.rustprivacy.org/2014/balance/CulturalHeritageVision.jpg

--Gannon

On Sun, 4/6/14, Peter F. Patel-Schneider pfpschnei...@gmail.com wrote:

 Subject: Re: Inference for error checking [was Re: How to avoid that 
collections  break relationships]
 To: David Booth da...@dbooth.org, Pat Hayes pha...@ihmc.us
 Cc: Markus Lanthaler markus.lantha...@gmx.net, public-hy...@w3.org, 
'public-lod@w3.org' (public-lod@w3.org) public-lod@w3.org, W3C Web Schemas 
Task Force public-voc...@w3.org, Dan Brickley dan...@danbri.org
 Date: Sunday, April 6, 2014, 8:07 PM
 
 Well, certainly, one could do this if
 one wanted to.  However, is this a useful thing to do,
 in general, particularly in the absence of constructs that
 actually sanction the inference and particularly if the
 checking is done in a context where there is no way of
 actually getting the author to fix whatever problems are
 encountered?
 
 My feelings are that if you really want to do this, then the
 place to do it is during data entry or data importation.
 
 
 peter
 
 On 04/03/2014 03:12 PM, David Booth wrote:
  First of all, my sincere apologies to Pat, Peter and the rest of the
  readership for totally botching my last example, writing domain when
  I meant range *and* explaining it wrong.  Sorry for all the confusion
  it caused!
  
  I was simply trying to demonstrate how a schema:domainIncludes
  assertion could be useful for error checking even if it had no
  formal entailments, by making selective use of the CWA.  I'll
  try again.
  
  Suppose we are given these RDF statements, in which the author
  *may* have made a typo, writing ddd instead of ccc as the rdf:type
  of x:
  
    x ppp y .                        # Triple A
    x rdf:type ddd .                 # Triple B
    ppp schema:domainIncludes ccc .  # Triple C
  
  As given, these statements are consistent, so a reasoner
  will not detect a problem.  Indeed, they may or may
  not be what the author intended.  If the author later
  added the statement:
  
    ccc owl:equivalentClass ddd .   # Triple E
  
  then ddd probably was what the author intended
  in triple B.  OTOH if the author later added:
  
    ccc owl:disjointWith ddd .      # Triple F
  
  then ddd probably was not what the author intended
  in triple B.
  
  However, thus far we are only given triples {A,B,C}
  above, and an error checker wishes
  to check for *potential* typos by applying the rule:
  
    For all subgraphs of the form
  
      { x ppp y .
        ppp schema:domainIncludes ccc . }
  
    check whether
  
      { x rdf:type ccc . }
  
    is *provably* true.  If not, then fail the
    error check.  If all such subgraphs pass, then
    the error check as a whole passes.
  
  Under the OWA, the requirement:
  
       { x rdf:type ccc . }
  
  is neither provably true nor provably false given
  graph {A,B,C}.  But under the CWA it is
  considered false, because it is not provably true.
  
  This is how the schema:domainIncludes can be
  useful for error checking even if it has no formal
  entailments: it tells the error checker which
  cases to check.
  
  I hope that now makes more sense.   Again, sorry to
  have screwed up my example so badly last time, and
  I hope I've got it right this time.  :)
  
  David
  
  
  On 04/02/2014 11:42 PM, Pat Hayes wrote:
  
  On Mar 31, 2014, at 10:31 AM, David Booth da...@dbooth.org
 wrote:
  
  On 03/30/2014 03:13 AM, Pat Hayes wrote:
  [ , . . ]
  What follows from knowing that
  
  ppp schema:domainIncludes ccc . ?
  
  Suppose you know this and you also know that
  
  x ppp y .
  
  Can you infer x rdf:type ccc? I presume not, since the domain might
  include other stuff outside ccc. So, what *can* be inferred about the
  relationship between x and ccc ? As far as I can see, nothing can be
  inferred. If I am wrong, please enlighten me. But if I am right, what
  possible utility is there in even making a schema:domainIncludes
  assertion?
  
  If inference is too strong, let me weaken my question: what
  possible utility **in any way whatsoever** is provided by knowing
  that schema:domainIncludes holds between ppp and ccc? What software
  can do what with this, that it could not do as well without this?
  
  I think I can answer this question quite easily, as I have seen it
  come up before in discussions of logic.
  
  ...
  
  Note that this categorization typically relies
 on making a closed world assumption (CWA), which is common
 for an application to make for 

Re: Inference for error checking [was Re: How to avoid that collections break relationships]

2014-04-06 Thread Peter F. Patel-Schneider
Well, certainly, one could do this if one wanted to.  However, is this a 
useful thing to do, in general, particularly in the absence of constructs that 
actually sanction the inference and particularly if the checking is done in a 
context where there is no way of actually getting the author to fix whatever 
problems are encountered?


My feelings are that if you really want to do this, then the place to do it 
is during data entry or data importation.



peter

On 04/03/2014 03:12 PM, David Booth wrote:

First of all, my sincere apologies to Pat, Peter and the rest of the
readership for totally botching my last example, writing domain when
I meant range *and* explaining it wrong.  Sorry for all the confusion it 
caused!


I was simply trying to demonstrate how a schema:domainIncludes
assertion could be useful for error checking even if it had no
formal entailments, by making selective use of the CWA.  I'll
try again.

Suppose we are given these RDF statements, in which the author
*may* have made a typo, writing ddd instead of ccc as the rdf:type
of x:

  x ppp y .                        # Triple A
  x rdf:type ddd .                 # Triple B
  ppp schema:domainIncludes ccc .  # Triple C

As given, these statements are consistent, so a reasoner
will not detect a problem.  Indeed, they may or may
not be what the author intended.  If the author later
added the statement:

  ccc owl:equivalentClass ddd .   # Triple E

then ddd probably was what the author intended
in triple B.  OTOH if the author later added:

  ccc owl:disjointWith ddd .  # Triple F

then ddd probably was not what the author intended
in triple B.

However, thus far we are only given triples {A,B,C}
above, and an error checker wishes
to check for *potential* typos by applying the rule:

  For all subgraphs of the form

{ x ppp y .
  ppp schema:domainIncludes ccc . }

  check whether

 { x rdf:type ccc . }

  is *provably* true.  If not, then fail the
  error check.  If all such subgraphs pass, then
  the error check as a whole passes.

Under the OWA, the requirement:

 { x rdf:type ccc . }

is neither provably true nor provably false given
graph {A,B,C}.  But under the CWA it is
considered false, because it is not provably true.

This is how the schema:domainIncludes can be
useful for error checking even if it has no formal
entailments: it tells the error checker which
cases to check.

I hope that now makes more sense.   Again, sorry to
have screwed up my example so badly last time, and
I hope I've got it right this time.  :)

David


On 04/02/2014 11:42 PM, Pat Hayes wrote:


On Mar 31, 2014, at 10:31 AM, David Booth da...@dbooth.org wrote:


On 03/30/2014 03:13 AM, Pat Hayes wrote:

[ , . . ]
What follows from knowing that

ppp schema:domainIncludes ccc . ?

Suppose you know this and you also know that

x ppp y .

Can you infer x rdf:type ccc? I presume not, since the domain might
include other stuff outside ccc. So, what *can* be inferred about the
relationship between x and ccc ? As far as I can see, nothing can be
inferred. If I am wrong, please enlighten me. But if I am right, what
possible utility is there in even making a schema:domainIncludes
assertion?

If inference is too strong, let me weaken my question: what
possible utility **in any way whatsoever** is provided by knowing
that schema:domainIncludes holds between ppp and ccc? What software
can do what with this, that it could not do as well without this?


I think I can answer this question quite easily, as I have seen it come up 
before in discussions of logic.


...


Note that this categorization typically relies on making a closed world 
assumption (CWA), which is common for an application to make for a 
particular purpose -- especially error checking.


Yes, of course. If you make the CWA with the information you have, then

ppp schema:domainIncludes ccc .

has exactly the same entailments as

ppp rdfs:domain ccc .

has in RDFS without the CWA. But that, of course, begs the question. If you 
are going to rely on the CWA, then (a) you are violating the basic 
assumptions of all Web notations and (b) you are using a fundamentally 
different semantics. And see below.


None of this has anything to do with a distinction between entailment and 
error checking, by the way. Your hypothetical three-way classification task 
uses the same meanings of the RDF as any other entailment task would.




In this example, let us suppose that to pass, the object of every 
predicate must be in the Known Domain of that predicate, where the Known 
Domain is the union of all declared schema:domainIncludes classes for that 
predicate.   (Note the CWA here.)


Given this error checking objective, if a system is given the facts:

  x ppp y .
  y a ccc .

then without also knowing that ppp schema:domainIncludes ccc, the system 
may not be able to determine that these statements should be considered 
Passed or Failed: the result may be Indeterminate.  But if the system is 
also told 

Re: Inference for error checking [was Re: How to avoid that collections break relationships]

2014-04-06 Thread David Booth

On 04/06/2014 09:07 PM, Peter F. Patel-Schneider wrote:

Well, certainly, one could do this if one wanted to.  However, is this a
useful thing to do, in general, particularly in the absence of
constructs that actually sanction the inference and particularly if the
checking is done in a context where there is no way of actually getting
the author to fix whatever problems are encountered?


I'll let others judge that.  My goal in the example was simply to 
demonstrate how it *could* be useful.




My feelings are that if you really want to do this, then the place to do
it is during data entry or data importation.


Sure, it's certainly best to do error checking as early as possible, but 
often there is still some value in doing it later as well.  Maybe the 
data users can contact the data publishers and alert them to a potential 
problem?  But like I say, I'll let others judge its usefulness.  I don't 
have a strong opinion on that.


David




peter

On 04/03/2014 03:12 PM, David Booth wrote:

First of all, my sincere apologies to Pat, Peter and the rest of the
readership for totally botching my last example, writing domain when
I meant range *and* explaining it wrong.  Sorry for all the
confusion it caused!

I was simply trying to demonstrate how a schema:domainIncludes
assertion could be useful for error checking even if it had no
formal entailments, by making selective use of the CWA.  I'll
try again.

Suppose we are given these RDF statements, in which the author
*may* have made a typo, writing ddd instead of ccc as the rdf:type
of x:

  x ppp y .                        # Triple A
  x rdf:type ddd .                 # Triple B
  ppp schema:domainIncludes ccc .  # Triple C

As given, these statements are consistent, so a reasoner
will not detect a problem.  Indeed, they may or may
not be what the author intended.  If the author later
added the statement:

  ccc owl:equivalentClass ddd .   # Triple E

then ddd probably was what the author intended
in triple B.  OTOH if the author later added:

  ccc owl:disjointWith ddd .  # Triple F

then ddd probably was not what the author intended
in triple B.

However, thus far we are only given triples {A,B,C}
above, and an error checker wishes
to check for *potential* typos by applying the rule:

  For all subgraphs of the form

{ x ppp y .
  ppp schema:domainIncludes ccc . }

  check whether

 { x rdf:type ccc . }

  is *provably* true.  If not, then fail the
  error check.  If all such subgraphs pass, then
  the error check as a whole passes.

Under the OWA, the requirement:

 { x rdf:type ccc . }

is neither provably true nor provably false given
graph {A,B,C}.  But under the CWA it is
considered false, because it is not provably true.

This is how the schema:domainIncludes can be
useful for error checking even if it has no formal
entailments: it tells the error checker which
cases to check.

I hope that now makes more sense.   Again, sorry to
have screwed up my example so badly last time, and
I hope I've got it right this time.  :)

David


On 04/02/2014 11:42 PM, Pat Hayes wrote:


On Mar 31, 2014, at 10:31 AM, David Booth da...@dbooth.org wrote:


On 03/30/2014 03:13 AM, Pat Hayes wrote:

[ , . . ]
What follows from knowing that

ppp schema:domainIncludes ccc . ?

Suppose you know this and you also know that

x ppp y .

Can you infer x rdf:type ccc? I presume not, since the domain might
include other stuff outside ccc. So, what *can* be inferred about the
relationship between x and ccc ? As far as I can see, nothing can be
inferred. If I am wrong, please enlighten me. But if I am right, what
possible utility is there in even making a schema:domainIncludes
assertion?

If inference is too strong, let me weaken my question: what
possible utility **in any way whatsoever** is provided by knowing
that schema:domainIncludes holds between ppp and ccc? What software
can do what with this, that it could not do as well without this?


I think I can answer this question quite easily, as I have seen it
come up before in discussions of logic.

...



Note that this categorization typically relies on making a closed
world assumption (CWA), which is common for an application to make
for a particular purpose -- especially error checking.


Yes, of course. If you make the CWA with the information you have, then

ppp schema:domainIncludes ccc .

has exactly the same entailments as

ppp rdfs:domain ccc .

has in RDFS without the CWA. But that, of course, begs the question.
If you are going to rely on the CWA, then (a) you are violating the
basic assumptions of all Web notations and (b) you are using a
fundamentally different semantics. And see below.

None of this has anything to do with a distinction between entailment
and error checking, by the way. Your hypothetical three-way
classification task uses the same meanings of the RDF as any other
entailment task would.



In this example, let us suppose that to pass, the object of every
predicate must be in the 

Re: Inference for error checking [was Re: How to avoid that collections break relationships]

2014-04-03 Thread David Booth

First of all, my sincere apologies to Pat, Peter and the rest of the
readership for totally botching my last example, writing domain when
I meant range *and* explaining it wrong.  Sorry for all the confusion 
it caused!


I was simply trying to demonstrate how a schema:domainIncludes
assertion could be useful for error checking even if it had no
formal entailments, by making selective use of the CWA.  I'll
try again.

Suppose we are given these RDF statements, in which the author
*may* have made a typo, writing ddd instead of ccc as the rdf:type
of x:

  x ppp y .                        # Triple A
  x rdf:type ddd .                 # Triple B
  ppp schema:domainIncludes ccc .  # Triple C

As given, these statements are consistent, so a reasoner
will not detect a problem.  Indeed, they may or may
not be what the author intended.  If the author later
added the statement:

  ccc owl:equivalentClass ddd .   # Triple E

then ddd probably was what the author intended
in triple B.  OTOH if the author later added:

  ccc owl:disjointWith ddd .  # Triple F

then ddd probably was not what the author intended
in triple B.

However, thus far we are only given triples {A,B,C}
above, and an error checker wishes
to check for *potential* typos by applying the rule:

  For all subgraphs of the form

{ x ppp y .
  ppp schema:domainIncludes ccc . }

  check whether

 { x rdf:type ccc . }

  is *provably* true.  If not, then fail the
  error check.  If all such subgraphs pass, then
  the error check as a whole passes.

Under the OWA, the requirement:

 { x rdf:type ccc . }

is neither provably true nor provably false given
graph {A,B,C}.  But under the CWA it is
considered false, because it is not provably true.

This is how the schema:domainIncludes can be
useful for error checking even if it has no formal
entailments: it tells the error checker which
cases to check.
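
A minimal sketch of the rule above, under stated assumptions: triples are plain (subject, predicate, object) tuples, no reasoner is involved, and "provably true" is approximated by "explicitly asserted in the graph". The function and variable names are hypothetical, not part of any library. The rule is applied literally, so a property with several schema:domainIncludes values would require x to be typed with each of them.

```python
# Hypothetical names; triples are (subject, predicate, object) tuples.
DOMAIN_INCLUDES = "schema:domainIncludes"
RDF_TYPE = "rdf:type"


def check_domain_includes(graph):
    """Return the (x, ppp, ccc) cases that fail the check.

    For every subgraph { x ppp y . ppp schema:domainIncludes ccc . }
    the check requires { x rdf:type ccc . } to be asserted in the graph.
    Absence is treated as failure, which is where the CWA comes in.
    """
    failures = []
    for (ppp, pred, ccc) in graph:
        if pred != DOMAIN_INCLUDES:
            continue
        for (x, p, _y) in graph:
            if p == ppp and (x, RDF_TYPE, ccc) not in graph:
                failures.append((x, ppp, ccc))
    return failures


graph_abc = {
    ("x", "ppp", "y"),                # Triple A
    ("x", RDF_TYPE, "ddd"),           # Triple B
    ("ppp", DOMAIN_INCLUDES, "ccc"),  # Triple C
}

# The check as a whole passes only if no case fails.
check_passes = not check_domain_includes(graph_abc)
```

With graph {A,B,C} the check flags x as a potential typo; adding the triple x rdf:type ccc makes it pass.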

I hope that now makes more sense.   Again, sorry to
have screwed up my example so badly last time, and
I hope I've got it right this time.  :)

David


On 04/02/2014 11:42 PM, Pat Hayes wrote:


On Mar 31, 2014, at 10:31 AM, David Booth da...@dbooth.org wrote:


On 03/30/2014 03:13 AM, Pat Hayes wrote:

[ , . . ]
What follows from knowing that

ppp schema:domainIncludes ccc . ?

Suppose you know this and you also know that

x ppp y .

Can you infer x rdf:type ccc? I presume not, since the domain might
include other stuff outside ccc. So, what *can* be inferred about the
relationship between x and ccc ? As far as I can see, nothing can be
inferred. If I am wrong, please enlighten me. But if I am right, what
possible utility is there in even making a schema:domainIncludes
assertion?

If inference is too strong, let me weaken my question: what
possible utility **in any way whatsoever** is provided by knowing
that schema:domainIncludes holds between ppp and ccc? What software
can do what with this, that it could not do as well without this?


I think I can answer this question quite easily, as I have seen it come up 
before in discussions of logic.

...



Note that this categorization typically relies on making a closed world 
assumption (CWA), which is common for an application to make for a particular 
purpose -- especially error checking.


Yes, of course. If you make the CWA with the information you have, then

ppp schema:domainIncludes ccc .

has exactly the same entailments as

ppp rdfs:domain ccc .

has in RDFS without the CWA. But that, of course, begs the question. If you are 
going to rely on the CWA, then (a) you are violating the basic assumptions of 
all Web notations and (b) you are using a fundamentally different semantics. 
And see below.

None of this has anything to do with a distinction between entailment and error 
checking, by the way. Your hypothetical three-way classification task uses the 
same meanings of the RDF as any other entailment task would.



In this example, let us suppose that to pass, the object of every predicate must be in 
the Known Domain of that predicate, where the Known Domain is the union of 
all declared schema:domainIncludes classes for that predicate.   (Note the CWA here.)

Given this error checking objective, if a system is given the facts:

  x ppp y .
  y a ccc .

then without also knowing that ppp schema:domainIncludes ccc, the system may 
not be able to determine that these statements should be considered Passed or Failed: the 
result may be Indeterminate.  But if the system is also told that

  ppp schema:domainIncludes ccc .

then it can safely categorize these statements as Passed (within the limits of 
this error checking).


Why? [ y a ccc . ] does not follow from this assertion and the x ppp y, so this 
looks like an Indeterminate to me. Even with the CWA applied to ppp, your check 
here is extremely risky. In fact, I could invoke Gricean reasoning to conclude 
that the domain of ppp **almost certainly must** include something outside ccc; 
because if not, why did whoever wrote this use 

Re: Inference for error checking [was Re: How to avoid that collections break relationships]

2014-04-02 Thread Peter F. Patel-Schneider


On 03/31/2014 01:39 PM, David Booth wrote:

On 03/31/2014 11:59 AM, Peter F. Patel-Schneider wrote:

[...]

Given this error checking objective, if a system is given the facts:

  x ppp y .
  y a ccc .

then without also knowing that ppp schema:domainIncludes ccc, the
system may not be able to determine that these statements should be
considered Passed or Failed: the result may be Indeterminate. But if
the system is also told that

  ppp schema:domainIncludes ccc .

then it can safely categorize these statements as Passed (within the
limits of this error checking).

Sure, but it can be very tricky to determine just what facts to consider
when making this determination, particularly with the upside-down nature
of schema:domainIncludes


My assumption in this example is that the application already has a set of 
assertions that it intends to work with, and it wishes to error check them.


It is quite tricky to figure out what this set of assertions should be?  For 
example, are consequences of other facts allowed?  All of them?




Thus, although schema:domainIncludes does not enable any new
entailments under the open world assumption (OWA), it *does* enable
some useful error checking inference under the closed world assumption
(CWA), by enabling a shift from Indeterminate to Passed or Failed.

The CWA actually works against you here.  Given the following triples,

x ppp y .                        # Triple A
y rdf:type ddd .                 # Triple B
ppp schema:domainIncludes ccc .  # Triple C

you are determining whether

y rdf:type ccc. # Triple E

is entailed, whether its negation is entailed, or neither.  The relevant
CWA would push these last two together, making it impossible to have a
three-way determination, which you want.


I don't think that's quite it.  The error check that I described is not the 
same as checking whether NOT(y rdf:type ccc) is entailed.  (Such a 
conclusion could be entailed if there were an owl:disjointWith assertion, 
for example.)  It is checking whether (y rdf:type KnownDomain(ppp)).  In 
other words, the CWA is not being made in testing whether (y rdf:type ccc); 
rather it is being made in computing KnownDomain(ppp).


Huh?  What is this KnownDomain construct?  Where does it come from? How is it 
computed?


The net effect of this is that the CWA is being used to distinguish between 
cases that would all be considered unknown under the OWA.


I still don't see a play for the CWA here.


David


peter




Re: Inference for error checking [was Re: How to avoid that collections break relationships]

2014-04-02 Thread Pat Hayes

On Mar 31, 2014, at 10:31 AM, David Booth da...@dbooth.org wrote:

 On 03/30/2014 03:13 AM, Pat Hayes wrote:
 [ , . . ]
  What follows from knowing that
 
 ppp schema:domainIncludes ccc . ?
 
 Suppose you know this and you also know that
 
 x ppp y .
 
 Can you infer x rdf:type ccc? I presume not, since the domain might
 include other stuff outside ccc. So, what *can* be inferred about the
 relationship between x and ccc ? As far as I can see, nothing can be
 inferred. If I am wrong, please enlighten me. But if I am right, what
 possible utility is there in even making a schema:domainIncludes
 assertion?
 
 If inference is too strong, let me weaken my question: what
 possible utility **in any way whatsoever** is provided by knowing
 that schema:domainIncludes holds between ppp and ccc? What software
 can do what with this, that it could not do as well without this?
 
 I think I can answer this question quite easily, as I have seen it come up 
 before in discussions of logic.
 
 ...

 Note that this categorization typically relies on making a closed world 
 assumption (CWA), which is common for an application to make for a particular 
 purpose -- especially error checking.

Yes, of course. If you make the CWA with the information you have, then 

ppp schema:domainIncludes ccc .

has exactly the same entailments as 

ppp rdfs:domain ccc .

has in RDFS without the CWA. But that, of course, begs the question. If you are 
going to rely on the CWA, then (a) you are violating the basic assumptions of 
all Web notations and (b) you are using a fundamentally different semantics. 
And see below.
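
For comparison, the RDFS side of this can be sketched as follows. The RDFS domain rule (rdfs2 in the RDF Semantics) takes ppp rdfs:domain ccc and x ppp y and yields x rdf:type ccc. The code below is only an illustration with hypothetical names: triples are plain tuples and the rule is applied in a single pass, not as a full RDFS closure.

```python
def apply_rdfs2(graph):
    """Apply the rdfs:domain rule once and return the enlarged graph.

    ppp rdfs:domain ccc .  x ppp y .   =>   x rdf:type ccc .
    """
    inferred = set(graph)
    domains = [(s, o) for (s, p, o) in graph if p == "rdfs:domain"]
    for (prop, cls) in domains:
        for (x, p, _y) in graph:
            if p == prop:
                inferred.add((x, "rdf:type", cls))
    return inferred


with_domain = {("ppp", "rdfs:domain", "ccc"), ("x", "ppp", "y")}
with_includes = {("ppp", "schema:domainIncludes", "ccc"), ("x", "ppp", "y")}

typed_by_domain = ("x", "rdf:type", "ccc") in apply_rdfs2(with_domain)
typed_by_includes = ("x", "rdf:type", "ccc") in apply_rdfs2(with_includes)
```

Under this rule only the rdfs:domain graph yields the type triple; the schema:domainIncludes graph is left unchanged, since RDFS assigns that property no entailments.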

None of this has anything to do with a distinction between entailment and error 
checking, by the way. Your hypothetical three-way classification task uses the 
same meanings of the RDF as any other entailment task would.

 
 In this example, let us suppose that to pass, the object of every predicate 
 must be in the Known Domain of that predicate, where the Known Domain is 
 the union of all declared schema:domainIncludes classes for that predicate.   
 (Note the CWA here.)
 
 Given this error checking objective, if a system is given the facts:
 
  x ppp y .
  y a ccc .
 
 then without also knowing that ppp schema:domainIncludes ccc, the system 
 may not be able to determine that these statements should be considered 
 Passed or Failed: the result may be Indeterminate.  But if the system is also 
 told that
 
  ppp schema:domainIncludes ccc .
 
 then it can safely categorize these statements as Passed (within the limits 
 of this error checking).

Why? [ y a ccc . ] does not follow from this assertion and the x ppp y, so this 
looks like an Indeterminate to me. Even with the CWA applied to ppp, your check 
here is extremely risky. In fact, I could invoke Gricean reasoning to conclude 
that the domain of ppp **almost certainly must** include something outside ccc; 
because if not, why did whoever wrote this use the more cautious 
schema:domainIncludes rather than the simpler and more direct rdfs:domain? 
Indeed, isn't the ubiquity of the OWA in Web reasoning the only justification 
for having a construct like schema:domainIncludes at all? Why else was it 
invented, if not to allow for further information to make the domain larger?

 Thus, although schema:domainIncludes does not enable any new entailments 
 under the open world assumption (OWA), it *does* enable some useful error 
 checking inference under the closed world assumption (CWA), by enabling a 
 shift from Indeterminate to Passed or Failed.

I would not want any important decision to rest on such an extremely flaky 
foundation as this. 

 
 If anyone is concerned that this use of the CWA violates the spirit of RDF, 
 which indeed is based on the OWA (for *very* good reason), please bear in 
 mind that almost every application makes the CWA at some point, to do its job.

Um, bullshit. But in any case, even if it were true, the important thing is to 
know when to invoke the CWA. Assuming that you know all the domain, when you 
have been told explicitly that you probably have not been told all of it, is a 
very bad heuristic for invoking the CWA. 

Pat

 
 David
 
 


IHMC                     (850)434 8903   home
40 South Alcaniz St.     (850)202 4416   office
Pensacola                (850)202 4440   fax
FL 32502                 (850)291 0667   mobile (preferred)
pha...@ihmc.us           http://www.ihmc.us/users/phayes









Inference for error checking [was Re: How to avoid that collections break relationships]

2014-03-31 Thread David Booth

On 03/30/2014 03:13 AM, Pat Hayes wrote:

[ , . . ]

 What follows from knowing that


ppp schema:domainIncludes ccc . ?

Suppose you know this and you also know that

x ppp y .

Can you infer x rdf:type ccc? I presume not, since the domain might
include other stuff outside ccc. So, what *can* be inferred about the
relationship between x and ccc ? As far as I can see, nothing can be
inferred. If I am wrong, please enlighten me. But if I am right, what
possible utility is there in even making a schema:domainIncludes
assertion?

If inference is too strong, let me weaken my question: what
possible utility **in any way whatsoever** is provided by knowing
that schema:domainIncludes holds between ppp and ccc? What software
can do what with this, that it could not do as well without this?


I think I can answer this question quite easily, as I have seen it come 
up before in discussions of logic.


Entailment produces statements that are known to be true, given a set of 
facts and entailment rules.  And indeed, adding the fact that


  ppp schema:domainIncludes ccc .

to a set of facts produces no new entailments in that sense.  But it 
*does* enable another kind of very useful machine-processable inference 
that is useful in error checking, which I'll describe.


In error checking, it is sometimes useful to classify a set of 
statements into three categories: Passed, Failed or Indeterminate. 
Passed means that the statements are fine (within the checkable limits 
anyway): sufficient information has been provided, and it is internally 
consistent.  Failed means that there is something malformed about them 
(according to the application's purpose).  Indeterminate means that the 
system does not have enough information to know whether the statements 
are okay or not: further work might need to be performed, such as manual 
examination or adding more information (facts) to the system.  Hence, it 
is *useful* to be able to quickly and automatically establish that the 
statements fall into the Passed or Failed category.


Note that this categorization typically relies on making a closed world 
assumption (CWA), which is common for an application to make for a 
particular purpose -- especially error checking.


In this example, let us suppose that to pass, the object of every 
predicate must be in the Known Domain of that predicate, where the 
Known Domain is the union of all declared schema:domainIncludes classes 
for that predicate.   (Note the CWA here.)


Given this error checking objective, if a system is given the facts:

  x ppp y .
  y a ccc .

then without also knowing that ppp schema:domainIncludes ccc, the 
system may not be able to determine that these statements should be 
considered Passed or Failed: the result may be Indeterminate.  But if 
the system is also told that


  ppp schema:domainIncludes ccc .

then it can safely categorize these statements as Passed (within the 
limits of this error checking).
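
A rough sketch of this three-way classification, with several assumptions that are not in the text above: triples are plain tuples, the Known Domain is read directly off the asserted schema:domainIncludes triples, an empty Known Domain is reported as Indeterminate, and an object with no type in a non-empty Known Domain is reported as Failed. All names are hypothetical.

```python
from enum import Enum


class Result(Enum):
    PASSED = "Passed"
    FAILED = "Failed"
    INDETERMINATE = "Indeterminate"


def known_domain(graph, predicate):
    """Union of all declared schema:domainIncludes classes for predicate."""
    return {o for (s, p, o) in graph
            if s == predicate and p == "schema:domainIncludes"}


def classify(graph, statement):
    """Classify one statement (x, predicate, y) against the Known Domain."""
    _x, predicate, y = statement
    declared = known_domain(graph, predicate)
    if not declared:
        # Nothing declared for this predicate: not enough information.
        return Result.INDETERMINATE
    types_of_y = {o for (s, p, o) in graph if s == y and p == "rdf:type"}
    # Assumption: lacking any type in the Known Domain counts as Failed.
    return Result.PASSED if types_of_y & declared else Result.FAILED


facts = {("x", "ppp", "y"), ("y", "rdf:type", "ccc")}
before = classify(facts, ("x", "ppp", "y"))

facts_with_decl = facts | {("ppp", "schema:domainIncludes", "ccc")}
after = classify(facts_with_decl, ("x", "ppp", "y"))
```

Here the result moves from Indeterminate to Passed once the schema:domainIncludes triple is added, which is the shift described above.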


Thus, although schema:domainIncludes does not enable any new entailments 
under the open world assumption (OWA), it *does* enable some useful 
error checking inference under the closed world assumption (CWA), by 
enabling a shift from Indeterminate to Passed or Failed.


If anyone is concerned that this use of the CWA violates the spirit of 
RDF, which indeed is based on the OWA (for *very* good reason), please 
bear in mind that almost every application makes the CWA at some point, 
to do its job.


David



Re: Inference for error checking [was Re: How to avoid that collections break relationships]

2014-03-31 Thread Peter F. Patel-Schneider


On 03/31/2014 08:31 AM, David Booth wrote:

On 03/30/2014 03:13 AM, Pat Hayes wrote:

[ , . . ]

 What follows from knowing that


ppp schema:domainIncludes ccc . ?

Suppose you know this and you also know that

x ppp y .

Can you infer x rdf:type ccc? I presume not, since the domain might
include other stuff outside ccc. So, what *can* be inferred about the
relationship between x and ccc ? As far as I can see, nothing can be
inferred. If I am wrong, please enlighten me. But if I am right, what
possible utility is there in even making a schema:domainIncludes
assertion?

If inference is too strong, let me weaken my question: what
possible utility **in any way whatsoever** is provided by knowing
that schema:domainIncludes holds between ppp and ccc? What software
can do what with this, that it could not do as well without this?


I think I can answer this question quite easily, as I have seen it come up 
before in discussions of logic.


Entailment produces statements that are known to be true, given a set of 
facts and entailment rules.  And indeed, adding the fact that


  ppp schema:domainIncludes ccc .

to a set of facts produces no new entailments in that sense. 


Is it then your contention that schema:domainIncludes does not add any new 
entailments under the schema.org semantics?



But it *does* enable another kind of very useful machine-processable 
inference that is useful in error checking, which I'll describe.


In error checking, it is sometimes useful to classify a set of statements 
into three categories: Passed, Failed or Indeterminate. Passed means that 
the statements are fine (within the checkable limits anyway): sufficient 
information has been provided, and it is internally consistent.  Failed 
means that there is something malformed about them (according to the 
application's purpose). Indeterminate means that the system does not have 
enough information to know whether the statements are okay or not: further 
work might need to be performed, such as manual examination or adding more 
information (facts) to the system. Hence, it is *useful* to be able to 
quickly and automatically establish that the statements fall into the Passed 
or Failed category.


Note that this categorization typically relies on making a closed world 
assumption (CWA), which is common for an application to make for a 
particular purpose -- especially error checking.


I don't see that the CWA is particularly germane here, except that most 
formalisms that do this sort of checking also utilize some sort of CWA.   
There is nothing wrong with performing this sort of analysis in formalisms 
that do not have any form of CWA.  What does cause problems with this sort of 
analysis is the presence of non-trivial inference.


In this example, let us suppose that to pass, the object of every predicate 
must be in the Known Domain of that predicate, where the Known Domain is 
the union of all declared schema:domainIncludes classes for that 
predicate.   (Note the CWA here.)


Given this error checking objective, if a system is given the facts:

  x ppp y .
  y a ccc .

then without also knowing that ppp schema:domainIncludes ccc, the system 
may not be able to determine that these statements should be considered 
Passed or Failed: the result may be Indeterminate.  But if the system is 
also told that


  ppp schema:domainIncludes ccc .

then it can safely categorize these statements as Passed (within the limits 
of this error checking).


Sure, but it can be very tricky to determine just what facts to consider when 
making this determination, particularly with the upside-down nature of 
schema:domainIncludes


Thus, although schema:domainIncludes does not enable any new entailments 
under the open world assumption (OWA), it *does* enable some useful error 
checking inference under the closed world assumption (CWA), by enabling a 
shift from Indeterminate to Passed or Failed.

The CWA actually works against you here.  Given the following triples,

x ppp y .
y rdf:type ddd .
ppp schema:domainIncludes ccc.

you are determining whether

y rdf:type ccc.

is entailed, whether its negation is entailed, or neither.  The relevant CWA 
would push these last two together, making it impossible to have a three-way 
determination, which you want.
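
The collapse described here can be illustrated with a small sketch. It is a deliberate simplification with hypothetical names: "entailed" is approximated by "explicitly asserted", and the only source of a negative conclusion considered is an owl:disjointWith assertion between the asserted type and the class being tested.

```python
def owa_status(graph, subject, cls):
    """Three-way answer: entailed, negation entailed, or neither."""
    if (subject, "rdf:type", cls) in graph:
        return "entailed"
    for (s, p, other) in graph:
        if s == subject and p == "rdf:type":
            if ((cls, "owl:disjointWith", other) in graph
                    or (other, "owl:disjointWith", cls) in graph):
                return "negation entailed"
    return "neither"


def cwa_status(graph, subject, cls):
    """Under the CWA, whatever is not entailed is taken to be false."""
    if owa_status(graph, subject, cls) == "entailed":
        return "entailed"
    return "negation entailed"


triples = {
    ("x", "ppp", "y"),
    ("y", "rdf:type", "ddd"),
    ("ppp", "schema:domainIncludes", "ccc"),
}

open_world = owa_status(triples, "y", "ccc")
closed_world = cwa_status(triples, "y", "ccc")
```

For these triples the open-world answer is "neither", while the closed-world answer is "negation entailed", so only two outcomes remain available.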




If anyone is concerned that this use of the CWA violates the spirit of RDF, 
which indeed is based on the OWA (for *very* good reason), please bear in 
mind that almost every application makes the CWA at some point, to do its job.


David


peter