Re: Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase

2008-11-19 Thread Chris Bizer

Hi Dan and all,

it looks to me as if we are trying to solve a variety of different use cases
with a single solution and thus run into problems here.

There are three separate use cases that people participating in the
discussion seem to have in mind:

1. Visualization of the data
2. Consistency checking
3. Interlinking ontologies/schemata on the Web as basis for data integration


For visualization, range and domain constraints are somewhat useful (as TimBL
said), but this usefulness is very indirect.
For instance, even simple visualizations will need to put the large number
of DBpedia properties into a proper order and ideally would also support
views on different levels of detail. Range and domain don't help much with
either, but both are covered by other technologies like Fresnel
(http://www.w3.org/2005/04/fresnel-info/manual/). So for visualization, I
think it would be more useful if we started publishing Fresnel lenses
for each class in the DBpedia ontology.

As Jens said, the domains and ranges can be used for checking instance data
against the class definitions and thus detecting inconsistencies (this usage
is not really covered by the RDFS specification, as Paul remarked, but many
people do this anyway). As Wikipedia contains a lot of inconsistencies and as
we don't want to reduce the amount of extracted information too much, we
decided to publish the loose instance dataset, which also contains property
values that might violate the constraints. I say "might" as we only know for
sure that something is a person if the Wikipedia article contains a
person-related template. If it does not, the thing could be a person or not.
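The "might violate" distinction above can be sketched roughly as follows. This is only an illustration, not the DBpedia extraction code: the function name split_by_domain, the ex: identifiers and the sample data are all assumptions made up for the example. A triple counts as "confirmed" only when the subject is explicitly typed with the property's declared domain; everything else is kept but marked "unconfirmed" rather than rejected.

```python
def split_by_domain(triples, explicit_types, domains):
    """Partition triples by whether the subject is known to be in the domain.

    triples        -- iterable of (subject, property, object) tuples
    explicit_types -- dict: resource -> set of explicitly asserted classes
    domains        -- dict: property -> declared rdfs:domain class
    """
    confirmed, unconfirmed = [], []
    for s, p, o in triples:
        required = domains.get(p)
        known = explicit_types.get(s, set())
        if required is None or required in known:
            confirmed.append((s, p, o))
        else:
            # Not an error: the subject may still belong to the class,
            # there is just no explicit evidence for it.
            unconfirmed.append((s, p, o))
    return confirmed, unconfirmed


# Hypothetical sample data (not taken from DBpedia).
domains = {"ex:birthPlace": "ex:Person"}
explicit_types = {"ex:Ada": {"ex:Person"}}
triples = [
    ("ex:Ada", "ex:birthPlace", "ex:London"),
    ("ex:Unknown_Thing", "ex:birthPlace", "ex:Paris"),
    ("ex:Ada", "ex:label", "Ada"),
]

confirmed, unconfirmed = split_by_domain(triples, explicit_types, domains)
print(confirmed)
print(unconfirmed)
```

Keeping the unconfirmed triples in a separate list, instead of dropping them, mirrors the idea of publishing loose data and letting each application decide how strict it wants to be.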

Which raises the question: Is it better for DBpedia to keep the constraints
and publish instance data that might violate these constraints or is it
better to loosen the constraints and remove the inconsistencies this way? Or
keep things as they are, knowing that range and domain statements are anyway
hardly used by existing Semantic Web applications that work with data from
the public Web? (Are there any? FalconS?)

For the third use case of interlinking ontologies/schemata on the Web in
order to integrate instance data afterwards, it could be better to remove
the domain and range statements as this prevents inconsistencies when
ontologies/schemata are interlinked. On the other hand it is likely that the
trust layers of Web data integration frameworks will ignore the domain and
range statements anyway and concentrate more on owl:sameAs, subclass and
subproperty. Again, Falcons and Sindice and SWSE teams, do you use domain
and range statements when cleaning up the data that you crawled from the
Web?

I really like Hugh's idea of having a loose schema in general and adding
additional constraints as comments/optional constraints to the schema, so
that applications can decide whether they want to use them or not. But this
is sadly not supported by the RDF standards.

So, I'm still a bit undecided about leaving or removing the ranges and
domains. Maybe leave them, as they are likely not harmful and might be
useful for some use cases?

Cheers

Chris


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On behalf of Dan Brickley
> Sent: Wednesday, 19 November 2008 14:09
> To: Pierre-Antoine Champin
> Cc: Paul Gearon; Semantic Web
> Subject: Re: Domain and range are useful Re: DBpedia 3.2 release,
> including DBpedia Ontology and RDF links to Freebase
> 
> 
> Pierre-Antoine Champin wrote:
> > Paul Gearon wrote:
> >> While I'm here, I also noticed Tim Finin referring to "domain and
> >> range constraints". Personally, I don't see the word "constraint" as
> >> an appropriate description, since rdfs:domain and rdfs:range are not
> >> constraining in any way.
> >
> > They are constraining the set of interpretations that are models of
> > your knowledge base. Namely, you constrain Fido to be a person...
> >
> > But I grant you this is not exactly what most people expect from the
> > term "constraint"... I also had to do the kind of explanations you
> > describe...
> 
> 
> Yes, exactly.
> 
> In earlier (1998ish) versions of RDFS we called them 'constraint
> resources' (with the anticipation of using that concept to flag up new
> constructs from anticipated developments like DAML+OIL and OWL). This
> didn't really work, because anything that had a solid meaning was a
> constraint in this sense, so we removed that wording.
> 
> This is a very interesting discussion, wish I had time this week to
> jump in further.
> 
> I do recommend against using RDFS/OWL to express application/dataset
> constraints, while recognising that there's a real need for recording
> them in machine-friendly form. I

Re: Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase

2008-11-18 Thread Pierre-Antoine Champin

Ian Davis wrote:
>
> On Tue, Nov 18, 2008 at 4:02 AM, Tim Berners-Lee <[EMAIL PROTECTED]> wrote:
>
>> On 2008-11-17, at 11:27, John Goodwin wrote:
>>> [...]
>>> I'd be tempted to generalise or just remove the domain/range
>>> restrictions. Any thoughts?
>>
>> There are lots of uses for range and domain.
>>
>> One is in the user interface -- if you for example link a person
>> and a document, the system
>> can prompt you for a relationship which will include "is author of"
>> and "made" but won't include foaf:knows or is issue of.
>>
>> Similarly, when making a friend, one can use autocompletion on labels
>> which the current session knows about and simplify it by for example
>> removing all documents from a list of candidate foaf:knows friends.
>
> 
> Both these use cases require some OWL to say that documents aren't
> people. I don't see these scenarios being feasible in the general case
> because you'd need a complete description of the world in OWL, i.e.
> you'd want to know about everything that can't possibly be a person.

This is technically true.
However, from a user interface point of view, it is reasonable to use
the *explicit* statements as a guiding heuristic -- although it should
be possible, with additional steps, to add a foaf:knows between any two
resources, even if one is not explicitly typed as a foaf:Person.
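
A minimal sketch of that heuristic, assuming a simplified schema: the ex: property names, the (domain, range) pairs and the function suggest_properties are invented for illustration and are not taken from FOAF or from any existing tool. Properties whose declared domain and range match the explicit types are offered first; the rest are not forbidden, only moved to a secondary list.

```python
def suggest_properties(subject_types, object_types, schema):
    """Rank properties for linking two resources in a user interface.

    schema maps property -> (declared domain, declared range).
    Returns (suggested, other): 'suggested' matches the explicit types,
    'other' stays reachable through an additional step.
    """
    suggested, other = [], []
    for prop, (domain, range_) in schema.items():
        if domain in subject_types and range_ in object_types:
            suggested.append(prop)
        else:
            other.append(prop)
    return suggested, other


# Hypothetical, simplified schema for the example.
schema = {
    "ex:isAuthorOf": ("ex:Person", "ex:Document"),
    "ex:made": ("ex:Person", "ex:Document"),
    "ex:knows": ("ex:Person", "ex:Person"),
}

# Linking an explicitly typed person to an explicitly typed document.
suggested, other = suggest_properties({"ex:Person"}, {"ex:Document"}, schema)
print(suggested)  # ['ex:isAuthorOf', 'ex:made']
print(other)      # ['ex:knows']
```

Because only explicit type statements are consulted, no complete OWL description of the world is needed; the price is that the ranking is a hint and not a guarantee.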

  pa



Re: Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase

2008-11-18 Thread Paul Gearon


On Nov 18, 2008, at 1:32 AM, Ian Davis wrote:



> On Tue, Nov 18, 2008 at 4:02 AM, Tim Berners-Lee <[EMAIL PROTECTED]> wrote:
>
>> On 2008-11-17, at 11:27, John Goodwin wrote:
>>
>>> [...]
>>> I'd be tempted to generalise or just remove the domain/range
>>> restrictions. Any thoughts?
>>
>> There are lots of uses for range and domain.
>>
>> One is in the user interface -- if you for example link a person
>> and a document, the system
>> can prompt you for a relationship which will include "is author of"
>> and "made" but won't include foaf:knows or is issue of.
>>
>> Similarly, when making a friend, one can use autocompletion on labels
>> which the current session knows about and simplify it by for example
>> removing all documents from a list of candidate foaf:knows friends.
>
> Both these use cases require some OWL to say that documents aren't
> people. I don't see these scenarios being feasible in the general
> case because you'd need a complete description of the world in OWL,
> i.e. you'd want to know about everything that can't possibly be a
> person.

But this is true for OWL in general anyway. If you really need to
tighten down your type (a la the RDBMS world) then you do need to
describe things as disjoint (either explicitly, or through other
mechanisms such as the subclass of a complement).

Using OWL still has a lot of utility, but if you absolutely require
this level of description, then perhaps a relational database would be
more appropriate for the application at hand. (The right tool for the
right job, and all that)  :-)

>> It is of course also important for checking hand-written files for
>> validity.
>
> Again, isn't validity checking something that can only be done with
> OWL? RDFS only adds information.


Agreed.

The most common example I see of this is people thinking that if I say  
that ns:name has a domain of ns:Person, then it will be an error to  
give a ns:name to a ns:Dog. Instead, it just means that Fido becomes  
both a ns:Person and a ns:Dog (and I'm sure that you, like me, have  
had to explain why a reasoner has not reported this as an error). This  
appears to be one of the reasons why many applications choose separate
URIs for predicates when applied to different types (e.g.
ns:Person/name and ns:Dog/name), thereby sidestepping many of the issues.
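
The Fido example can be reproduced with a few lines of plain Python. This is a minimal sketch of the rdfs:domain entailment rule over tuples, not a real reasoner; the function name apply_domain_rule and the string-based encoding of triples are assumptions made for illustration.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def apply_domain_rule(triples):
    """Return the triples plus every rdf:type triple entailed by rdfs:domain.

    Rule: if (p rdfs:domain C) and (s p o), then (s rdf:type C).
    """
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    inferred = set(triples)
    for s, p, o in triples:
        for prop, cls in domains:
            if p == prop:
                inferred.add((s, RDF_TYPE, cls))
    return inferred


graph = {
    ("ns:name", RDFS_DOMAIN, "ns:Person"),
    ("ns:Fido", RDF_TYPE, "ns:Dog"),
    ("ns:Fido", "ns:name", '"Fido"'),
}

closure = apply_domain_rule(graph)
fido_types = sorted(o for s, p, o in closure if s == "ns:Fido" and p == RDF_TYPE)
print(fido_types)  # ['ns:Dog', 'ns:Person']
```

Nothing is reported as an error: the rule only adds a type statement, so Fido ends up as both a ns:Dog and a ns:Person.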


While I'm here, I also noticed Tim Finin referring to "domain and  
range constraints". Personally, I don't see the word "constraint" as  
an appropriate description, since rdfs:domain and rdfs:range are not  
constraining in any way.


Regards,
Paul Gearon

Re: Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase

2008-11-18 Thread Tim Finin


This is an interesting discussion.  By coincidence, yesterday Tom
Briggs [1] defended his dissertation [2] on 'Constraint Generation and
Reasoning in OWL' which was done with Professor Yun Peng [3].  He
started with an analysis of Swoogle's data that showed that 75% of
published Semantic Web properties have neither domain nor range
constraints and evaluated algorithms for inferring them.  Rather than
focusing on instance data, he looked at what could be learned from how
the properties were used in the TBOX, e.g., for specifying role
restrictions.  He has a paper on this that he has submitted to a
conference and should finish revising his dissertation in the next few
weeks.  Here is the abstract for his defense:

 Constraint Generation and Reasoning in OWL
 Thomas H. Briggs

 The majority of OWL ontologies in the emerging Semantic Web are
 constructed from properties that lack domain and range
 constraints. Constraints in OWL are different from the familiar uses
 in programming languages and databases, and are actually type
 assertions that are made about the individuals which are connected
 by the property. These assertions can add vital information to the
 model because they are assertions of type on the individuals
 involved, and they can also give information on how the defining
 property may be used.

 Three different automated generation techniques are explored in this
 research: disjunction, least-common named subsumer, and
 vivification. Each algorithm is compared for the ability to
 generalize, and the performance impacts with respect to the
 reasoner. A large sample of ontologies from the Swoogle repository
 are used to compare real-world performance of these techniques.

 Finally, using generated facts, a type of default reasoning, may
 conflict with future assertions to the knowledge base. While general
 default reasoning is non-monotonic and undecidable a novel approach
 is introduced to support efficient retraction of the default
 knowledge. Combined, these techniques enable a robust and efficient
 generation of domain and range constraints which will result in
 inference of additional facts and improved performance for a number
 of Semantic Web applications.

[1] http://ebiquity.umbc.edu/person/html/Tom/Briggs/
[2] http://ebiquity.umbc.edu/event/html/id/273/
[3] http://ebiquity.umbc.edu/person/html/Yun/Peng/


--
Tim Finin, Computer Science & Electrical Engineering, Univ of Maryland
Baltimore County, 1000 Hilltop Cir, Baltimore MD 21250. [EMAIL PROTECTED]
http://umbc.edu/~finin 410-455-3522 fax:-3969 http://ebiquity.umbc.edu 



RE: Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase

2008-11-18 Thread John Goodwin


Ian Davis wrote:

> Again, isn't validity checking something that can only be done with
> OWL? RDFS only adds information.

I think strictly speaking both OWL and RDFS only add information, but
with OWL you can at least check that the information is logically
consistent and use that for validation in some sense. I think to use domain
and range to get validation you really need OWL disjoints as well. I
guess in theory you could check the inferred types for instances based
on the range/domain restriction, but that could be pretty tedious for a
large ABox!?
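
As a rough sketch of that combination, assuming a toy encoding (plain tuples, dictionaries for domain and range, a list of disjoint class pairs; all ex: names and both function names are hypothetical): types are collected in one pass over the instance data, then each resource is checked against the disjointness declarations. Subclass reasoning is deliberately left out.

```python
def inferred_types(triples, domains, ranges):
    """Collect asserted and domain/range-inferred types in one pass."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
        if p in domains:
            types.setdefault(s, set()).add(domains[p])
        if p in ranges:
            types.setdefault(o, set()).add(ranges[p])
    return types


def find_clashes(types, disjoint_pairs):
    """Report resources that are members of two disjoint classes."""
    clashes = []
    for resource, classes in sorted(types.items()):
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                clashes.append((resource, a, b))
    return clashes


# Hypothetical sample data.
triples = [
    ("ex:Fido", "rdf:type", "ex:Dog"),
    ("ex:Fido", "ex:name", "Fido"),
    ("ex:Alice", "ex:name", "Alice"),
]
domains = {"ex:name": "ex:Person"}
ranges = {}

types = inferred_types(triples, domains, ranges)
print(find_clashes(types, [("ex:Person", "ex:Dog")]))  # [('ex:Fido', 'ex:Person', 'ex:Dog')]
print(find_clashes(types, []))                         # []
```

Without the disjointness pair the second call finds nothing, which is the point made above: domain and range alone only add types, and it takes a disjointness statement to turn the extra type into a detectable inconsistency.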

John



Ordnance Survey
Romsey Road
Southampton SO16 4GU
Tel: 08456 050505
http://www.ordnancesurvey.co.uk




Re: Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase

2008-11-17 Thread Ian Davis
On Tue, Nov 18, 2008 at 4:02 AM, Tim Berners-Lee <[EMAIL PROTECTED]> wrote:

>
> On 2008-11-17, at 11:27, John Goodwin wrote:
>
> [...]
> I'd be tempted to generalise or just remove the domain/range
> restrictions. Any thoughts?
>
>
> There are lots of uses for range and domain.
> One is in the user interface -- if you for example link a person and a
> document, the system
> can prompt you for a relationship which will include "is author of" and
> "made" but won't include foaf:knows or is issue of.
>
> Similarly, when making a friend, one can use autocompletion on labels which
> the current session knows about and simplify it by for example removing all
> documents from a list of candidate foaf:knows friends.
>

Both these use cases require some OWL to say that documents aren't people. I
don't see these scenarios being feasible in the general case because you'd
need a complete description of the world in OWL, i.e. you'd want to know
about everything that can't possibly be a person.



>
> It is of course also important for checking hand-written files for
> validity.
>

Again, isn't validity checking something that can only be done with OWL?
RDFS only adds information.



>
> Tim BL
>

Ian


Re: Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase

2008-11-17 Thread Peter Ansell
2008/11/18 Tim Berners-Lee <[EMAIL PROTECTED]>

>
> On 2008-11-17, at 11:27, John Goodwin wrote:
>
> [...]
> I'd be tempted to generalise or just remove the domain/range
> restrictions. Any thoughts?
>
>
> There are lots of uses for range and domain.
> One is in the user interface -- if you for example link a person and a
> document, the system
> can prompt you for a relationship which will include "is author of" and
> "made" but won't include foaf:knows or is issue of.
>
> Similarly, when making a friend, one can use autocompletion on labels which
> the current session knows about and simplify it by for example removing all
> documents from a list of candidate foaf:knows friends.
>
> It is of course also important for checking hand-written files for
> validity.
>
> Tim BL
>

I think there are uses as you demonstrate, but checking (effectively)
hand-written Wikipedia extracts for validity is practically impossible. If
you assume that the DBpedia range and domain are correct, you might lose
access to something you would otherwise correctly have been able to see in
a given situation. Maybe the best idea is to leave the range and domain in
DBpedia as naive possibilities, and people who are worried that they are
not going to be correct enough for the level of risk in their application
can ignore them for the DBpedia data set.

Cheers,

Peter


Domain and range are useful Re: DBpedia 3.2 release, including DBpedia Ontology and RDF links to Freebase

2008-11-17 Thread Tim Berners-Lee


On 2008-11-17, at 11:27, John Goodwin wrote:

[...]
I'd be tempted to generalise or just remove the domain/range
restrictions. Any thoughts?



There are lots of uses for range and domain.

One is in the user interface -- if you for example link a person and
a document, the system
can prompt you for a relationship which will include "is author of"
and "made" but won't include foaf:knows or is issue of.


Similarly, when making a friend, one can use autocompletion on labels
which the current session knows about and simplify it by for example
removing all documents from a list of candidate foaf:knows friends.


It is of course also important for checking hand-written files for  
validity.


Tim BL