Re: [Wikidata-l] What is the point of properties?

2014-06-06 Thread Jean-Baptiste Pressac

Hello,
Concerning the use of owl:sameAs
(http://www.w3.org/TR/owl-ref/#IndividualIdentity): it is used in
DBpedia to link, for instance, http://dbpedia.org/page/Joseph_Hocking to
its equivalents in Freebase, Wikidata and YAGO. Going by your
remark, Markus, is this not an example to follow?


If the use of owl:sameAs
(http://www.w3.org/TR/owl-ref/#IndividualIdentity) is discouraged, what
is its purpose, and in which cases can it be used? Does this mean that
OWL lacks a proper way to interlink resources from different publishers?


By the way, is the notion of individuals an OWL concept?

Alternative ways to interlink data can also be found at
http://notes.3kbo.com/owl-sameas.
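
As background to the question above, here is a minimal sketch of why owl:sameAs is a strong claim. OWL treats it as an identity relation, so it is symmetric and transitive, and every chain of links collapses into a single individual. Only the DBpedia URL comes from this message; the Freebase, Wikidata and YAGO identifiers are placeholders, and the union-find helper is an illustration rather than a description of how any triple store works.

```python
# Sketch: owl:sameAs behaves like an equivalence relation, so linked URIs
# collapse into one identity cluster. Identifiers marked EXAMPLE are
# placeholders, not real identifiers.

def same_as_clusters(links):
    """Group URIs into identity clusters from (a, b) sameAs pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


links = [
    ("http://dbpedia.org/page/Joseph_Hocking", "freebase:EXAMPLE_ID"),
    ("http://dbpedia.org/page/Joseph_Hocking", "wikidata:EXAMPLE_QID"),
    ("yago:EXAMPLE_NAME", "wikidata:EXAMPLE_QID"),
]
clusters = same_as_clusters(links)
print(clusters)
```

Although YAGO is never linked to Freebase directly, both end up in the same cluster, which is why one wrong sameAs link propagates to every linked dataset.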


Jean-Baptiste Pressac
Traitement et analyse de bases de données
Centre de Recherche Bretonne et Celtique
UMS 3554
20 rue Duquesne
CS 93837
29238 Brest cedex 3

tel : +33 (0)2 98 01 68 95
fax : +33 (0)2 98 01 63 93



attachment: Jean-Baptiste_Pressac.vcf
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] What is the point of properties?

2014-05-30 Thread Markus Krötzsch

On 29/05/14 21:04, Andrew Gray wrote:

One other issue to bear in mind: it's *simple* to have properties as a
separate thing. I have been following this discussion with some
interest but... well, I don't think I'm particularly stupid, but most
of it is completely above my head.

Saying "here are items, here are a set of properties you can define
relating to them, here's some notes on how to use properties" is going
to get a lot more people able to contribute than if they need to start
understanding theoretical aspects of semantic relationships...


Good point. The thread has really gone off in a rather philosophical 
direction :-) As Jane said, examples (of places where a property should 
be used *and* of places where it should not be used) are definitely much 
more useful to help our editors on the ground. I usually use items I
know as role models, or look for suitable showcase items.


Markus



On 28 May 2014 09:37, Daniel Kinzler daniel.kinz...@wikimedia.de wrote:

Key differences between Properties and Items:

* Properties have a data type, items don't.
* Items have sitelinks, Properties don't.
* Items have Statements, Properties will support Claims (without sources).

The software needs these constraints/guarantees to be able to take shortcuts,
provide specialized UI and API functionality, etc.

Yes, it would be possible to use items as properties instead of having a
separate entity type. But they are structurally and functionally different, so
it makes sense to have a strict separation. This makes a lot of things easier,
e.g.:

* setting different permissions for properties
* mapping to rdf vocabularies

More fundamentally, they are semantically different: an item describes a concept
in the real world, while a property is a structural component used for such a
description.

Yes, properties are similar to data items, and in some cases, there may be an
item representing the same concept that is represented by a property entity. I
don't see why that is a problem, while I can see a lot of confusion arising from
mixing them.
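
The structural differences listed above can be sketched as two separate types. This is only an illustration under assumptions: the class and field names are hypothetical and do not reproduce the actual Wikibase data model, and the "quantity" data type for emissivity is assumed. The identifiers are the ones from David's example quoted below.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    property_id: str
    value: object
    references: list = field(default_factory=list)


@dataclass
class Item:
    id: str
    label: str
    sitelinks: dict = field(default_factory=dict)   # items have sitelinks
    statements: list = field(default_factory=list)  # items have statements


@dataclass
class Property:
    id: str
    label: str
    datatype: str                                   # properties have a data type
    claims: list = field(default_factory=list)      # claims, without sources


def add_statement(item, prop, value, references=()):
    """Attach a statement to an item; only a Property may act as predicate."""
    if not isinstance(prop, Property):
        raise TypeError(f"{prop.id} is not a property")
    statement = Statement(prop.id, value, list(references))
    item.statements.append(statement)
    return statement


cement = Item("Q45190", "cement")
emissivity_item = Item("Q899670", "emissivity")
emissivity_prop = Property("P1295", "emissivity", datatype="quantity")  # datatype assumed

add_statement(cement, emissivity_prop, 0.54)      # accepted
try:
    add_statement(cement, emissivity_item, 0.54)  # an item used as a predicate
    rejected = False
except TypeError:
    rejected = True
```

With separate types, the software can rely on every predicate having a data type, which is one of the guarantees mentioned above.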

-- daniel


On 28.05.2014 09:25, David Cuenca wrote:

Since the very beginning I have kept myself busy with properties, thinking about
which ones fit, which ones are missing to better describe reality, and how to
integrate them with the ones that we have. The thing is that the more I work
with them, the less difference I see with normal items, and if statements are
soon allowed on property pages, the difference will blur even more.
I can understand that from the software development point of view it might make
sense to have a clear difference. Or for the community to get a deeper
understanding of the underlying concepts represented by words.

But semantically I see no difference between:
cement (Q45190) emissivity (P1295) 0.54
and
cement (Q45190) emissivity (Q899670) 0.54

Am I missing something here? Are properties really needed or are we adding
unnecessary artificial constraints?

Cheers,
Micru






--
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata-l] What is the point of properties?

2014-05-30 Thread David Cuenca
On Thu, May 29, 2014 at 9:04 PM, Andrew Gray andrew.g...@dunelm.org.uk
wrote:

 One other issue to bear in mind: it's *simple* to have properties as a
 separate thing. I have been following this discussion with some
 interest but... well, I don't think I'm particularly stupid, but most
 of it is completely above my head.

 Saying "here are items, here are a set of properties you can define
 relating to them, here's some notes on how to use properties" is going
 to get a lot more people able to contribute than if they need to start
 understanding theoretical aspects of semantic relationships...


Definitely, I cannot agree more. TBH, the original question of this thread
was already settled some messages ago.
I understand that it might be confusing that we have wandered off into
other realms, so I think it is better to consider this thread
closed, and I will consider opening a new one with the right topic (which is
quite different from where it started :-P)

Cheers,
Micru


Re: [Wikidata-l] What is the point of properties?

2014-05-30 Thread David Cuenca
And to summarize the answer to the original question for future readers,
the point of properties is:
a) to help humans to better understand Wikidata
b) to help programmers (also humans :P) build the software running it
c) to make a distinction between concepts found in the world and the
concepts that have been internalized by the community

There might be more, but those are the main points that suggest that it is
better to keep properties and items separate even if their essence is the
same.

Thank you all for this learning experience :-)

Micru


Re: [Wikidata-l] What is the point of properties?

2014-05-30 Thread Lydia Pintscher
On Fri, May 30, 2014 at 10:06 AM, Andrew Gray andrew.g...@dunelm.org.uk wrote:
 Do we have an easy way of highlighting a gallery of good examples or even a
 plain wikipage of topical guidance? Would be very useful if we could say
 'here's a politician, here's a French city, etc'

https://www.wikidata.org/wiki/Wikidata:Showcase_items :)


-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a non-profit
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/681/51985.



Re: [Wikidata-l] What is the point of properties?

2014-05-29 Thread Thomas Douillard
@David:
I think you should have a look at fuzzy logic
https://www.wikidata.org/wiki/Q224821 :)


2014-05-29 1:48 GMT+02:00 David Cuenca dacu...@gmail.com:

 Markus,


 On Thu, May 29, 2014 at 12:53 AM, Markus Krötzsch 
 mar...@semantic-mediawiki.org wrote:

 This is an easy question once you have been clear about what "human
 behaviour" is. According to enwiki, it is a range of behaviours *exhibited
 by* humans.


 Settled :) Let's leave it at "defined as a trait of"


 What would anybody do with this data? In what application could it be of
 interest?


 Well, our goal is to gather the whole of human knowledge, not to use it. I
 can think of several applications, but let's leave that open. Never
 underestimate human creativity ;-)



 Moreover, as a great Icelandic ontologist once said: "There is
 definitely, definitely, definitely no logic, to human behaviour" ;-)


 Definitely, that is why we spend so much time in front of flickering
 squares making them flicker even more. It makes total sense :P



 I think constraints are already understood in this way. The name comes
 from databases, where a constraint violation is indeed a rather hard
 error. On the other hand, ironically, constraints (as a technical term) are
 often considered to be a softer form of modelling than (onto)logical
 axioms: a constraint can be violated while a logical axiom (as the name
 suggests) is always true -- if it is not backed by the given data, new data
 will be inferred. So as a technical term, "constraint" is quite appropriate
 for the mechanism we have, although it may not be the best term to clarify
 the intention.


 Ok, I will not fight traditional labels or conventions. I was interested
 in pointing out the inappropriateness of using a word inside our
 community with a definition that doesn't match its use, when there is
 another word that matches perfectly and conveys its meaning better to users.

 Some important ideas like classification (instance of/subclass of) belong
 completely to the analytical realm. We don't observe classes, we define
 them. A planet is what we call a planet, and this can change even if the
 actual lumps in space are pretty much the same.


 Agreed. Better labels could be "defined as instance of"/"defined as
 subclass of"


 Now inferences are slightly different. If we know that X implies Y, then
 if "A says X" we can infer that (implicitly) "A says Y". That is a logical
 relationship (or rule) on the level of what is claimed, rather than on the
 level of statements. Note that we still need to have a way to find out that
 "X implies Y", which is a content-level claim that should have its own
 reference somewhere. We mainly use inference in this sense with "subclass
 of" in Reasonator or when checking constraints. In this case, the
 implications are encoded as subclass-of statements ("If X is a piano, then
 X is an instrument"). This allows us to have references on the implications.
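
 The idea of implications encoded as subclass-of statements, each with its own reference, can be sketched as follows. This is an illustration only: the intermediate class "keyboard instrument" and both reference labels are made up for the example, and the function is not how Reasonator or the constraint checks are implemented.

```python
# Each subclass-of edge carries the reference that supports the implication.
# The intermediate class and the reference labels are hypothetical.
SUBCLASS_OF = {
    "piano": ("keyboard instrument", "reference A (hypothetical)"),
    "keyboard instrument": ("instrument", "reference B (hypothetical)"),
}


def inferred_classes(direct_class):
    """Follow subclass-of edges upward, collecting the references used."""
    results = []
    used_refs = []
    seen = {direct_class}
    current = direct_class
    while current in SUBCLASS_OF:
        parent, ref = SUBCLASS_OF[current]
        if parent in seen:  # guard against cycles in the data
            break
        used_refs.append(ref)
        results.append((parent, list(used_refs)))
        seen.add(parent)
        current = parent
    return results


for cls, refs in inferred_classes("piano"):
    print(f"a piano is a(n) {cls}, supported by {refs}")
```

 Every inferred class comes with the list of references that justify the chain, so the conclusion stays traceable to its sources.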


 Nope, nope, nope. I was not referring to hard implications, but to
 heuristic ones.

 Consider that these properties in the item namespace:
 defined as a trait of
 defined as having
 defined as instance of

 Would translate as these constraints in the property namespace:
 likely to be a trait of
 likely to have
 likely to be an instance of



 In general, an interesting question here is what the status of "subclass
 of" really is. Do we gather this information from external sources (surely
 there must be a book that tells us that pianos are instruments) or do we as
 a community define this for Wikidata (surely, the overall hierarchy we get
 is hardly the universal class hierarchy of the world but a very specific
 classification that is different from other classifications that may exist
 elsewhere)? Best not to think about it too much and to gather sources
 whenever we have them ;-)


 I think it is good to think about it and to consider options to deal with
 it. Like for instance:
 "defined as instance of" corresponds with item "Wikimedia community
 concept"
 We already have items that refer to concepts that only make sense for us,
 so no change in that regard.

 At the moment, hard constraints (from definitions) and soft constraints
 (expectations) are simply mixed, and maybe this is fine since we handle
 them in a similar fashion (humans need to look how to fix the situation).
 Most constraints, even those that refer to definitions, are rather soft
 anyway since we apply them to statements, not to hard facts. Hard
 constraints can only occur in cases where the *encoding* of a statement in
 Wikidata is wrong (not the intended statement as such, but how it was
 translated to data).


 As explained above, expectations inferred from definitions should not be
 treated as hard constraints, but as soft ones.

 Micru


Re: [Wikidata-l] What is the point of properties?

2014-05-29 Thread Markus Krötzsch

On 29/05/14 12:41, Thomas Douillard wrote:

@David:
I think you should have a look at fuzzy logic
https://www.wikidata.org/wiki/Q224821 :)


Or at probabilistic logic, possibilistic logic, epistemic logic, ... 
it's endless. Let's first complete the data we are sure of before we 
start to discuss whether Pluto is a planet with fuzzy degree 0.6 or 0.7 ;-)


(The problem with quantitative logics is that there is usually no 
reference for the numbers you need there, so they are not well suited 
for a secondary data collection like Wikidata that relies on other 
sources. The closest concept that still might work is probabilistic 
logic, since you can really get some probabilities from published data; 
but even there it is hard to use the probability as a raw value without 
specifying very clearly what the experiment looked like.)


Markus





Re: [Wikidata-l] What is the point of properties?

2014-05-29 Thread Thomas Douillard
hehe, maybe some kind of inference can lead to a good heuristic to suggest
properties and values in the entity suggester. As they naturally become
softer and softer by combination of uncertainties, this could also
provide some kind of limit for inferences by fixing a probability below
which we don't add a fuzzy fact to the set of facts.

Maybe we could fix a heuristic starting fuzziness or probability score
based on, e.g., one sourced claim -> big score, one disputed claim, ranks,
and so on.
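
One possible reading of the cutoff idea above, as a rough sketch: each inference step has a score, scores are combined by multiplication, and chaining stops once the combined score would fall below a fixed limit. Multiplication assumes the steps are independent, which is an assumption of this sketch and not something the thread establishes; the function name and the numbers are made up.

```python
def chain_confidence(step_scores, threshold):
    """Multiply step scores; stop before the product drops below threshold.

    Returns the combined score and the number of steps that were kept.
    Assumes independent steps (an assumption of this sketch).
    """
    score = 1.0
    kept = 0
    for step in step_scores:
        if score * step < threshold:
            break
        score *= step
        kept += 1
    return score, kept


# Four inference steps, each fairly certain on its own.
score, kept = chain_confidence([0.9, 0.9, 0.9, 0.9], threshold=0.7)
print(kept, round(score, 3))
```

Each step is individually strong, yet the fourth one is dropped because the chain as a whole has become too soft.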



Re: [Wikidata-l] What is the point of properties?

2014-05-29 Thread Markus Krötzsch

On 29/05/14 13:53, Thomas Douillard wrote:

hehe, maybe some kind of inference can lead to a good heuristic to suggest
properties and values in the entity suggester. As they naturally become
softer and softer by combination of uncertainties, this could also
provide some kind of limit for inferences by fixing a probability below
which we don't add a fuzzy fact to the set of facts.

Maybe we could fix a heuristic starting fuzziness or probability score
based on, e.g., one sourced claim -> big score, one disputed claim, ranks,
and so on.


Sorry, I have to expand on this a bit ...

My main point was that there are many fuzzy logics (depending on the
t-norm you choose) and many probabilistic logics (depending on the
stochastic assumptions you make). The meaning of a score crucially
depends on which logic you are in. Moreover, at least in fuzzy logic,
the scores are only relevant in comparison to other scores (there is no
absolute meaning to "0.3") -- therefore you need to ensure that the
scores are assigned in a globally consistent way (0.3 in Wikidata would
have to mean exactly the same wherever it is used).
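
To make the dependence on the t-norm concrete, here is a small sketch that combines the same two degrees under three standard t-norms (Gödel, product, Łukasiewicz). The input degrees 0.6 and 0.7 are arbitrary example values.

```python
# Three standard t-norms (fuzzy "and"); each one defines a different logic.

def godel(a, b):
    return min(a, b)


def product(a, b):
    return a * b


def lukasiewicz(a, b):
    return max(0.0, a + b - 1.0)


a, b = 0.6, 0.7  # arbitrary example degrees
results = {
    "Goedel": godel(a, b),
    "product": product(a, b),
    "Lukasiewicz": lukasiewicz(a, b),
}
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

The same pair of inputs yields roughly 0.6, 0.42 and 0.3, so a score means little unless everyone agrees on the logic behind it.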


This makes it extremely hard to implement such an approach in practice
in a large, distributed knowledge base like ours. What's more, you
cannot find these scores in books or newspapers, so you somehow have to
make them up in another way. You suggested using this for statements
that are not generally accepted, but how do you measure how disputed a
statement is? If two thirds of the references are for it and the rest are
against it, do you assign 0.66 as a score? It's very tricky.


Fuzzy logic has its main use in fuzzy control (the famous washing 
machine example), which is completely different and largely unrelated 
to fuzzy knowledge representation. In knowledge representation, fuzzy 
approaches are also studied, but their application is usually in a 
closed system (e.g., if you have one system that extracts data from a 
text and assigns certainties to all extracted facts in the same way). 
It's still unclear how to choose the right logic, but at least it will 
give you a uniform treatment of your data according to some fixed 
principles (whether they make sense or not).


The situation is much clearer in probabilistic logics, where you define 
your assumptions first (e.g., you assume that events are independent or 
that dependencies are captured in some specific way). This makes it more 
rigorous, but also harder to apply, since in practice these assumptions 
rarely hold. This is somewhat tolerable if you have a rather uniform 
data set (e.g., a lot of sensor measurements that give you some 
probability for actual states of the underlying system). But if you have 
a huge, open, cross-domain system like Wikidata, it would be almost
impossible to force it into a particular probability framework where
"0.3" really means "in 30% of all cases".


Also note that scientific probability is always a limit of observed
frequencies. It says: if you do something again and again, this is the
rate you will get. Often-heard statements like "We have an 80% chance to
succeed!" or "Chances are almost zero that the Earth will blow up
tomorrow!" are scientifically pointless, since you cannot repeat the
experiments that they claim to make statements about. Many things we
have in Wikidata are much more on the level of such general statements
than on the level that you normally use probability for (good example of
a proper use of probability: "based on the tests that we did so far, this
patient has a 35% chance of having cancer" -- these are not the things
we normally have in Wikidata).
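
The frequency reading of probability can be illustrated with a small simulation: a repeatable experiment with a fixed success rate is run many times, and the observed rate settles near that value. The rate 0.35 and the trial counts are arbitrary choices for this sketch, which says nothing about one-off events.

```python
import random


def observed_frequency(p, trials, seed=0):
    """Run a repeatable yes/no experiment and return the observed success rate."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(trials) if rng.random() < p)
    return hits / trials


for trials in (10, 1000, 100000):
    print(trials, observed_frequency(0.35, trials))
```

With few trials the observed rate can be far off; only with many repetitions does it approach 0.35, which is exactly what is missing for statements about unrepeatable events.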


Markus






Re: [Wikidata-l] What is the point of properties?

2014-05-29 Thread Markus Krötzsch

The other answers, under the original subject:

On 29/05/14 01:48, David Cuenca wrote:


Settled :) Let's leave it at "defined as a trait of"


I don't think it is very clear what the intention of this property is. 
What are the limits of its use? What is it meant to do? Can behaviour 
really be a trait of a species? If we allow it here, it seems to apply 
to all kinds of connections: density/car? eternity/time? time/reality? 
evil/devil? rigour/science? -- this is opening a can of worms. It will 
be hard to maintain this.


Wikiuser13 recently added "consists of: Neptune" to Q1. It was fixed.
But it is a good example of the kind of confusion that comes from such
general ontological (in the philosophical sense) properties. And
"consists of" is still very simple compared to "defined as a trait of".
Can't we focus on more obvious things like "has social network account"
for a while? ;-)


...

Some important ideas like classification (instance of/subclass of)
belong completely to the analytical realm. We don't observe classes,
we define them. A planet is what we call a planet, and this can
change even if the actual lumps in space are pretty much the same.


Agreed. Better labels could be "defined as instance of"/"defined as
subclass of"


I don't think this is better. The short names are fine. As I explained
in my email, Wikidata statements are mainly about what the external
references say. The distinction between "defined" and "observed" is not
on the surface of this. The main question is "Did the reference say that
pianos are instruments?" but not "Did the reference say pianos are
instruments because of the definition of 'piano'?" Therefore, we don't
need to put this information in our labels.




Now inferences are slightly different. If we know that X implies Y,
then if "A says X" we can infer that (implicitly) "A says Y". That
is a logical relationship (or rule) on the level of what is claimed,
rather than on the level of statements. Note that we still need to
have a way to find out that "X implies Y", which is a content-level
claim that should have its own reference somewhere. We mainly use
inference in this sense with "subclass of" in Reasonator or when
checking constraints. In this case, the implications are encoded as
subclass-of statements ("If X is a piano, then X is an instrument").
This allows us to have references on the implications.


Nope, nope, nope. I was not referring to hard implications, but to
heuristic ones.

Consider that these properties in the item namespace:
defined as a trait of
defined as having
defined as instance of

Would translate as these constraints in the property namespace:
likely to be a trait of
likely to have
likely to be an instance of


I think you might have misunderstood my email. I was arguing *in favour*
of soft constraints, but in the paragraph before the one about
inferences that you reply to here. Inferences are hard ways of
obtaining new knowledge from our own definitions. Example:


If X is the father of Y according to reference A
Then Y is the child of X according to reference A

This is as hard as it can get. We are absolutely sure of this since this 
rule just explains the relationship between two different ways we have 
for encoding family relationships.
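
The rule above can be written down as a small sketch. The tuple layout and the property labels "father of" and "child of" are chosen for this illustration only and are not actual Wikidata property names; note that the conclusion keeps the reference of the premise.

```python
def infer_child_statements(statements):
    """Apply the rule: X father of Y (per ref) => Y child of X (per same ref)."""
    inferred = []
    for subject, prop, value, reference in statements:
        if prop == "father of":
            inferred.append((value, "child of", subject, reference))
    return inferred


statements = [("X", "father of", "Y", "reference A")]
inferred = infer_child_statements(statements)
print(inferred)
```

There is no score or threshold anywhere in the rule: whoever accepts the premise and the rule gets the conclusion.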


Below, you said "expectations inferred from definitions should not be
treated as hard constraints" -- maybe this mixture of terms indicates
that I have not been clear enough about the distinction between
"inference" and "constraint". They are really completely different ways
of looking at things. Inferences are something that adds (inevitable)
conclusions to your knowledge, while constraints just tell you what to
check for. If you accept the premises of an inference and the inference
rule, then you must also accept the conclusion -- there is no soft way
of reading this. To make it soft, you can start to formalise "softness"
in your knowledge, using fuzzy logic or whatnot (see my other email with
Thomas).
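
The difference between the two can be sketched side by side: an inference returns additional statements, while a constraint check only returns a list of statements for a human to look at and changes nothing. The data, the expected class and the function names are made up for this illustration and do not describe the existing constraint reports.

```python
def apply_inference(statements):
    """Inference: add the conclusions that follow from the rule."""
    inferred = [(o, "child of", s) for s, p, o in statements if p == "father of"]
    return statements + inferred


def check_constraint(statements, instance_of, prop, expected_class):
    """Constraint: list statements whose value is not of the expected class."""
    return [
        (s, p, o)
        for s, p, o in statements
        if p == prop and instance_of.get(o) != expected_class
    ]


statements = [("X", "father of", "Y"), ("X", "father of", "Z")]
instance_of = {"Y": "human", "Z": "ship"}  # made-up data; Z looks suspicious

extended = apply_inference(statements)
violations = check_constraint(statements, instance_of, "father of", "human")
print(extended)
print(violations)
```

The inference adds a "child of" statement even for the suspicious entry, whereas the constraint merely flags that entry for review, which is what makes constraints suitable for soft expectations.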


I don't think we can use soft inferences (in the sense of fuzzy logic
et al.) but I am in favour of soft constraints (in the sense of your
"expectations"). I guess we agree on all of this, but have a bit of
trouble making ourselves clear :-) But it is rather subtle material
after all.





In general, an interesting question here is what the status of
"subclass of" really is. Do we gather this information from external
sources (surely there must be a book that tells us that pianos are
instruments) or do we as a community define this for Wikidata
(surely, the overall hierarchy we get is hardly the universal class
hierarchy of the world but a very specific classification that is
different from other classifications that may exist elsewhere)? Best
not to think about it too much and to gather sources whenever we
have them ;-)


I think it is good to think about it and to consider options to deal
with it. Like for instance:

Re: [Wikidata-l] What is the point of properties?

2014-05-29 Thread Andrew Gray
One other issue to bear in mind: it's *simple* to have properties as a
separate thing. I have been following this discussion with some
interest but... well, I don't think I'm particularly stupid, but most
of it is completely above my head.

Saying "here are items, here are a set of properties you can define
relating to them, here's some notes on how to use properties" is going
to get a lot more people able to contribute than if they need to start
understanding theoretical aspects of semantic relationships...

;-)

Andrew.

-- 
- Andrew Gray
  andrew.g...@dunelm.org.uk



Re: [Wikidata-l] What is the point of properties?

2014-05-29 Thread Thomas Douillard
Hehe, the Wikidata Game suggests it may be a little bit too complicated, and
better abstracted away by a three-button game for mass contribution :)




Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Gerard Meijssen
Hoi,
In OmegaWiki we made the choice that any defined meaning can be used as a
property. This makes OmegaWiki more like a wiki than Wikidata, where
properties have to be created by fiat. What we found is that people tend
not to abuse this, and only a limited set is used as properties.

When you do not insist on the artificial limits implicit in properties,
there will be one victim: the structure of the ontology. However, when
you analyse things, such a structure still exists; it is just no longer
formal. In a way it is similar to the early insistence on using the GND
types: they did not fit, but thankfully we kept the GND identifier. In
this way we left the structure of GND where it belonged: in GND itself.
They can map our content to their heart's content using their structure.

One final thought: when we have enough data, we can manipulate it. Because
of a lack of data we are still left with many GND types.

PS: there is nothing wrong with leaving things as they are. It works, more or
less.
Thanks,
  Gerard


On 28 May 2014 09:25, David Cuenca dacu...@gmail.com wrote:

 Since the very beginning I have kept myself busy with properties, thinking
 about which ones fit, which ones are missing to better describe reality,
 and how to integrate them into the ones that we have. The thing is that the
 more I work with them, the less difference I see with normal items, and if
 statements are soon allowed in property pages, the difference will blur
 even more.
 I can understand that from the software development point of view it might
 make sense to have a clear difference. Or for the community to get a deeper
 understanding of the underlying concepts represented by words.

 But semantically I see no difference between:
 cement (Q45190) emissivity (P1295) 0.54
 and
 cement (Q45190) emissivity (Q899670) 0.54

  Am I missing something here? Are properties really needed or are we
 adding unnecessary artificial constraints?

 Cheers,
 Micru



Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Daniel Kinzler
Key differences between Properties and Items:

* Properties have a data type, items don't.
* Items have sitelinks, Properties don't.
* Items have Statements, Properties will support Claims (without sources).

The software needs these constraints/guarantees to be able to take shortcuts,
provide specialized UI and API functionality, etc.

Yes, it would be possible to use items as properties instead of having a
separate entity type. But they are structurally and functionally different, so
it makes sense to have a strict separation. This makes a lot of things easier, 
e.g.:

* setting different permissions for properties
* mapping to rdf vocabularies

More fundamentally, they are semantically different: an item describes a concept
in the real world, while a property is a structural component used for such a
description.

Yes, properties are similar to data items, and in some cases, there may be an
item representing the same concept that is represented by a property entity. I
don't see why that is a problem, while I can see a lot of confusion arising from
mixing them.

-- daniel


Am 28.05.2014 09:25, schrieb David Cuenca:
 Since the very beginning I have kept myself busy with properties, thinking
 about which ones fit, which ones are missing to better describe reality, and
 how to integrate them into the ones that we have. The thing is that the more
 I work with them, the less difference I see with normal items, and if
 statements are soon allowed in property pages, the difference will blur even
 more.
 I can understand that from the software development point of view it might
 make sense to have a clear difference. Or for the community to get a deeper
 understanding of the underlying concepts represented by words.
 
 But semantically I see no difference between:
 cement (Q45190) emissivity (P1295) 0.54
 and
 cement (Q45190) emissivity (Q899670) 0.54
 
 Am I missing something here? Are properties really needed or are we adding
 unnecessary artificial constraints?
 
 Cheers,
 Micru
 
 
 


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Markus Krötzsch

Hi David,

Interesting remark. Let's explore this idea a bit. I will give you two 
main reasons why we have properties separate, one practical and one 
conceptual.


First the practical point. Certainly, everything that is used as a 
property needs to have a datatype, since otherwise the wiki would not 
know what kind of input UI to show. So you cannot use just any item as a 
property straight away -- it needs to have a datatype first. So, yes, 
you could abolish the namespace Property but you still would have a 
clear, crisp distinction between property items (those with datatype) 
and normal items (those without a datatype). Because of this, most of 
the other functions would work the same as before (for example, property 
autocompletion would still only show properties, not arbitrary items).
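
As a rough sketch of this point, assume a hypothetical setup without a 
separate Property namespace, where a single entity type carries an 
optional datatype. The class, field and function names below are 
invented for illustration and are not Wikibase code; the "quantity" 
datatype is likewise only an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    entity_id: str
    label: str
    datatype: Optional[str] = None  # None means "normal item"

    @property
    def is_property(self) -> bool:
        # The crisp distinction survives: only entities with a datatype
        # can be used as properties.
        return self.datatype is not None


def autocomplete_properties(entities: List[Entity], prefix: str) -> List[Entity]:
    """Suggest only entities that can act as properties."""
    return [e for e in entities if e.is_property and e.label.startswith(prefix)]


entities = [
    Entity("Q899670", "emissivity"),                     # the physics notion
    Entity("P1295", "emissivity", datatype="quantity"),  # usable as a property
    Entity("Q45190", "cement"),
]

suggestions = autocomplete_properties(entities, "emis")
```

Even in this merged model the autocompletion has to filter on the 
datatype, so the split between the two kinds of entity is still there, 
just under a different name.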


A complication with this approach is that property datatypes cannot 
change in Wikibase. This design was picked since there is no way to 
convert existing data from one datatype to another in general. So 
changing the datatype would create problems by making a lot of data 
invalid, and require special handling and special UI to handle this 
situation. With properties living in a separate namespace, this is not a 
real restriction: you can just create a new property and give it the 
same label (after naming the old one differently, e.g., putting 
DEPRECATED in its name). Then you can migrate the data in some custom 
fashion. But if properties were items, we would have a problem here: 
the item is already linked to many Wikipedias and other projects, and it 
might be used in Lua scripts, queries, or even external applications 
like Denny's JavaScript translation library. You cannot change item ids 
easily. Also, many items would not have a datatype, so the first one that 
is (accidentally?) entered would become fixed. So we would definitely need to 
rethink the whole idea of unchangeable datatypes.
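
The workaround described above (new property, same label, old one 
marked DEPRECATED) could be sketched like this. It is only an 
illustration under assumptions: the class, the registry dict and the ids 
"P1" and "P2" are hypothetical, and the real Wikibase implementation is 
not shown here.

```python
from typing import Dict


class PropertyEntity:
    """A property whose datatype is fixed at creation time."""

    def __init__(self, pid: str, label: str, datatype: str) -> None:
        self.pid = pid
        self.label = label
        self._datatype = datatype

    @property
    def datatype(self) -> str:
        # Read-only: there is no setter, so assignment raises AttributeError.
        return self._datatype


def replace_property(registry: Dict[str, PropertyEntity], old_pid: str,
                     new_pid: str, new_datatype: str) -> PropertyEntity:
    """Create a successor with the same label and mark the old one."""
    old = registry[old_pid]
    new = PropertyEntity(new_pid, old.label, new_datatype)
    old.label = "DEPRECATED " + old.label
    registry[new_pid] = new
    return new


registry = {"P1": PropertyEntity("P1", "emissivity", "string")}
successor = replace_property(registry, "P1", "P2", "quantity")

try:
    registry["P1"].datatype = "quantity"  # not allowed
    datatype_changed = True
except AttributeError:
    datatype_changed = False
```

The data stored under the old id still has to be migrated separately. 
With items this trick would not work, since the old id is what everybody 
else already links to.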


My other important reason is conceptual. Properties are not considered 
part of the (encyclopaedic) data but rather part of the schema that the 
community has picked to organise that data. As in your example, 
emissivity (Q899670) is a notion in physics as described in a 
Wikipedia article. There are many things to say about this notion (for 
example, it has a history: somebody must have defined this first -- 
although Wikipedia does not say it in this case). As in all cases, some 
statements might be disputed while others are widely acknowledged to be 
true.


For the property emissivity (P1295), the situation is quite different. 
It was introduced as an element used to enter data, similar to a row in 
a database table or an infobox template in some Wikipedia. It does 
probably closely relate to the actual physical notion Q899670, but it 
still is a different thing. For example, it was first introduced by 
User:Jakec, who is probably not the person who introduced the physical 
concept ;-) Anything that we will say about P1295 in the future refers 
to the property -- a concept of our own making, that is not described in 
any external source (there are no publications discussing P1295).


This is also the reason why properties are supposed to support *claims* 
not *statements*. That is, they will have property-value pairs and 
qualifiers, but no references or ranks. Indeed, anything we say about 
properties has the status of a definition. If we say it, it's true. 
There is no other authority on Wikidata properties. You could of course 
still have items and properties share a page and somehow define which 
statements/claims refer to which concept, but this does not seem to make 
things easier for users.
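
In data-structure terms, the difference between claims and statements 
could look roughly as follows. The class and field names are assumptions 
for illustration and do not reproduce the Wikibase data model exactly; 
the reference string and the use of P805 on a property page are made up 
as examples.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Claim:
    """Property-value pair plus qualifiers: enough for a definition."""
    property_id: str
    value: object
    qualifiers: Dict[str, object] = field(default_factory=dict)


@dataclass
class Statement:
    """A claim about the world: can be sourced and ranked."""
    claim: Claim
    references: List[str] = field(default_factory=list)
    rank: str = "normal"


# On an item: cement (Q45190) has emissivity (P1295) 0.54, with a source.
item_statement = Statement(
    claim=Claim("P1295", 0.54),
    references=["some handbook (placeholder reference)"],
)

# On a property page: a definitional claim, no references or rank needed.
property_claim = Claim("P805", "Q899670")
```

A claim on a property page has nowhere to put a reference or a rank, 
which matches the idea that whatever we say about a property is true by 
definition.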


These are, for me, the two main reasons why it makes sense to keep 
properties apart from items on a technical level. Besides this, it is 
also convenient to separate the 1000-something properties from the 
15-million something items for reasons of maintenance.


Best regards,

Markus


On 28/05/14 09:25, David Cuenca wrote:

Since the very beginning I have kept myself busy with properties,
thinking about which ones fit, which ones are missing to better describe
reality, and how to integrate them into the ones that we have. The thing
is that the more I work with them, the less difference I see with normal
items, and if statements are soon allowed in property pages, the
difference will blur even more.
I can understand that from the software development point of view it
might make sense to have a clear difference. Or for the community to get
a deeper understanding of the underlying concepts represented by words.

But semantically I see no difference between:
cement (Q45190) emissivity (P1295) 0.54
and
cement (Q45190) emissivity (Q899670) 0.54

Am I missing something here? Are properties really needed or are we
adding unnecessary artificial constraints?

Cheers,
Micru



Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Markus Krötzsch

On 28/05/14 10:37, Daniel Kinzler wrote:

Key differences between Properties and Items:

* Properties have a data type, items don't.
* Items have sitelinks, Properties don't.
* Items have Statements, Properties will support Claims (without sources).

The software needs these constraints/guarantees to be able to take shortcuts,
provide specialized UI and API functionality, etc.

Yes, it would be possible to use items as properties instead of having a
separate entity type. But they are structurally and functionally different, so
it makes sense to have a strict separation. This makes a lot of things easier, 
e.g.:

* setting different permissions for properties
* mapping to rdf vocabularies


This one point requires a tiny remark: there is no problem in OWL or RDF 
to use the same URI as a property, an individual, and a class in 
different contexts. The only thing that OWL (DL) forbids is to use one 
property for literal values (like string) and for object values (like 
other items), but this would not occur in our case anyway since we have 
clearly defined types. I completely agree with all the rest :-)
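
A small sketch of both halves of this remark, using plain Python tuples 
in place of an RDF library. The example.org URIs, the "fieldOfStudy" 
property and the helper names are hypothetical, and the check is a toy 
version of the OWL DL restriction, not a reasoner.

```python
from typing import Dict, Set, Tuple

EX = "http://example.org/"  # hypothetical namespace

# An object is tagged as either an IRI or a literal value.
Obj = Tuple[str, str]
Triple = Tuple[str, str, Obj]

triples: Set[Triple] = {
    # emissivity used as a property (predicate position)
    (EX + "cement", EX + "emissivity", ("literal", "0.54")),
    # the same URI used as an individual (subject position)
    (EX + "emissivity", EX + "fieldOfStudy", ("iri", EX + "physics")),
}


def used_as_property_and_individual(data: Set[Triple]) -> Set[str]:
    """URIs that occur both as predicate and as subject: allowed."""
    predicates = {p for (_, p, _) in data}
    subjects = {s for (s, _, _) in data}
    return predicates & subjects


def properties_with_mixed_values(data: Set[Triple]) -> Set[str]:
    """Properties used with both literal and object values: the OWL DL problem."""
    kinds: Dict[str, Set[str]] = {}
    for _, p, (kind, _) in data:
        kinds.setdefault(p, set()).add(kind)
    return {p for p, seen in kinds.items() if len(seen) > 1}


punned = used_as_property_and_individual(triples)
mixed = properties_with_mixed_values(triples)

# Adding an object-valued use of the same property triggers the check.
bad = triples | {(EX + "lime", EX + "emissivity", ("iri", EX + "unknown"))}
mixed_bad = properties_with_mixed_values(bad)
```

With one fixed datatype per property, the second check can never fire, 
which is why this restriction would not bite in our case.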


Cheers,

Markus





Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread David Cuenca
On Wed, May 28, 2014 at 10:37 AM, Daniel Kinzler 
daniel.kinz...@wikimedia.de wrote:

 More fundamentally, they are semantically different: an item describes a
 concept
 in the real world, while a property is a structural component used for
 such a
 description.


As I perceive it, a property is a normal item (concept) imbued with the
option of being used as a predicate and of taking a datatype.
There is no property that cannot be expressed as an item; even properties
that represent an identifier could be said to be a concept in the real
world.
I understand that from the software side you need to distinguish between
"basic concepts" (items) and "concepts that can be used as predicates"
(properties). From the community side we also need to scrutinize and rinse
the concepts that hide behind the words before using them as predicates,
but sometimes it is good to stop and consider what we are really doing.


 Yes, properties are similar to data items, and in some cases, there may be
 an item representing the same concept that is represented by a property
 entity.


I have not yet found a property that could not be expressed as an item.


 I don't see why that is a problem, while I can see a lot of confusion
 arising from
 mixing them.


It is not a problem now, but I considered it interesting to analyze what
the substance of the distinction is. If properties and concepts are separate,
in the end we will be reproducing their ontological structure when
organizing them. So then it might not make sense to use "subproperty of" to
organize properties, but just "corresponds to item".

Gerard, thanks for bringing the example of OmegaWiki, it is interesting
that two independent communities came to the same thoughts without any
contact between them :)

Cheers,
Micru


Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Daniel Kinzler
Am 28.05.2014 11:44, schrieb Markus Krötzsch:
 This one point requires a tiny remark: there is no problem in OWL or RDF to
 use the same URI as a property, an individual, and a class in different
 contexts. The only thing that OWL (DL) forbids is to use one property for
 literal values (like string) and for object values (like other items), but
 this would not occur in our case anyway since we have clearly defined types.
 I completely agree with all the rest :-)

Yea, I didn't mean to say that there is an issue with representing this in RDF,
but with mapping to RDF vocabularies. Having a relatively limited and stable set
of properties to map makes that a lot easier.

-- daniel


-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread David Cuenca
Markus,
The explanation about the implications of renaming/deleting makes the most
sense, and that alone already justifies the separation in two.
It is equally true that when we create a property, we might have "cleaned"
the original concept so much that it might differ (even slightly) from the
understood concept that the item represents. However, even after that
process, the new concept is still an item...

The process of imbuing a concept with permanent characteristics (adding a
datatype), together with the practical approach, also seems to recommend
keeping items and properties separate.
Thanks for showing me that reasoning :)

I am still wondering how we are going to classify properties. Maybe
it will require a broader discussion, but if they are the same (or mostly
the same) as items, then we can just link them as "same as", and build the
class structure just for the items. OTOH, if they are different, then we
will need to mirror that classification for properties, which seems quite
redundant. Plus adding a new datatype, "property".

All in all, my conclusion about this is that properties are just concepts
with special qualities that justify the separation in the software (even if
in real life there is no separation).

Many thanks for your detailed answer, and sorry if I'm bringing up already
discussed topics. It is just that when you stare long into Wikidata,
Wikidata stares back into you ;)

Cheers,
Micru



Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Markus Krötzsch

David,

Regarding the question of how to classify properties and how to relate 
them to items:


* "same as" (in the sense of owl:sameAs) is not the right concept here. 
In fact, its use on the Web has often been discouraged, since it 
has very strong implications: it means that in all uses of the one 
identifier, one could just as well use the other identifier, and that it 
is indistinguishable whether something has been said about the one or the 
other. That seems too strong here, at least for most cases.


* In the world of OWL DL, sameAs specifically refers to individuals, not 
to classes or properties. Saying "P sameAs Q" does not imply that P and 
Q have the same extension as properties. For the latter, OWL has the 
relationship owl:equivalentProperty. This distinction of instance 
level and schema level is similar to the distinction we have between 
"instance of" and "subclass of".


* Therefore, I would suggest using a property called "subproperty of" 
as one way of relating properties (analogously to "subclass of"). It has 
to be checked whether this actually occurs in Wikidata (do we have any 
properties that would be in this relation, or do we make it a modelling 
principle to have only the most specific properties in Wikidata?).


* The relationship from properties to items could be modelled with the 
existing property "subject of" (P805).


* It might be useful to also have a taxonomic classification of 
properties. For example, we already group properties into properties for 
"people", "organisations", etc. Such information could also be added 
with a specific property (this would be a bit more like a category 
system on property pages). On the other hand, some of this might 
coincide with constraint information that could be expressed as claims. 
For instance, "person properties" might be those with a "Type" (i.e., 
rdfs:domain) constraint of "human". By the way, our constraint system could 
use some systematisation -- there are many overlaps in what you can do 
with one constraint or another.
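
To illustrate why "same as" is so much stronger than "subproperty of", 
here is a toy sketch over plain triples. The two closure functions and 
the property names ("introduced by", "described in", "mother", "parent") 
are assumptions made up for this example; a real OWL reasoner does much 
more than this.

```python
from itertools import product
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def same_as_closure(triples: Set[Triple], a: str, b: str) -> Set[Triple]:
    """sameAs reading: wherever a occurs, b could be used too, and vice versa."""
    alias = {a: (a, b), b: (a, b)}
    closed: Set[Triple] = set()
    for triple in triples:
        options = [alias.get(term, (term,)) for term in triple]
        closed.update(product(*options))
    return closed


def subproperty_closure(triples: Set[Triple], sub: str, sup: str) -> Set[Triple]:
    """subproperty reading: every sub-statement also holds for sup, not the reverse."""
    return triples | {(s, sup, o) for (s, p, o) in triples if p == sub}


about_emissivity = {
    ("P1295", "introduced by", "User:Jakec"),
    ("Q899670", "described in", "Wikipedia"),
}
merged = same_as_closure(about_emissivity, "P1295", "Q899670")

family = {
    ("Alice", "mother", "Carol"),
    ("Dave", "parent", "Erin"),
}
with_parents = subproperty_closure(family, "mother", "parent")
```

After the sameAs closure, the physical notion Q899670 is suddenly 
"introduced by" a Wikidata user, which is the kind of unwanted 
conclusion that makes sameAs too strong here. The subproperty closure 
only adds the weaker "parent" statement and never turns a parent into a 
mother.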


Cheers,

Markus


Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Lydia Pintscher
On Wed, May 28, 2014 at 2:48 PM, Markus Krötzsch
mar...@semantic-mediawiki.org wrote:
 David,

 Regarding the question of how to classify properties and how to relate them
 to items:

 * "same as" (in the sense of owl:sameAs) is not the right concept here. In
 fact, it has often been discouraged to use this on the Web, since it has
 very strong implications: it means that in all uses of the one identifier,
 one could just as well use the other identifier, and that it is
 indistinguishable if something has been said about the one or the other.
 That seems too strong here, at least for most cases.

 * In the world of OWL DL, sameAs specifically refers to individuals, not to
 classes or properties. Saying "P sameAs Q" does not imply that P and Q have
 the same extension as properties. For the latter, OWL has the relationship
 owl:equivalentProperty. This distinction of instance level and schema
 level is similar to the distinction we have between "instance of" and
 "subclass of".

 * Therefore, I would suggest using a property called "subproperty of" as
 one way of relating properties (analogously to "subclass of"). It has to be
 checked if this actually occurs in Wikidata (do we have any properties that
 would be in this relation, or do we make it a modelling principle to have
 only the most specific properties in Wikidata?).

 * The relationship from properties to items could be modelled with the
 existing property "subject of" (P805).

 * It might be useful to also have a taxonomic classification of properties.
 For example, we already group properties into properties for people,
 organisations, etc. Such information could also be added with a specific
 property (this would be a bit more like a category system on property
 pages).

Yes. That's the way forward for now.

 On the other hand, some of this might coincide with constraint
 information that could be expressed as claims. For instance, person
 properties might be those with Type (i.e., rdfs:domain) constraint
 human. By the way, our constraint system could use some systematisation --
 there are many overlaps in what you can do with one constraint or another.

I hope to have a team of students work on improving constraints
reports and everything around it later in the year. It'll depend on if
they pick this project though.
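
The difference between owl:sameAs and owl:equivalentProperty discussed in the quoted points can be illustrated with a toy triple store. This is a sketch under stated assumptions, not an OWL reasoner: all "ex:" identifiers and the helper functions are hypothetical. It shows why sameAs is such a strong statement (everything said about one individual is also said about the other), while property equivalence is a schema-level claim about two properties having the same extension.

```python
def apply_same_as(triples, a, b):
    """Treat individuals a and b as interchangeable in subject and object position."""
    def variants(term):
        return (a, b) if term in (a, b) else (term,)
    return {(s2, p, o2)
            for (s, p, o) in triples
            for s2 in variants(s)
            for o2 in variants(o)}


def extension(triples, prop):
    """All (subject, object) pairs connected by the given property."""
    return {(s, o) for (s, p, o) in triples if p == prop}


def have_same_extension(triples, p, q):
    """Schema-level check: do properties p and q relate exactly the same pairs?"""
    return extension(triples, p) == extension(triples, q)


# Instance level: two hypothetical identifiers for the same person.
triples = {
    ("ex:person_in_dataset_1", "ex:occupation", "ex:writer"),
    ("ex:person_in_dataset_2", "ex:birthPlace", "ex:some_town"),
}
merged = apply_same_as(triples, "ex:person_in_dataset_1", "ex:person_in_dataset_2")
# Every statement now holds for both identifiers; the original source is no longer visible.
print(len(merged))

# Schema level: two hypothetical properties that do not (yet) have the same extension.
schema_triples = {
    ("ex:a", "ex:occupation", "ex:writer"),
    ("ex:a", "ex:profession", "ex:writer"),
    ("ex:b", "ex:occupation", "ex:pianist"),
}
print(have_same_extension(schema_triples, "ex:occupation", "ex:profession"))
```

After the sameAs merge it is indistinguishable whether something was said about the one identifier or the other, which is exactly the implication that makes it too strong for most cases.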


Cheers
Lydia

-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de



Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread David Cuenca
Markus,

OK, now I understand that "same as" wouldn't be a good name because of the
confusion it would cause. However, the property "subject of" as it is now
wouldn't be a good candidate either. Its meaning is that a certain
statement is represented by another item (that is why it is only allowed to
be used as a qualifier).

Perhaps a better name would be "corresponds with item" and the inverse
"corresponds with property". Just by having these connections, a lot of
information can be inferred from the connected item.

Consider the following example with "occupation" (P106) and "occupation"
(Q13516667):
- I cannot find any clear "subproperty of" for P106, but there is a clear
"subclass of: human behaviour" for the item
- "human behaviour" is "part of" "human"
- "human" can have a statement "intrinsic property" (property proposal
still under discussion) with values "birthday" (Q47223) and an (eventual)
"date of death". It can be expanded in the future to include newly created
properties like height, weight, eye color, etc.
- "birthday" (Q47223) "corresponds with property" "date of birth" (P569)

Out of this I reach the following conclusions:
- the taxonomy of properties is going to be weak, since there is not always
a clear subpropertyOf unless created artificially (more work)
- the standard taxonomy of items (subclass of/part of) is sufficient
to automatically reach meaningful constraints and inference (less work)
- by adding manually the constraints to the property itself we are
duplicating information which will require volunteer effort to maintain
(more work)

My recommendation is to rely mainly on the main taxonomy instead of
creating a parallel property taxonomy, and then think of ways to extract
information from the main taxonomy to convert it automatically into
constraints.
All maintenance takes effort, so the more it can be automated, the more
efficient volunteers will be. And if we can simplify the maintenance of
properties, we will be able to simplify the creation of properties too,
especially when we face the next surge, which will come with the datatype
"number with units".

Cheers,
Micru



On Wed, May 28, 2014 at 2:48 PM, Markus Krötzsch 
mar...@semantic-mediawiki.org wrote:

 David,

 Regarding the question of how to classify properties and how to relate
 them to items:

 * same as (in the sense of owl:sameAs) is not the right concept here. In
 fact, it has often been discouraged to use this on the Web, since it has
 very strong implications: it means that in all uses of the one identifier,
 one could just as well use the other identifier, and that it is
 indistinguishable if something has been said about the one or the other.
 That seems too strong here, at least for most cases.

 * In the world of OWL DL, sameAs specifically refers to individuals, not
 to classes or properties. Saying P sameAs Q does not imply that P and Q
 have the same extension as properties. For the latter, OWL has the
 relationship owl:equivalentProperty. This distinction of instance level
 and schema level is similar to the distinction we have between instance
 of and subclass of.

 * Therefore, I would suggest to use a property called subproperty of as
 one way of relating properties (analogously to subclass of). It has to be
 checked if this actually occurs in Wikidata (do we have any properties that
 would be in this relation, or do we make it a modelling principle to have
 only the most specific properties in Wikidata?).

 * The relationship from properties to items could be modelled with the
 existing property subject of (P805).

 * It might be useful to also have a taxonomic classification of
 properties. For example, we already group properties into properties for
 people, organisations, etc. Such information could also be added with a
 specific property (this would be a bit more like a category system on
 property pages). On the other hand, some of this might coincide with
 constraint information that could be expressed as claims. For instance,
 person properties might be those with Type (i.e., rdfs:domain)
 constraint human. By the way, our constraint system could use some
 systematisation -- there are many overlaps in what you can do with one
 constraint or another.

 Cheers,

 Markus


 On 28/05/14 12:14, David Cuenca wrote:

 Markus,
 The explanation about the implications of renaming/deleting makes most
 sense and just that justifies already the separation in two.
 It is equally true that when we create a property, we might have
 cleaned the original concept so much that it might differ (even
 slightly) with the understood concept that the item represents. However,
 even after that process, the new concept is still an item...

 The process of imbuing a concept with permanent characteristics (adding
 a datatype) and the practical approach, also seems to recommend keeping
 items and properties separate.
 Thanks for showing me that reasoning :)

 I am still wondering about how are we going to classify properties.
 Maybe it will require a broader 

Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Markus Krötzsch

David,

On 28/05/14 16:35, David Cuenca wrote:

Markus,

OK, now I understand that "same as" wouldn't be a good name because of the
confusion it would cause. However, the property "subject of" as it is now
wouldn't be a good candidate either. Its meaning is that a certain
statement is represented by another item (that is why it is only allowed
to be used as a qualifier).


Ok.



Perhaps a better name would be "corresponds with item" and the inverse
"corresponds with property". Just by having these connections, a lot of
information can be inferred from the connected item.

Consider the following example with "occupation" (P106) and "occupation"
(Q13516667):
- I cannot find any clear "subproperty of" for P106, but there is a
clear "subclass of: human behaviour" for the item
- "human behaviour" is "part of" "human"


I don't understand this use of "part of". Maybe I would say "having an
occupation is part of being human" but not that "occupation is part of
human". I would not use either of these and would restrict "part of" to
clear, undisputed statements like "the steering wheel is part of the car".
Otherwise, anything could be part of human ("head"?, "sadness"?,
"singing"?, "birth"? -- entering this in Wikidata would not lead anywhere).


"Part of" is quite problematic in general. You can see from the
discussion on its property page, and also from the uses it sees in the
wiki, that this property is severely misunderstood and/or misused. At
the very least, one should distinguish "physical part of" from "meronym"
(both are aliases of the property now!). And then one should realise
that meronyms are in the domain of Wiktionary, which we cannot capture
in Wikidata properly since we do not have items for words but for
concepts. One alias for an item might be a meronym of something else,
while another alias for the same item is not. Using statements for
linguistic properties in Wikidata will not be successful. I am not
saying that Wikibase is not able to capture some ideas of a thesaurus
(we have actually discussed this), but this is not how it is used in
Wikidata.



- human can have a statement intrinsic property (property proposal
still under discussion) with values birthday (Q47223) and an
(eventual) date of death. It can be expanded in the future to include
newly created properties like height, weight, eye color, etc


Yes, this again makes sense to me. It is basically a variant of the
constraint "Item", which allows you to say that items that are "instance
of" "human" should also have a birthday. But again, this is schematic
information (like constraints) and it should not be mixed up with actual
data. It is the same conceptual difference that I have explained for 
properties vs. items earlier. Moreover, I think this information (even 
if correct in some sense) has very little utility as a piece of 
information about an item; it is much more useful for constraints about 
properties (which are not items).



- birthday (Q47223) corresponds with property date of birth (P569)


It should be the other way around: the correspondence says something 
about P569, not about Q47223. There cannot be any reference for this. It 
should therefore be a claim on the page of P569 rather than a statement 
on the page of Q47223.




Out of this I reach the following conclusions:
- the taxonomy of properties is going to be weak, since there is not
always a clear subpropertyOf unless created artificially (more work)


I agree.


- the standard taxonomy of items (subclass of/part of) is sufficient
to automatically reach meaningful constraints and inference (less work)


I agree that the taxonomy will be helpful in constraints. This is what 
constraints already do when using instance of/subclass of. However, I do 
not agree that the constraints can or should be stated as part of this 
taxonomy. Constraints are too complex, and they are conceptually 
different (they say how a property should be used, not how something in 
the Real World relates to something else). Constraints interact nicely 
with the taxonomy and help to get useful conclusions, but they are not 
part of taxonomy ;-). We must keep content organisation separate from 
content.
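
How a constraint can use the taxonomy without being part of it can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Wikidata constraint checker: the item names, the hypothetical property "P_tuning" and the helper functions are invented for the example. The taxonomy (instance of / subclass of) is kept as content, while the required type is stored per property.

```python
def superclasses(cls, subclass_of):
    """All classes reachable from cls via 'subclass of', including cls itself."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def type_constraint_violations(statements, instance_of, subclass_of, required_type):
    """Return the (item, property) pairs whose item lacks the type the property expects."""
    violations = []
    for item, prop in statements:
        required = required_type.get(prop)
        if required is None:
            continue  # no type constraint declared for this property
        classes = set()
        for cls in instance_of.get(item, ()):
            classes |= superclasses(cls, subclass_of)
        if required not in classes:
            violations.append((item, prop))
    return violations


# Content: the item taxonomy (hypothetical items and classes).
instance_of = {"item_1": ["human"], "item_2": ["grand piano"]}
subclass_of = {"grand piano": ["piano"], "piano": ["instrument"]}

# Content organisation: constraints live with the property, not with the items.
required_type = {"P569": "human", "P_tuning": "instrument"}  # "P_tuning" is hypothetical

statements = [("item_1", "P569"), ("item_2", "P569"), ("item_2", "P_tuning")]
print(type_constraint_violations(statements, instance_of, subclass_of, required_type))
```

The check walks the subclass hierarchy to accept a grand piano as an instrument, yet the rule "this property expects an instrument" never appears in the taxonomy itself.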



- by adding manually the constraints to the property itself we are
duplicating information which will require volunteer effort to maintain
(more work)


I disagree. Constraints refer to the property, not to the Wikidata item, 
and it would be conceptually wrong to mix these things up. We already 
have agreed that properties and items need to remain distinct for 
technical reasons. Once this is clear, there is no reason to move 
information that refers to properties (constraints) to item pages. This 
will not be a duplication of information: it is enough to have the 
constraints on the property pages only. If you look at the constraints 
we have, you can see many examples that are specific to Wikidata and 
certainly not a general thing about the concept (take the allowed 
values for sex or gender). We really want to keep editorial 

Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread David Cuenca
Markus,

I share your dissatisfaction with "part of" because that language construct
hides many different conceptual relationships that should be cleared up. I
think we'll have some community discussion work to do in that regard. One
of the uses is: what is the relationship between a human and his behavior?
I would say that the "human" has been "defined as having" "human behavior"
(or the reverse). But if you have a better suggestion to express this
concept I would be really glad to hear it.

Now that you mention it, yes, I agree that only a property called
"corresponds with item" makes sense in this context, but not the inverse.

I would like to make a further distinction regarding constraints. The
nature of constraints is not to set arbitrary limits but to reflect
patterns that naturally appear in concepts. In that regard, I hate the word
"constraint", because it means that we are placing a straitjacket on
reality, when it is the other way round: recurring patterns in the real
world make us expect that a value will fall within the bounds of our
expectations.
I think that we should seriously consider using the term "expectation" from
now on, because we don't "constrain" the values per se; we "expect" them to
have a value, and when the value departs from the expected one, it
sets off an alarm that might or might not reflect an error.

Having made that distinction, yes, you are right: considering that we are
separating properties and items, our expectations do not belong to the data
itself, they belong to the property.

However, I would like to bring the conversation to a deeper level.
What is it that makes the concept of "addition" (Q32043) what it is? What
is it in "physical object" (Q223557) that we, sentient beings, can perceive
and agree to treat as a concept? I mention those two because one is purely
abstract, and the other one is purely physical. And I would say that
"addition" (Q32043) has been "defined as having" "associativity" (Q177251)
and "physical object" (Q223557) has been "repeatedly observed to have"
"density" (Q29539). We can argue whether the second is an expectation or
not, but the first is definitely not: someone defined addition like
that, and this information can be sourced. Even more, we could also say that
"physical object" (Q223557) has been "defined as having" "density"
(Q29539), and I guess we could find sources for that statement too.

With all this I want to make the point that there are two sources of
expectations:
- from our experience seeing repetitions and patterns in the values
(male/female/etc., between 10 and 50), which belong to the property
- from the agreed definition of the concept itself, which belong to the data

Cheers,
Micru

PS: this is a re-post because my previous message was bounced back for
being too long :)


Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Thomas Douillard
Hi, as for behavior, I would say a behavior may be linked to a
psychological trait.
I'd say a behavior is defined by the person performing many acts that belong
to a typical class of events.

Someone is said to be aggressive if he typically acts in a hostile way in
many situations. I remember a theory about that:
https://en.wikipedia.org/wiki/Trait_theory :)





Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread Markus Krötzsch

David,


One of the uses is: what is the relationship between a
human and his behavior?


This is an easy question once you have been clear about what "human
behaviour" is. According to enwiki, it is a range of behaviours
*exhibited by* humans. The bigger question for me is whether it is
useful to record this relationship ("exhibited by") in Wikidata. What
would anybody do with this data? In what application could it be of
interest?


Moreover, as a great Icelandic ontologist once said: There is 
definitely, definitely, definitely no logic, to human behaviour ;-)



In that regard, I hate the
word "constraint", because it means that we are placing a straitjacket
on reality, when it is the other way round: recurring patterns in the
real world make us expect that a value will fall within the bounds of
our expectations.


I think constraints are already understood in this way. The name comes 
from databases, where a constraint violation is indeed a rather hard 
error. On the other hand, ironically, constraints (as a technical term) 
are often considered to be a softer form of modelling than (onto)logical 
axioms: a constraint can be violated while a logical axiom (as the name 
suggests) is always true -- if it is not backed by the given data, new 
data will be inferred. So as a technical term, constraint is quite 
appropriate for the mechanism we have, although it may not be the best 
term to clarify the intention.




However, I would like to bring the conversation to a deeper level.

...


With all this I want to make the point that there are two sources of
expectations:
- from our experience seeing repetitions and patterns in the values
(male/female/etc between 10 and 50), which belong to the property
- from the agreed definition of the concept itself, which belong to the data


Yes. I agree with this as a basic dichotomy of things we may want to 
record in Wikidata. Some things are true by definition, while others are 
just very likely by observation. The exact population of Paris we will 
never know, but we are completely sure that a piano is an instrument. 
(Maybe somebody with a better philosophical background than me could 
give a better perspective of these notions -- analytical vs. 
empirical come to mind, but I am sure there is more.)


Some important ideas like classification (instance of/subclass of) 
belong completely to the analytical realm. We don't observe classes, we 
define them. A planet is what we call a planet, and this can change 
even if the actual lumps in space are pretty much the same.


However, there is yet a deeper level here (you asked for it ;-).
Wikidata is not about facts but about statements with references. We do
not record "Pluto was a planet until 2006" but "Pluto was a planet until
2006 *according to the IAU*". Likewise, we don't say "Berlin has 3
million inhabitants" but "Berlin has 3 million inhabitants *according to
the Amt fuer Statistik Berlin-Brandenburg*". If you compare these two
statements, you can see that they are both empirical, based on our
observation of a particular reference. We do not have analytical
knowledge of what the IAU or the Amt fuer Statistik might say. So in
this sense constraints can only ever be rough guidelines. It does not
make logical sense to say "if source A says X then source B must say Y"
-- even if we know that X implies Y (maybe by definition), we don't know
what sources A and B say. All we can do with constraints is to uncover
possible contradictions between sources, which might then be looked into.
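
The idea that constraints on referenced statements can only flag possible contradictions between sources can be sketched like this. It is a toy illustration under stated assumptions: the statement tuples, the source names and the population figures are invented, and possible_contradictions is a hypothetical helper, not an existing Wikidata tool.

```python
from collections import defaultdict


def possible_contradictions(statements):
    """Group referenced statements by (subject, property) and keep the groups
    in which the sources do not all report the same value."""
    claims_by_key = defaultdict(set)
    for subject, prop, value, source in statements:
        claims_by_key[(subject, prop)].add((source, value))
    return {
        key: sorted(claims)
        for key, claims in claims_by_key.items()
        if len({value for _, value in claims}) > 1
    }


# Each statement is "value according to source"; all data below is made up.
statements = [
    ("city_1", "population", 3_000_000, "source A"),
    ("city_1", "population", 3_500_000, "source B"),
    ("city_2", "population", 100_000, "source A"),
    ("city_2", "population", 100_000, "source B"),
]

for key, claims in possible_contradictions(statements).items():
    # A flagged group is a hint for a human to look into, not a proven error.
    print(key, claims)
```

Nothing is rejected or corrected here; the result is just a list of places where two references disagree and somebody may want to take a look.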


Now inferences are slightly different. If we know that X implies Y, then 
if A says X we can infer that (implicitly) A says Y. That is a 
logical relationship (or rule) on the level of what is claimed, rather 
than on the level of statements. Note that we still need to have a way 
to find out that X implies Y, which is a content-level claim that 
should have its own reference somewhere. We mainly use inference in this 
sense with subclass of in reasonator or when checking constraints. In 
this case, the implications are encoded as subclass-of statements (If X 
is a piano, then X is an instrument). This allows us to have references 
on the implications.
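
The inference pattern just described ("if A says X and X implies Y, then A implicitly says Y") can be sketched as below. This is a simplified model under stated assumptions, not what Reasonator or the constraint reports actually run: the claim and rule tuples, the source and reference names and the function infer_instance_of are all hypothetical. Each implication is a "subclass of" statement that carries its own reference, and that reference is recorded for every derived claim.

```python
def infer_instance_of(claims, subclass_rules):
    """claims: (source, item, cls) triples, read as 'source says item is a cls'.
    subclass_rules: (sub, sup, rule_reference) triples, read as
    'every sub is a sup, according to rule_reference'.
    Returns the derived claims, each with the references of the rules used."""
    seen = set(claims)
    inferred = []
    frontier = [(claim, ()) for claim in claims]
    while frontier:
        (source, item, cls), refs = frontier.pop()
        for sub, sup, rule_ref in subclass_rules:
            derived = (source, item, sup)
            if sub == cls and derived not in seen:
                seen.add(derived)
                entry = (derived, refs + (rule_ref,))
                inferred.append(entry)
                frontier.append(entry)
    return inferred


claims = [("source A", "item_x", "piano")]
subclass_rules = [
    ("piano", "keyboard instrument", "reference R1"),
    ("keyboard instrument", "instrument", "reference R2"),
]

for (source, item, cls), refs in infer_instance_of(claims, subclass_rules):
    # The derived claim stays attributed to the original source, plus the rule references.
    print(f"{source} implicitly says {item} is a {cls} (via {', '.join(refs)})")
```

The derived claims never become bare facts: they remain "what source A says", qualified by the references of the subclass-of statements that were needed to reach them.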


In general, an interesting question here is what the status of subclass 
of really is. Do we gather this information from external sources 
(surely there must be a book that tells us that pianos are instruments) 
or do we as a community define this for Wikidata (surely, the overall 
hierarchy we get is hardly the universal class hierarchy of the world 
but a very specific classification that is different from other 
classifications that may exist elsewhere)? Best not to think about it 
too much and to gather sources whenever we have them ;-)


Besides these two notions (constraints to uncover inconsistent 
references, and logical axioms to derive new statements from given 
ones), there is also a third type of constraint that is purely 
analytical. If we *define* that our 

Re: [Wikidata-l] What is the point of properties?

2014-05-28 Thread David Cuenca
Markus,


On Thu, May 29, 2014 at 12:53 AM, Markus Krötzsch 
mar...@semantic-mediawiki.org wrote:

 This is an easy question once you have been clear about what human
 behaviour is. According to enwiki, it is a range of behaviours *exhibited
 by* humans.


Settled :) Let's leave it at "defined as a trait of"


 What would anybody do with this data? In what application could it be of
 interest?


Well, our goal is to gather the whole of human knowledge, not to use it. I can
think of several applications, but let's leave that open. Never
underestimate human creativity ;-)



 Moreover, as a great Icelandic ontologist once said: There is definitely,
 definitely, definitely no logic, to human behaviour ;-)


Definitely, that is why we spend so much time in front of flickering
squares making them flicker even more. It makes total sense :P



 I think constraints are already understood in this way. The name comes
 from databases, where a constraint violation is indeed a rather hard
 error. On the other hand, ironically, constraints (as a technical term) are
 often considered to be a softer form of modelling than (onto)logical
 axioms: a constraint can be violated while a logical axiom (as the name
 suggests) is always true -- if it is not backed by the given data, new data
 will be inferred. So as a technical term, constraint is quite appropriate
 for the mechanism we have, although it may not be the best term to clarify
 the intention.


Ok, I will not fight traditional labels or conventions. I was interested
in pointing out the inappropriateness of using a word inside our
community with a definition that doesn't match its use, when there is
another word that matches perfectly and conveys its meaning better to users.

Some important ideas like classification (instance of/subclass of) belong
 completely to the analytical realm. We don't observe classes, we define
 them. A planet is what we call a planet, and this can change even if the
 actual lumps in space are pretty much the same.


Agreed. Better labels could be "defined as instance of"/"defined as
subclass of"


 Now inferences are slightly different. If we know that X implies Y, then
 if A says X we can infer that (implicitly) A says Y. That is a logical
 relationship (or rule) on the level of what is claimed, rather than on the
 level of statements. Note that we still need to have a way to find out that
 X implies Y, which is a content-level claim that should have its own
 reference somewhere. We mainly use inference in this sense with subclass
 of in reasonator or when checking constraints. In this case, the
 implications are encoded as subclass-of statements (If X is a piano, then
 X is an instrument). This allows us to have references on the implications.


Nope, nope, nope. I was not referring to hard implications, but to
heuristic ones.

Consider that these properties in the item namespace:
defined as a trait of
defined as having
defined as instance of

Would translate as these constraints in the property namespace:
likely to be a trait of
likely to have
likely to be an instance of



 In general, an interesting question here is what the status of subclass
 of really is. Do we gather this information from external sources (surely
 there must be a book that tells us that pianos are instruments) or do we as
 a community define this for Wikidata (surely, the overall hierarchy we get
 is hardly the universal class hierarchy of the world but a very specific
 classification that is different from other classifications that may exist
 elsewhere)? Best not to think about it too much and to gather sources
 whenever we have them ;-)


I think it is good to think about it and to consider options to deal with
it. Like for instance:
defined as instance of corresponds with item Wikimedia community
concept
We already have items that refer to concepts that only make sense for us,
so no change in that regard.

At the moment, hard constraints (from definitions) and soft constraints
 (expectations) are simply mixed, and maybe this is fine since we handle
 them in a similar fashion (humans need to look how to fix the situation).
 Most constraints, even those that refer to definitions, are rather soft
 anyway since we apply them to statements, not to hard facts. Hard
 constraints can only occur in cases where the *encoding* of a statement in
 Wikidata is wrong (not the intended statement as such, but how it was
 translated to data).


As explained above, expectations inferred from definitions should not be
treated as hard constraints, but as soft ones.

Micru