The other answers, under the original subject:
On 29/05/14 01:48, David Cuenca wrote:
Settled :) Let's leave it at <defined as a trait of>
I don't think it is very clear what the intention of this property is.
What are the limits of its use? What is it meant to do? Can behaviour
really be a "trait" of a species? If we allow it here, it seems to apply
to all kinds of connections: density/car? eternity/time? time/reality?
evil/devil? rigour/science? -- this is opening a can of worms. It will
be hard to maintain this.
Wikiuser13 recently added "consists of: Neptune" to Q1. It was fixed.
But it is a good example of the kind of confusion that comes from such
general ontological (in the philosophical sense) properties. And
"consists of" is still very simple compared to "defined as a trait of".
Can't we focus on more obvious things like "has social network account"
for a while? ;-)
...
Some important ideas like classification (instance of/subclass of)
belong completely to the analytical realm. We don't observe classes,
we define them. A planet is what we call a "planet", and this can
change even if the actual lumps in space are pretty much the same.
Agreed. Better labels could be <defined as instance of>/<defined as
subclass of>
I don't think this is better. The short names are fine. As I explained
in my email, Wikidata statements are mainly about what the external
references say. The distinction between "defined" and "observed" is not
on the surface of this. The main question is "Did the reference say that
pianos are instruments?" but not "Did the reference say pianos are
instruments because of the definition of 'piano'?" Therefore, we don't
need to put this information in our labels.
Now inferences are slightly different. If we know that X implies Y,
then if "A says X" we can infer that (implicitly) "A says Y". That
is a logical relationship (or rule) on the level of what is claimed,
rather than on the level of statements. Note that we still need to
have a way to find out that "X implies Y", which is a content-level
claim that should have its own reference somewhere. We mainly use
inference in this sense with "subclass of" in Reasonator or when
checking constraints. In this case, the implications are encoded as
subclass-of statements ("If X is a piano, then X is an instrument").
This allows us to have references on the implications.
Nope, nope, nope. I was not referring to "hard" implications, but to
heuristic ones.
Consider that these properties in the item namespace:
<defined as a trait of>
<defined as having>
<defined as instance of>
Would translate as these constraints in the property namespace:
<likely to be a trait of>
<likely to have>
<likely to be an instance of>
I think you might have misunderstood my email. I was arguing *in favour*
of soft constraints, but in the paragraph before the one about
inferences that you reply to here. Inferences are hard ways of
obtaining new knowledge from our own definitions. Example:
If X is the father of Y according to reference A
Then Y is the child of X according to reference A
This is as hard as it can get. We are absolutely sure of this since this
rule just explains the relationship between two different ways we have
for encoding family relationships.
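To make the father/child rule concrete, here is a minimal sketch in Python. The Statement tuple, the INVERSE table and the function name are hypothetical illustrations, not anything Wikidata or its tools actually provide. The point it shows is that the inferred statement inherits the reference of its premise, since the rule works on the level of what is claimed.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str
    prop: str
    value: str
    reference: str  # who makes the claim


# Hypothetical table of property pairs that encode the same relationship
# in opposite directions.
INVERSE = {"father of": "child of"}


def infer_inverses(statements):
    """Return the statements that follow from the inverse rule.

    Each conclusion keeps the reference of its premise: if A says X,
    then (implicitly) A says Y.
    """
    known = set(statements)
    inferred = set()
    for s in statements:
        inverse_prop = INVERSE.get(s.prop)
        if inverse_prop is None:
            continue  # no rule for this property
        conclusion = Statement(s.value, inverse_prop, s.subject, s.reference)
        if conclusion not in known:
            inferred.add(conclusion)
    return inferred


facts = [Statement("X", "father of", "Y", "reference A")]
derived = infer_inverses(facts)
```

There is nothing to tune here: whoever accepts the premise and the rule gets the conclusion, with the same reference attached.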
Below, you said "expectations inferred from definitions should not be
treated as hard constraints" -- maybe this mixture of terms indicates
that I have not been clear enough about the distinction between
"inference" and "constraint". They are really completely different ways
of looking at things. Inferences are something that adds (inevitable)
conclusions to your knowledge, while constraints just tell you what to
check for. If you accept the premises of an inference and the inference
rule, then you must also accept the conclusion -- there is no "soft" way
of reading this. To make it soft, you can start to formalise "softness"
in your knowledge, using fuzzy logic or whatnot (see my other email with
Thomas).
I don't think we can use "soft inferences" (in the sense of fuzzy logic
et al.) but I am in favour of "soft constraints" (in the sense of your
"expectations"). I guess we agree on all of this, but have a bit of
trouble in making ourselves clear :-) But it is rather subtle material
after all.
In general, an interesting question here is what the status of
"subclass of" really is. Do we gather this information from external
sources (surely there must be a book that tells us that pianos are
instruments) or do we as a community define this for Wikidata
(surely, the overall hierarchy we get is hardly the "universal class
hierarchy of the world" but a very specific classification that is
different from other classifications that may exist elsewhere)? Best
not to think about it too much and to gather sources whenever we
have them ;-)
I think it is good to think about it and to consider options to deal
with it. Like for instance:
<defined as instance of> "corresponds with item" <Wikimedia community
concept>
We already have items that refer to concepts that only make sense for
us, so no change in that regard.
If you say this, then you are taking the position that instance of is
defined by the community rather than being taken from external sources.
My point was that such a position is not justified, given that there are
so many instance of relations with references (and even qualifiers). I
am unsure about the status of "subclass of" -- it could be considered a
community concept or a world concept. Maybe it's best to leave this to
applications that use the data.
(Btw. "corresponds with item" would be another unclear property that we
had better avoid.)
At the moment, hard constraints (from definitions) and soft
constraints (expectations) are simply mixed, and maybe this is fine
since we handle them in a similar fashion (humans need to look at how
to fix the situation). Most constraints, even those that refer to
definitions, are rather soft anyway since we apply them to
statements, not to hard facts. Hard constraints can only occur in
cases where the *encoding* of a statement in Wikidata is wrong (not
the intended statement as such, but how it was translated to data).
As explained above, expectations inferred from definitions should not be
treated as hard constraints, but as soft ones.
As I said, hard and soft constraints would probably be treated in the
same way anyway (which is the soft way), so I guess we agree here. My
distinction of "hard" and "soft" constraints was about where we are
getting the constraints from: some constraints are merely "usually
satisfied in practice" (soft) while others are "requirements we have
defined for the use of our properties" (hard). We can treat them
similarly, but it may still be good to understand where our "expectations"
come from in each case.
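A rough sketch of what I mean, in Python. The Constraint class, the two example constraints and the dictionary layout of the item are made up for illustration and are not how the constraint reports are actually implemented. Both kinds of constraint go through the same check and only produce warnings for a human to look at; the origin is merely recorded alongside.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Constraint:
    description: str
    origin: str  # "defined" (hard) or "observed" (soft)
    check: Callable[[dict], bool]


def review(item, constraints):
    """Return one warning per violated constraint.

    Nothing is rejected or changed: hard and soft constraints are
    handled the same (soft) way, and only the origin label differs.
    """
    return [
        f"[{c.origin}] {c.description}"
        for c in constraints
        if not c.check(item)
    ]


# Hypothetical examples of the two origins.
constraints = [
    Constraint(
        "'father' should hold an item ID starting with 'Q'",
        "defined",
        lambda item: all(v.startswith("Q") for v in item.get("father", [])),
    ),
    Constraint(
        "an item with a 'father' usually has a 'date of birth'",
        "observed",
        lambda item: "father" not in item or "date of birth" in item,
    ),
]

item = {"father": ["John Doe"]}
warnings = review(item, constraints)
```

Contrast this with an inference rule: a constraint adds nothing to what we know, it only tells us where to look.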
Markus