The other answers, under the original subject:

On 29/05/14 01:48, David Cuenca wrote:

Settled :) Let's leave it at <defined as a trait of>

I don't think it is very clear what the intention of this property is. What are the limits of its use? What is it meant to do? Can behaviour really be a "trait" of a species? If we allow it here, it seems to apply to all kinds of connections: density/car? eternity/time? time/reality? evil/devil? rigour/science? -- this is opening a can of worms. It will be hard to maintain this.

Wikiuser13 recently added "consists of: Neptune" to Q1. It was fixed. But it is a good example of the kind of confusion that comes from such general ontological (in the philosophical sense) properties. And "consists of" is still very simple compared to "defined as a trait of". Can't we focus on more obvious things like "has social network account" for a while? ;-)

...
    Some important ideas like classification (instance of/subclass of)
    belong completely to the analytical realm. We don't observe classes,
    we define them. A planet is what we call a "planet", and this can
    change even if the actual lumps in space are pretty much the same.


Agreed. Better labels could be <defined as instance of>/<defined as
subclass of>

I don't think this is better. The short names are fine. As I explained in my email, Wikidata statements are mainly about what the external references say. The distinction between "defined" and "observed" is not on the surface of this. The main question is "Did the reference say that pianos are instruments?" but not "Did the reference say pianos are instruments because of the definition of 'piano'?" Therefore, we don't need to put this information in our labels.


    Now inferences are slightly different. If we know that X implies Y,
    then if "A says X" we can infer that (implicitly) "A says Y". That
    is a logical relationship (or rule) on the level of what is claimed,
    rather than on the level of statements. Note that we still need to
    have a way to find out that "X implies Y", which is a content-level
    claim that should have its own reference somewhere. We mainly use
    inference in this sense with "subclass of" in reasonator or when
    checking constraints. In this case, the implications are encoded as
    subclass-of statements ("If X is a piano, then X is an instrument").
    This allows us to have references on the implications.


Nope, nope, nope. I was not referring to "hard" implications, but to
heuristic ones.

Consider that these properties in the item namespace:
<defined as a trait of>
<defined as having>
<defined as instance of>

Would translate as these constraints in the property namespace:
<likely to be a trait of>
<likely to have>
<likely to be an instance of>

I think you might have misunderstood my email. I was arguing *in favour* of soft constraints, but in the paragraph before the one about inferences that you reply to here. Inferences are hard ways of obtaining new knowledge from our own definitions. Example:

If X is the father of Y according to reference A
Then Y is the child of X according to reference A

This is as hard as it can get. We are absolutely sure of this since this rule just explains the relationship between two different ways we have for encoding family relationships.
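
To make the rule concrete, here is a minimal sketch in Python. The Statement tuple, the property labels "father of" and "child of", and the function name are assumptions made up for this illustration; this is not how Wikidata stores or processes statements.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """A claim together with the reference that makes it (hypothetical model)."""
    subject: str
    prop: str
    value: str
    reference: str


def infer_child_of(statements):
    """Derive 'child of' statements from 'father of' statements.

    The derived statement keeps the reference of the statement it came from,
    since the rule only re-encodes what that reference already said.
    """
    derived = set()
    for s in statements:
        if s.prop == "father of":
            derived.add(Statement(s.value, "child of", s.subject, s.reference))
    return derived


known = {Statement("X", "father of", "Y", "reference A")}
inferred = infer_child_of(known)
```

The point of carrying the reference along is that the conclusion is still a claim made "according to reference A", not a new fact asserted by us.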

Below, you said "expectations inferred from definitions should not be treated as hard constraints" -- maybe this mixture of terms indicates that I have not been clear enough about the distinction between "inference" and "constraint". They are really completely different ways of looking at things. Inferences are something that adds (inevitable) conclusions to your knowledge, while constraints just tell you what to check for. If you accept the premises of an inference and the inference rule, then you must also accept the conclusion -- there is no "soft" way of reading this. To make it soft, you can start to formalise "softness" in your knowledge, using fuzzy logic or whatnot (see my other email with Thomas).
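
The difference can be sketched in a few lines of Python. Everything here is an assumption for illustration only: statements are plain (subject, property, value) triples, and the "at most one father" expectation is a made-up example of a constraint.

```python
from collections import defaultdict


def apply_inference(statements):
    """Inference: return the statements plus their inevitable conclusions."""
    conclusions = {
        (y, "child of", x) for (x, p, y) in statements if p == "father of"
    }
    return statements | conclusions


def check_single_father(statements):
    """Constraint: return the items a human should look at.

    Nothing is added to or removed from the data; the result is only a
    list of things to check.
    """
    fathers = defaultdict(set)
    for (x, p, y) in statements:
        if p == "father of":
            fathers[y].add(x)
    return sorted(child for child, fs in fathers.items() if len(fs) > 1)


data = {("X", "father of", "Y"), ("Z", "father of", "Y")}
extended = apply_inference(data)       # more knowledge than before
to_review = check_single_father(data)  # same knowledge, plus a to-do list
```

The inference function changes what we know; the constraint function only tells us where to look.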

I don't think we can use "soft inferences" (in the sense of fuzzy logic et al.) but I am in favour of "soft constraints" (in the sense of your "expectations"). I guess we agree on all of this, but have a bit of trouble in making ourselves clear :-) But it is rather subtle material after all.



    In general, an interesting question here is what the status of
    "subclass of" really is. Do we gather this information from external
    sources (surely there must be a book that tells us that pianos are
    instruments) or do we as a community define this for Wikidata
    (surely, the overall hierarchy we get is hardly the "universal class
    hierarchy of the world" but a very specific classification that is
    different from other classifications that may exist elsewhere)? Best
    not to think about it too much and to gather sources whenever we
    have them ;-)


I think it is good to think about it and to consider options to deal
with it. Like for instance:
<defined as instance of> "corresponds with item" <Wikimedia community
concept>
We already have items that refer to concepts that only make sense for
us, so no change in that regard.

If you say this, then you are taking the position that "instance of" is defined by the community rather than being taken from external sources. My point was that such a position is not justified, given that there are so many "instance of" relations with references (and even qualifiers). I am unsure about the status of "subclass of" -- it could be considered a community concept or a world concept. Maybe it's best to leave this to applications that use the data.

(Btw. "corresponds with item" would be another unclear property that we had better avoid.)


    At the moment, hard constraints (from definitions) and soft
    constraints (expectations) are simply mixed, and maybe this is fine
    since we handle them in a similar fashion (humans need to look how
    to fix the situation). Most constraints, even those that refer to
    definitions, are rather soft anyway since we apply them to
    statements, not to hard facts. Hard constraints can only occur in
    cases where the *encoding* of a statement in Wikidata is wrong (not
    the intended statement as such, but how it was translated to data).


As explained above, expectations inferred from definitions should not be
treated as hard constraints, but as soft ones.

As I said, hard and soft constraints would probably be treated in the same way anyway (which is the soft way), so I guess we agree here. My distinction of "hard" and "soft" constraints was about where we are getting the constraints from: some constraints are merely "usually satisfied in practice" (soft) while others are "requirements we have defined for the use of our properties" (hard). We can treat them similarly, but it may still be good to understand where our "expectations" come from in each case.
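
As a rough sketch of what I mean, in Python: each constraint records where it comes from, but violations of both kinds end up in the same review list. The Constraint class, the origin labels "defined" and "observed", and the two example constraints are all assumptions invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Constraint:
    name: str
    origin: str  # "defined" (hard) or "observed" (soft); hypothetical labels
    holds: Callable[[dict], bool]


def review_list(item, constraints):
    """Return (name, origin) for each violated constraint.

    Both kinds are handled the soft way: nothing is rejected, a human
    decides how to fix the situation.
    """
    return [(c.name, c.origin) for c in constraints if not c.holds(item)]


constraints = [
    # Requirement we defined for the use of a property (hard origin).
    Constraint(
        "father value is an item",
        "defined",
        lambda i: "father" not in i or i["father"].startswith("Q"),
    ),
    # Something that is merely usually satisfied in practice (soft origin).
    Constraint("has date of birth", "observed", lambda i: "date of birth" in i),
]

item = {"label": "Y", "father": "some text"}
violations = review_list(item, constraints)
```

Keeping the origin next to each violation costs nothing in handling, and it tells the person fixing the item whether the encoding is wrong or the item is just unusual.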

Markus


_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
