Re: [Wikidata] Wikidata reasoning (Was: Properties for family relationships in Wikidata)
On Thu 27 Aug 2015 15:52, Markus Krötzsch wrote:
> On 27.08.2015 14:43, Svavar Kjarrval wrote:
>> So far from the other thread, the current need seems to be for two types of definitions:
>> 1. How to interpret declarations depending on associated properties.
>
> If I understand your explanations correctly, the first point is a very specific case of inference, which is already thinking in terms of "hierarchies" (of some property). I am asking: how do we even know that some properties are supposed to be read as forming a "hierarchy". This is one special case of a rule of inference that one might formulate. Have a look at
>
> https://www.wikidata.org/wiki/Wikidata:WikiProject_Reasoning/Use_cases
>
> for some more examples of what could be relevant inferences. As you can see, only few of these cases have anything to do with hierarchies (subclass of in particular), but one could easily come up with similar rules to express that something should be propagated along a hierarchy (in some cases).
>
>> 2. Constraints (or suggestions) when interpreting multiple items.
>
> For me, a constraint is a rule that infers a Warning. It can follow a similar pattern as the examples I gave, but instead of deriving a new statement, it will derive that a human should better look at a particular piece of our data to check if it is meaningful.
>
> There is no huge theoretical challenge involved here, but a big practical one. I expect that we will refine our rules once we encounter cases where they do not yield the right result. If you look at the examples I gave, they are all mostly based on how we choose to define the meaning of our properties. This is different from our current constraints that specify how things *usually* are in the world. We can have both (constraints that warn us of unusual situations and rules that derive statements) based on similar technology, but different considerations are relevant when defining these two types of things.
>
> As for Stubbs, there is a strong and a weak rule involved:
> * Strong: all mayors are persons (I assume now that this class encompasses named animals, as suggested in earlier messages; if not, then replace "person" by a suitable generalisation that does).
> * Weak: most mayors are humans.
>
> The strong version could probably be applied to derive new information, without danger of "exceptions" -- it would be part of our characterisation of what makes something a "person" in our view (or whatever other class we pick there). The weak version should only be used to find potential problems that humans might want to check.
>
> Similar rules exist in many domains:
> * Strong: All birds are animals (it's part of how we define "bird")
> * Weak: All birds can fly (it's something we observe for actual birds, but not part of the definition of what it means to be a bird).
>
> I suggest we start by focussing on strong rules, since they make a big contribution to documenting what we mean (by "person", by "bird", etc.), even before we have any tool support for acting on this information.
>
> Cheers,
>
> Markus

I'm a big advocate of strong versions. My suggestion of "exceptions" was practical, since we can't reasonably expect all data to be consistent with strong definitions. Personally, I wouldn't support a weak version when a feasible strong alternative is available. The constraints I had in mind are only suggestive and would only serve as warnings, so I think we agree there. The constraints wouldn't be enforced but rather used to detect potential mistakes in the data. They wouldn't prevent someone from adding the information that Stubbs is a mayor, even when that would lead to the contradiction of him being both a human and a cat.
Regarding your question about my first definition: the point is to classify what can reasonably be inferred from the relationship between two items, depending on the property used to connect them. Take the case of Stubbs. Stubbs is a mayor, and from that connection we can (or should be able to) assume that Stubbs is also a public official, a head of government and a politician. However, we shouldn't be able to assume that Stubbs's Freebase identifier is the same as the town's. The purpose is to enable machines to retrieve an item and extract all the relevant facts which can reasonably be inferred from the relationship of that item with other items, recursively, until all the branches are exhausted.

- Svavar Kjarrval

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
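As a rough illustration of the recursive extraction described in the message above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the toy statement table, the property labels, the placeholder identifier values, and the choice of which properties count as "inheriting". It is not how Wikidata stores or processes statements.

```python
from collections import deque

# Hypothetical toy data: item -> list of (property, value) statements.
STATEMENTS = {
    "Stubbs": [("position held", "mayor"), ("Freebase ID", "placeholder-id-1")],
    "mayor": [
        ("subclass of", "head of government"),
        ("subclass of", "public official"),
    ],
    "head of government": [("subclass of", "politician")],
    "Talkeetna": [("Freebase ID", "placeholder-id-2")],
}

# Assumption: only these properties propagate facts to the item.
INHERITING = {"position held", "subclass of"}


def inferred_classes(item, statements=STATEMENTS, inheriting=INHERITING):
    """Follow inheriting properties breadth-first until no branch is left."""
    seen = set()
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for prop, value in statements.get(current, []):
            # Non-inheriting properties (e.g. identifiers) are never followed.
            if prop in inheriting and value not in seen:
                seen.add(value)
                queue.append(value)
    return seen
```

Calling `inferred_classes("Stubbs")` collects mayor, head of government, public official and politician, while the identifier statement is ignored because its property is not in the inheriting set.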
Re: [Wikidata] Trends in links from Wikidata items to Commons
In terms of navigation from article-items to Commons categories, the policy is very straightforward: set and use the P373 property. This property also makes the inverse very straightforward, to go from a Commons category to a Wikidata item: use the script https://commons.wikimedia.org/wiki/User:TheDJ/wdcat.js or the tweaked version at https://commons.wikimedia.org/wiki/User:Jheald/wdcat.js which handles diacritics properly. These scripts automatically add a Reasonator link to the Commons category whenever there is a Wikidata article-like item pointing to it with a P373. What we have at the moment is the worst of all worlds -- namely inconsistency which is getting worse. As a result people don't know what to do, and they are not setting the P373 property -- with the result that scripts and queries don't find the connections that they should. What we need is clarity and systematic consistency. Then it is an easy step to adjust the user-presentation to do the right thing. -- James. On 27/08/2015 14:03, Romaine Wiki wrote: No we have not a clear policy on only linking sitelinks to categories if the item itself is about a category. So not let's not break that. You suggest to break down almost the complete navigational structure Commons has in relationship with Wikipedia, and makes it possible to find articles that are about the same subject as the category. Without it becomes almost impossible to identify a category on Commons to be related to an article in Wikipedia. Sorry, but your proposal is insane and making the navigational situation a thousand times worse. And does it make anything better? No, totally not. Only the opposite: worse. Wikidata is currently heavily used to connect categories on Commons to articles on Wikipedia. This so that interwikilinks are shown on the category on Commons to the related Wikipedia article. This for navigational purposes but also to uniquely identify categories on Commons to articles on Wikipedia and items on Wikidata. 
How nice Commons galleries are giving an overview, they are crap in speaking of navigational purposes. For every subject a category on Commons is created and used and the Commons categories form the backbone to media categories. It has been pointed out for a long time that the linking situation on Commons is problematic and this is a software issue, not a user side issue. This consists out of: * There can only be added one sitelink to an item. * If no sitelink added (but only added as property), a Commons category can't show the interwikilinks. * If a category and an article on Wikipedia/etc exist for a subject, only one of them can be shown on the Commons category. The annoying part is that some large wikis, especially the English Wikipedia, creates too many categories that are not created on other Wikipedias. This causes that categories on Commons are only linked to a category on Wikipedia, which is useless for most other wikis and on Commons we miss an interwikilink to the related article. A gallery on Commons is a great way as alternative to show images, but is not suitable for navigational purposes, as that requires a much higher coverage and being a backbone everything relies on. On Commons only categories have that function. A counter proposal makes more sense: no Commons galleries as sitelinks any more and having Commons galleries only as property added. But this only solves a part of the problem: on Commons I would like to see somehow that both the related category as the related article are shown. Example: on the Commons category for a specific country both the country category on Wikipedia is linked as the article on Wikipedia is linked. Something I have been wondering about for a long time is why there are 2 places on an item where a Commonscat is added. I understand the development and technical behind it, but this should not be needed. 
So the developers of Wikidata should try to find a way to show both groups of interwikilinks on categories on Commons. As long as this is not resolved in software, this problem of 2 items both strongly related to a Commons category keeps an issue. Romaine 2015-08-27 11:29 GMT+02:00 James Heald : A few days ago I made the following post to Project Chat, looking at how people are linking from Wikidata items to Commons categories and galleries compared to a year ago, that some people on the list may have seen, which has now been archived: https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2015/08#Trends_in_links_from_items_to_Commons A couple of headlines: * Category <-> commonscat identifications : ** There was a net increase of 61,784 Commons categories that can now be identified with category-like items, to 323,825 Commons categories in all ** 96.4% of category <-> commonscat identifications (312,266 items) now have sitelinks. This represents a rise in sitelinks (60,463 items) amounting to 97.8% of the increase in identifications ** 80.0%
Re: [Wikidata] Wikidata reasoning (Was: Properties for family relationships in Wikidata)
On 27.08.2015 14:43, Svavar Kjarrval wrote: So far from the other thread, the current need seems to be for two types of definitions: 1. How to interpret declarations depending on associated properties. If I understand your explanations correctly, the first point is a very specific case of inference, which is already thinking in terms of "hierarchies" (of some property). I am asking: how do we even know that some properties are supposed to be read as forming a "hierarchy". This is one special case of a rule of inference that one might formulate. Have a look at https://www.wikidata.org/wiki/Wikidata:WikiProject_Reasoning/Use_cases for some more examples of what could be relevant inferences. As you can see, only few of these cases have anything to do with hierarchies (subclass of in particular), but one could easily come up with similar rules to express that something should be propagated along a hierarchy (in some cases). > 2. Constraints (or suggestions) when interpreting multiple items. For me, a constraint is a rule that infers a Warning. It can follow a similar pattern as the examples I gave, but instead of deriving a new statement, it will derive that a human should better look at a particular piece of our data to check if it is meaningful. There is no huge theoretical challenge involved here, but a big practical one. I expect that we will refine our rules once we encounter cases where they do not yield the right result. If you look at the examples I gave, they are all mostly based on how we choose to define the meaning of our properties. This is different from our current constraints that specify how thing *usually* are in the world. We can have both (constraints that warn us of unusual situations and rules that derive statements) based on similar technology, but different considerations are relevant when defining these two types of things. 
As for Stubbs, there is a strong and a weak rule involved: * Strong: all mayors are persons (I assume now that this class encompasses named animals, as suggested in earlier messages; if not, then replace "person" by a suitable generalisation that does). * Weak: most mayors are humans. The strong version could probably be applied to derive new information, without danger of "exceptions" -- it would be part of our characterisation of what makes something a "person" in our view (or whatever other class we pick there). The weak version should only be used to find potential problems that humans might want to check. Similar rules exist in many domains: * Strong: All birds are animals (it's part of how we define "bird") * Weak: All birds can fly (it's something we observe for actual birds, but not part of the definition of what it means to be a bird). I suggest we start by focussing on strong rules, since they make a big contribution to documenting what we mean (by "person", by "bird", etc.), even before we have any tool support for acting on this information. Cheers, Markus The first definition is used so the machine can know *if* the declaration is up in the hierarchy or sideways. When interpreting the item, the machine needs to know if the property implies that all declarations of that item are inhereted. If we take some currently living human as an example who has a Wikidata item and that human is connected to an occupation via a property. The machine should know if it should process the declarations of the occupation to apply them to the human, in whole or partially. Then there are properties which don't inheret, like if the human has a declared family member, the human doesn't inherit the other family member's name or birthdate. The other definition has the purpose of solving contradictions like in my example of Stubbs. If we are realistic, it's not likely that a tree structure with that much data is totally free of contradictions. 
So we need to have some way of telling the machine that there are, or could be, contradictions. One example of this is to define that a certain property can't have more than one value (at any given time). As a simplification (not referring to the current data structure), say that a human is a part of a certain species. If we were to define, in this case, that any item can't be part of more than one species, then the machine would detect a contradiction. In the specific example of Stubbs, the machine would determine that cats and humans are two separate species and there can be only one[1]. If we had a definition that the declaration closer to the item in the specific link has precedence, then the machine would solve it by determining that mayors are generally humans but Stubbs being a cat is an exception to that rule.

[1] Didn't see the Highlander reference until I had written it.

- Svavar Kjarrval
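The distinction Markus draws in this message, strong rules that derive statements versus weak rules that only infer a warning, can be sketched as follows. This is only an illustration under stated assumptions: the `Statement` and `ReviewWarning` types, the property labels and the rule functions are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    subject: str
    prop: str
    value: str


@dataclass(frozen=True)
class ReviewWarning:
    subject: str
    message: str


def strong_mayor_is_person(facts):
    """Strong rule: every mayor is a person, so a new statement is derived."""
    for f in facts:
        if f.prop == "position held" and f.value == "mayor":
            yield Statement(f.subject, "instance of", "person")


def weak_mayor_is_human(facts):
    """Weak rule: most mayors are humans, so exceptions are only flagged."""
    humans = {
        f.subject for f in facts
        if f.prop == "instance of" and f.value == "human"
    }
    for f in facts:
        if (f.prop == "position held" and f.value == "mayor"
                and f.subject not in humans):
            yield ReviewWarning(f.subject, "mayor not recorded as human")


def apply_rules(facts):
    """Return (derived statements, warnings) without modifying the input."""
    derived = set(strong_mayor_is_person(facts))
    warnings = list(weak_mayor_is_human(facts))
    return derived, warnings
```

For Stubbs, recorded as a mayor and a cat, the strong rule derives "instance of person" and the weak rule produces one warning for a human to check; nothing is rejected.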
Re: [Wikidata] Wikidata reasoning (Was: Properties for family relationships in Wikidata)
A human is not a part of a species, it is an instance of a species :)

Contradiction management is a very interesting topic, and contradictions are inherent to the Wikidata model. We can't expect everything to be consistent, considering that Wikidata only reflects sources and that two sources can disagree in an essentially inconsistent way. We could, however, expect several statements extracted from the same source to be consistent with each other, but it may be rare to have enough sourced statements to draw useful inferences. This can lead to subproblems such as computing the maximum set of consistent sources on a part of the graph, or finding the sources that lead to a contradiction when taken together.

However, we already have a qualifier that marks a source as being in contradiction with another: "statement disputed by". We could assume that the sources involved are probably inconsistent with each other.

Or we could simply leave the consistency checks out of the inference process :) and leave them to the constraint system: if an inference draws a path that leads to a constraint violation, then the community will be notified. To avoid explosion, the scope of inferences could be limited (not trying to compute the transitive closure of the application of the inference rules). We could use some sort of "partial consistency" notion, such as those used in constraint programming.

Thinking about it, I can imagine constraint problems such as "considering an inference I deduced some way, is it fully consistent with the set of sources we have, or is there a set of sources that implies the inference is not true?" -> Is the inference a tautology, or is the inference only satisfiable, in a problem where each statement maps to a variable, the different sources are values for the domain of the variables, and the sources must be consistent wrt. what we know they say on Wikidata?
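The subproblem mentioned above, finding the largest sets of mutually consistent sources, can be sketched by brute force for a small part of the graph. The example below is an assumption-laden toy: the source names and claims are invented, every (item, property) pair is treated as single-valued, and the search is exponential in the number of sources, so it only illustrates the idea.

```python
from itertools import combinations

# Hypothetical claims: source -> {(item, property): value}.
CLAIMS = {
    "source A": {("Stubbs", "species"): "cat"},
    "source B": {("Stubbs", "species"): "human"},
    "source C": {("Stubbs", "position held"): "mayor"},
}


def consistent(sources, claims=CLAIMS):
    """True if no two of the given sources disagree on the same pair."""
    merged = {}
    for source in sources:
        for key, value in claims[source].items():
            # setdefault keeps the first value seen for this pair.
            if merged.setdefault(key, value) != value:
                return False
    return True


def largest_consistent_sets(claims=CLAIMS):
    """Return every consistent set of sources of the largest possible size."""
    names = sorted(claims)
    for size in range(len(names), 0, -1):
        found = [
            set(combo) for combo in combinations(names, size)
            if consistent(combo, claims)
        ]
        if found:
            return found
    return []
```

With the toy data, sources A and B disagree about the species, so the largest consistent sets are {A, C} and {B, C}.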
2015-08-27 14:43 GMT+02:00 Svavar Kjarrval : > So far from the other thread, the current need seems to be for two types > of definitions: > 1. How to interpret declarations depending on associated properties. > 2. Constraints (or suggestions) when interpreting multiple items. > > The first definition is used so the machine can know *if* the > declaration is up in the hierarchy or sideways. When interpreting the > item, the machine needs to know if the property implies that all > declarations of that item are inhereted. If we take some currently > living human as an example who has a Wikidata item and that human is > connected to an occupation via a property. The machine should know if it > should process the declarations of the occupation to apply them to the > human, in whole or partially. Then there are properties which don't > inheret, like if the human has a declared family member, the human > doesn't inherit the other family member's name or birthdate. > > The other definition has the purpose of solving contradictions like in > my example of Stubbs. If we are realistic, it's not likely that a tree > structure with that much data is totally free of contradictions. So we > need to have some way of telling the machine that there are, or could > be, contradictions. One example of this to define that a certain > property can't be more than one of something (at any given time). For > simplification (not referring to the current data structure) is that a > human is a part of a certain species. If we were to define, in this > case, that any item can't be part of more than one species, then the > machine would detect a contradiction. In the specific example of Stubbs, > the machine would determine that cats and humans are two separate > species and there can be only one[1]. 
> If we had a definition that the declaration closer to the item in the specific link has precedence, then the machine would solve it by determining that mayors are generally humans but Stubbs being a cat is an exception to that rule.
>
> [1] Didn't see the Highlander reference until I had written it.
>
> - Svavar Kjarrval
Re: [Wikidata] Trends in links from Wikidata items to Commons
No we have not a clear policy on only linking sitelinks to categories if the item itself is about a category. So not let's not break that. You suggest to break down almost the complete navigational structure Commons has in relationship with Wikipedia, and makes it possible to find articles that are about the same subject as the category. Without it becomes almost impossible to identify a category on Commons to be related to an article in Wikipedia. Sorry, but your proposal is insane and making the navigational situation a thousand times worse. And does it make anything better? No, totally not. Only the opposite: worse. Wikidata is currently heavily used to connect categories on Commons to articles on Wikipedia. This so that interwikilinks are shown on the category on Commons to the related Wikipedia article. This for navigational purposes but also to uniquely identify categories on Commons to articles on Wikipedia and items on Wikidata. How nice Commons galleries are giving an overview, they are crap in speaking of navigational purposes. For every subject a category on Commons is created and used and the Commons categories form the backbone to media categories. It has been pointed out for a long time that the linking situation on Commons is problematic and this is a software issue, not a user side issue. This consists out of: * There can only be added one sitelink to an item. * If no sitelink added (but only added as property), a Commons category can't show the interwikilinks. * If a category and an article on Wikipedia/etc exist for a subject, only one of them can be shown on the Commons category. The annoying part is that some large wikis, especially the English Wikipedia, creates too many categories that are not created on other Wikipedias. This causes that categories on Commons are only linked to a category on Wikipedia, which is useless for most other wikis and on Commons we miss an interwikilink to the related article. 
A gallery on Commons is a great way as alternative to show images, but is not suitable for navigational purposes, as that requires a much higher coverage and being a backbone everything relies on. On Commons only categories have that function. A counter proposal makes more sense: no Commons galleries as sitelinks any more and having Commons galleries only as property added. But this only solves a part of the problem: on Commons I would like to see somehow that both the related category as the related article are shown. Example: on the Commons category for a specific country both the country category on Wikipedia is linked as the article on Wikipedia is linked. Something I have been wondering about for a long time is why there are 2 places on an item where a Commonscat is added. I understand the development and technical behind it, but this should not be needed. So the developers of Wikidata should try to find a way to show both groups of interwikilinks on categories on Commons. As long as this is not resolved in software, this problem of 2 items both strongly related to a Commons category keeps an issue. Romaine 2015-08-27 11:29 GMT+02:00 James Heald : > A few days ago I made the following post to Project Chat, looking at how > people are linking from Wikidata items to Commons categories and galleries > compared to a year ago, that some people on the list may have seen, which > has now been archived: > > > https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2015/08#Trends_in_links_from_items_to_Commons > > > A couple of headlines: > > * Category <-> commonscat identifications : > > ** There was a net increase of 61,784 Commons categories that can now be > identified with category-like items, to 323,825 Commons categories in all > > ** 96.4% of category <-> commonscat identifications (312,266 items) now > have sitelinks. 
This represents a rise in sitelinks (60,463 items) > amounting to 97.8% of the increase in identifications > > ** 80.0% of category <-> commonscat identifications (259,164 items) now > have P373 statements. This represents a rise in P373 statements (8,774 > items) amounting to 14.2% of the increase in identifications > > > * Article <-> commonscat identifications : > > ** There was a net increase of 176,382 Commons categories that can now be > identified with article-like items, to 884,439 Commons categories in all > > ** 23.4% of article <-> commonscat identifications (207,494 items) now > have (deprecated) sitelinks. This represents a rise in sitelinks (112,595 > items) amounting to 63.8% of the increase in identifications. > > ** 91.3% of article <-> commonscat identifications (807,776 items) now > have P373 statements. This represents a rise in P373 statements (110,727 > items) amounting to 62.8% of the increase in identifications > > > * In addition, a recent RfC showed considerable confusion as to what > actually was the current operational Wikidata policy on sitelinks to > Commons: > > > https://www.wikidata.org/wiki/Wikidata:Reques
Re: [Wikidata] Wikidata reasoning (Was: Properties for family relationships in Wikidata)
So far from the other thread, the current need seems to be for two types of definitions:
1. How to interpret declarations depending on associated properties.
2. Constraints (or suggestions) when interpreting multiple items.

The first definition is used so the machine can know *if* the declaration is up in the hierarchy or sideways. When interpreting the item, the machine needs to know if the property implies that all declarations of that item are inherited. Take as an example some currently living human who has a Wikidata item and is connected to an occupation via a property. The machine should know if it should process the declarations of the occupation to apply them to the human, in whole or partially. Then there are properties which don't inherit: if the human has a declared family member, the human doesn't inherit the other family member's name or birthdate.

The other definition has the purpose of solving contradictions like in my example of Stubbs. If we are realistic, it's not likely that a tree structure with that much data is totally free of contradictions. So we need to have some way of telling the machine that there are, or could be, contradictions. One example of this is to define that a certain property can't have more than one value (at any given time). As a simplification (not referring to the current data structure), say that a human is a part of a certain species. If we were to define, in this case, that any item can't be part of more than one species, then the machine would detect a contradiction. In the specific example of Stubbs, the machine would determine that cats and humans are two separate species and there can be only one[1]. If we had a definition that the declaration closer to the item in the specific link has precedence, then the machine would solve it by determining that mayors are generally humans but Stubbs being a cat is an exception to that rule.

[1] Didn't see the Highlander reference until I had written it.

- Svavar Kjarrval
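The "closer declaration has precedence" idea from this message could look roughly like the sketch below. It assumes, purely for illustration, that each candidate value for a single-valued property comes with the number of links between the item and the declaration; the function name and the warning texts are hypothetical.

```python
def resolve_single_value(candidates):
    """Pick one value for a single-valued property.

    candidates: list of (value, distance) pairs, where distance is the
    number of links from the item to the declaration (1 = stated directly).
    Returns (chosen_value_or_None, list_of_warnings).
    """
    if not candidates:
        return None, []
    all_values = sorted({value for value, _ in candidates})
    if len(all_values) == 1:
        # No contradiction: every declaration agrees.
        return all_values[0], []
    nearest = min(distance for _, distance in candidates)
    nearest_values = sorted(
        {value for value, distance in candidates if distance == nearest}
    )
    if len(nearest_values) > 1:
        # Precedence by distance cannot decide; leave it to a human.
        return None, [f"unresolved conflict between {nearest_values}"]
    chosen = nearest_values[0]
    others = [value for value in all_values if value != chosen]
    return chosen, [f"kept {chosen!r}; overrode more distant {others}"]
```

For Stubbs, `resolve_single_value([("human", 2), ("cat", 1)])` keeps "cat", the directly stated value, and reports that the value inherited through "mayor" was overridden. The warning is kept so that the exception stays visible rather than being silently resolved.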
[Wikidata] Wikidata reasoning (Was: Properties for family relationships in Wikidata)
[Splitting the general (Wikidata reasoning; this thread) from the specific (Wikidata family relationships for horses; original thread).] Many issues have been brought up, and we cannot solve all with one big hammer. I have now started a WikiProject (see below) to address one of the key points raised by Peter: ''' Nobody has ever defined which inferences can/should be drawn from the content of Wikidata. ''' We do in fact use several properties that seem to ask for inferencing. Probably the clearest is "subclass of" (P279). It has been related to rdfs:subClassOf in many community discussions, so it seems clear that a similar meaning is intended. This would lead to the following rule: ''' If an item A has a "subclass of" statement with value B, and if item B has a "subclass of" statement with value C, then it should follow that item A has a "subclass of" statement with value C." ''' I think there is wide agreement on this idea. Constraints rely on it (constraint checking travels the P279 hierarchy), and it's a main motivation for why Wikidata Query has its "tree" feature. There are similarly clear intentions for the properties "instance of" (P31) and "subproperty of" (P1647). I am not spelling them out here. Nevertheless, Peter is right that even in these cases, the intention is not fully clear, because of two reasons: (1) There is no machine-readable specification of the intended behaviour. It's part of user discussions, not of the data or templates. Even the user discussions are distributed over several pages, so a lot of wiki archaeology is needed to get a full picture of what we, the community, might have intended. (2) The informal discussions on the intended semantics are not precise about all relevant cases. Many questions remain open, such as what to do if qualifiers are used on a statement (rarely the case for "subclass of", but not so uncommon for "instance of"). 
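To make the "subclass of" rule quoted above concrete, here is a small Python sketch that applies it repeatedly until nothing new follows. The representation of statements as plain (subclass, superclass) pairs is an assumption made for the example, and qualifiers, which the message lists as an open question, are ignored entirely.

```python
def subclass_closure(direct):
    """Apply the rule 'A subclass of B, B subclass of C => A subclass of C'
    until no new pair can be derived.

    direct: set of (subclass, superclass) pairs.
    Returns a new set containing the direct and all derived pairs.
    """
    closure = set(direct)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow inside the loops.
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure
```

Given the pairs (mayor, head of government) and (head of government, politician), the result also contains (mayor, politician). This naive fixpoint is fine for a handful of items; a real reasoner over the whole P279 hierarchy would need something more efficient.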
To address these issues, I propose to come up with a format that allows us to clearly specify inference rules such as the one for "subclass of" above. Each rule should have one page where it is specified (for humans and machines), explained (to humans), and discussed. It is not possible to encode such rules as property values on data pages (for a start, it would not be clear which page this should be on, because rules typically refer to several properties and items). Therefore, the best we could do now seems to have standard wiki pages for this. They could be linked from all relevant properties/items (talk pages) though. Even if we do not have any reasoner to compute all the results, writing down the intended rules would be useful documentation for other users to clarify what we expect (see the original family relationship discussion). I propose to start by gathering use cases, that is, examples of rules that we might want to express. From this, we can then extract suitable template structure. I have created a WikiProject for getting us started: https://www.wikidata.org/wiki/Wikidata:WikiProject_Reasoning Feel free to contribute. Best regards, Markus On 27.08.2015 06:26, Peter F. Patel-Schneider wrote:> > > On 08/26/2015 06:01 PM, Svavar Kjarrval wrote: >> On mið 26.ágú 2015 23:05, James Heald wrote: >>> There are a *lot* of problems with P279 (subclass), right across >>> Wikidata. >>> >>> These will only be corrected once people start doing searches in a >>> systematic way and addressing the anomalies they find. >>> >>> In this case, politician (Q82955) should *not* be a subclass of human >>> (Q5), instead it should be a subclass of something like occupation >>> (Q13516667), or alternatively perhaps profession (Q28640). >>> >>> >>> My understanding is that currently there are a vast number of >>> incorrect subclass relationships in the project, messing up tree >>> searches, and so far it is something that has simply not yet been >>> systematically addressed. 
>>> >>>-- James. >>> >>> >> For now, what's the best way to find (and perhaps correct) incorrect >> declarations like these? >> >> If I were to just change items for commonly used items like politician >> (Q82955) it might be construed as vandalism or someone who doesn't care >> about or understand the Stubbs-declared-as-a-human problem might just >> add that declaration back later. >> >> When it comes to the gender property (P21), the human readable >> description indicates that it's to define genders in general, yet it's >> declared as an instance of an item (Q18608871) which only applies to >> humans, which of course has consequences further up in the hierarchy >> since the maintainers of item Q18608871 faithfully assume it only >> applies to humans. > > Well, the situation with respect to Wikidata property for items about people > (Q18608871) is very difficult. There is absolutely no machine-interpretable > information associated with this class that can be used to deterimine that > instances of it are only suppo
[Wikidata] Trends in links from Wikidata items to Commons
A few days ago I made the following post to Project Chat, looking at how people are linking from Wikidata items to Commons categories and galleries compared to a year ago. Some people on the list may have seen it; it has now been archived:

https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/2015/08#Trends_in_links_from_items_to_Commons

A couple of headlines:

* Category <-> commonscat identifications:
** There was a net increase of 61,784 Commons categories that can now be identified with category-like items, to 323,825 Commons categories in all
** 96.4% of category <-> commonscat identifications (312,266 items) now have sitelinks. This represents a rise in sitelinks (60,463 items) amounting to 97.8% of the increase in identifications
** 80.0% of category <-> commonscat identifications (259,164 items) now have P373 statements. This represents a rise in P373 statements (8,774 items) amounting to 14.2% of the increase in identifications

* Article <-> commonscat identifications:
** There was a net increase of 176,382 Commons categories that can now be identified with article-like items, to 884,439 Commons categories in all
** 23.4% of article <-> commonscat identifications (207,494 items) now have (deprecated) sitelinks. This represents a rise in sitelinks (112,595 items) amounting to 63.8% of the increase in identifications
** 91.3% of article <-> commonscat identifications (807,776 items) now have P373 statements. This represents a rise in P373 statements (110,727 items) amounting to 62.8% of the increase in identifications

* In addition, a recent RfC showed considerable confusion as to what the current operational Wikidata policy on sitelinks to Commons actually is:

https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Category_commons_P373_and_%22Other_sites%22

In view of the trends above, the need for predictability and consistency that queries, templates and scripts depend on, and particularly the apparent confusion as to what the operational policy currently is, can I suggest that the time has come for a bot to monitor all new sitelinks to Commons categories,

* adding a corresponding P373 statement if there is not one already, and
* removing the sitelink if it is from an article-like item to a commonscat.

I believe we have clear policy on only sitelinking Commons categories to category-like items, and Commons galleries to article-like items; but there is currently confusion and unpredictability because these relationships are not being enforced, which breaks scripts and queries. It's time to fix this.

All best,

James.

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
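The two bot rules proposed above could be sketched roughly as follows. This is only an illustration of the decision logic, not an actual bot: the `Item` class, its fields, the "article"/"category" labels and the action names are all hypothetical, and a real bot would read and write through the Wikidata API rather than plain Python objects.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Item:
    """Hypothetical, simplified view of a Wikidata item."""
    qid: str
    kind: str                               # "article" or "category" (assumed labels)
    commons_sitelink: Optional[str] = None  # e.g. "Category:Foo", or a gallery title
    p373: Optional[str] = None              # Commons category name, without prefix


def plan_actions(item: Item) -> List[Tuple[str, str]]:
    """Return the actions the proposed bot would take for one item."""
    actions: List[Tuple[str, str]] = []
    link = item.commons_sitelink
    # Only sitelinks that point at a Commons category are of interest.
    if link is None or not link.startswith("Category:"):
        return actions
    name = link[len("Category:"):]
    # Rule 1: add a corresponding P373 statement if there is not one already.
    if item.p373 is None:
        actions.append(("add_P373", name))
    # Rule 2: remove the sitelink if it goes from an article-like item to a commonscat.
    if item.kind == "article":
        actions.append(("remove_sitelink", link))
    return actions
```

Under these assumptions, an article-like item carrying only a sitelink to "Category:Foo" would get a P373 statement and lose the sitelink, while a category-like item that already has both would be left alone.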
Re: [Wikidata] Properties for family relationships in Wikidata
Hoi,

Absolutely. When full genealogy information is available, you do not need special words that indicate whatever. It is only when this is not the case that you need to specify what type of link there is. This can be specific, like maternal uncle or paternal aunt. This makes a practical difference in several cultures and is THEREFORE significant. Again, it is only of relevance when it cannot be inferred.

Thanks,
GerardM

On 27 August 2015 at 11:08, Marielle Volz wrote:
> If you want to find all humans on Wikidata, find all items with the
> property "instance of" (P31) equal to "human" (Q5). There is no need
> to infer this from things like having the parent property; that's a
> terrible way to do things. Items that are instances of different items
> use the same properties all the time, so you shouldn't be inferring
> anything about the class of an item based on the properties it has.
>
> If you are worried about horses being put in a genealogical tree with
> humans, that would require someone to put a horse as a parent of a
> human or vice versa. That's a problem with an invalid relationship
> being added, not the property itself.
>
> On Wed, Aug 26, 2015 at 6:43 PM, Svavar Kjarrval wrote:
>>
>> On Wed, 26 Aug 2015 13:58, Peter F. Patel-Schneider wrote:
>>> I don't think that P21 (https://www.wikidata.org/wiki/Property:P21,
>>> sex or gender) is a subclass of P31
>>> (https://www.wikidata.org/wiki/Property:P31, instance of).
>>> Properties aren't subclasses in general.
>>>
>>> Perhaps you meant to talk about
>>> https://www.wikidata.org/wiki/Property:P21 (sex or gender) being
>>> related via https://www.wikidata.org/wiki/Property:P31 (instance of)
>>> to https://www.wikidata.org/wiki/Q18608871 (Wikidata property for
>>> items about people). This indicates that the property should only be
>>> used on people, even though the description of the property itself
>>> talks about its use on animals.
>>>
>>> It appears that Wikidata is not very consistent internally.
>>>
>>> peter
>>>
>> Sorry, I'm not used to the Wikidata lingo.
>>
>> To further explain my point (to which I think you have already agreed):
>> If I were to produce code which makes assumptions based on such
>> relations, the code would come to the contradiction that a non-human
>> with a P21 relation is a human, if it were to travel recursively
>> through the hierarchy of declarations. P21 is declared with a
>> P31->Q18608871 and Q18608871 is in turn declared P1269->Q5. Unless
>> special precautions were taken, anyone trying to generate an
>> exhaustive list of all humans on Wikidata (without relying solely on
>> the direct declaration on each item) might find themselves with
>> non-humans on that list due to travelling backwards via such relations.
>>
>> In essence, it seems like P21 either wrongfully allows definitions of
>> genders of non-humans or that the property is too broad for a
>> declaration of P31->Q18608871.
>>
>> - Svavar Kjarrval
Re: [Wikidata] Properties for family relationships in Wikidata
If you want to find all humans on Wikidata, find all items with the property "instance of" (P31) equal to "human" (Q5). There is no need to infer this from things like having the parent property; that's a terrible way to do things. Items that are instances of different items use the same properties all the time, so you shouldn't be inferring anything about the class of an item based on the properties it has.

If you are worried about horses being put in a genealogical tree with humans, that would require someone to put a horse as a parent of a human or vice versa. That's a problem with an invalid relationship being added, not the property itself.

On Wed, Aug 26, 2015 at 6:43 PM, Svavar Kjarrval wrote:
>
> On Wed, 26 Aug 2015 13:58, Peter F. Patel-Schneider wrote:
>> I don't think that P21 (https://www.wikidata.org/wiki/Property:P21,
>> sex or gender) is a subclass of P31
>> (https://www.wikidata.org/wiki/Property:P31, instance of).
>> Properties aren't subclasses in general.
>>
>> Perhaps you meant to talk about
>> https://www.wikidata.org/wiki/Property:P21 (sex or gender) being
>> related via https://www.wikidata.org/wiki/Property:P31 (instance of)
>> to https://www.wikidata.org/wiki/Q18608871 (Wikidata property for
>> items about people). This indicates that the property should only be
>> used on people, even though the description of the property itself
>> talks about its use on animals.
>>
>> It appears that Wikidata is not very consistent internally.
>>
>> peter
>>
> Sorry, I'm not used to the Wikidata lingo.
>
> To further explain my point (to which I think you have already agreed):
> If I were to produce code which makes assumptions based on such
> relations, the code would come to the contradiction that a non-human
> with a P21 relation is a human, if it were to travel recursively
> through the hierarchy of declarations. P21 is declared with a
> P31->Q18608871 and Q18608871 is in turn declared P1269->Q5. Unless
> special precautions were taken, anyone trying to generate an
> exhaustive list of all humans on Wikidata (without relying solely on
> the direct declaration on each item) might find themselves with
> non-humans on that list due to travelling backwards via such relations.
>
> In essence, it seems like P21 either wrongfully allows definitions of
> genders of non-humans or that the property is too broad for a
> declaration of P31->Q18608871.
>
> - Svavar Kjarrval
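The difference between the two approaches discussed in this message can be illustrated with a small sketch over toy data. Only the identifiers P21, P31, P1269, Q5 and Q18608871 come from the thread; the item names, the placeholder values and the dictionary layout are assumptions made for illustration, and no real Wikidata API is used.

```python
# Toy statements: item -> property -> list of values (all items are made up).
items = {
    "example_human": {"P31": ["Q5"], "P21": ["placeholder_male"]},
    "example_horse": {"P31": ["placeholder_horse_class"], "P21": ["placeholder_male"]},
}

# Declarations on the property and on its class, as described in the thread:
# P21 is declared P31 -> Q18608871, and Q18608871 is declared P1269 -> Q5.
property_statements = {"P21": {"P31": ["Q18608871"]}}
class_statements = {"Q18608871": {"P1269": ["Q5"]}}


def humans_direct(items):
    """Items directly declared as instance of (P31) human (Q5)."""
    return sorted(q for q, st in items.items() if "Q5" in st.get("P31", []))


def humans_inferred(items):
    """Naive inference: any item using a property whose class points at Q5."""
    found = set()
    for qid, statements in items.items():
        for prop in statements:
            for cls in property_statements.get(prop, {}).get("P31", []):
                if "Q5" in class_statements.get(cls, {}).get("P1269", []):
                    found.add(qid)
    return sorted(found)
```

With this toy data, `humans_direct(items)` contains only the human item, while `humans_inferred(items)` also picks up the horse, which is the contradiction Svavar describes.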
Re: [Wikidata] Properties for family relationships in Wikidata
Well, I decided to be bold (that is often the road to reversion, but let's get the ball rolling): Tarok[1] now has Pay Dirt[2] as his father. B.B.S. Sugarlight[3] now has Sugarsweet Sid[4] as his mother, and she has Sugarcane Hanover[5] as her father.

1 https://www.wikidata.org/wiki/Q12338810
2 https://www.wikidata.org/wiki/Q12331109
3 https://www.wikidata.org/wiki/Q20872428
4 https://www.wikidata.org/wiki/Q20873813
5 https://www.wikidata.org/wiki/Q12003911

When I asked about this on Facebook, the first answer was "Random guess: Check out Secretariat. My guess is that it has been registered thoroughly." Now the quest is to connect Secretariat, Tarok and Sugarcane Hanover. :-)

On Wed, Aug 26, 2015 at 9:24 PM, Joe Filceolaire wrote:
> Every other ontology mixes humans with fictional characters and with groups
> of humans and possibly fictional humans (biblical characters for instance).
> Wikidata has gone to a lot of trouble to try to untangle these into separate
> classes. Anyone trying to get an exhaustive list of humans and not using
> deserves everything he gets.
>
> P21 (sex or gender) is very explicitly specified as being usable for humans
> and for other creatures. At the request of some languages we have separate
> items for 'female human' and for 'female creature' (we have the same for
> male). 'Female human' is 'subclass of: female creature'. Relying on P21 to
> tell if something is or is not human is not recommended, as it will probably
> miss out all the humans who are neither male nor female - Wikidata has about
> a dozen other values that can be used with this property.
>
> Father (P22) and mother (P25) can perfectly well be used for non-humans, and
> if the current constraints on these properties flag this as a problem then
> the constraints will have to be updated. I expect to see extensive pedigrees
> for racehorses entered in Wikidata. Note that there is a proposal under
> consideration to replace P22 and P25 with a single 'parent' property.
>
> Hope this helps
>
> Joe
>
> On Wed, 26 Aug 2015 18:44 Svavar Kjarrval wrote:
>>
>> On Wed, 26 Aug 2015 13:58, Peter F. Patel-Schneider wrote:
>>> I don't think that P21 (https://www.wikidata.org/wiki/Property:P21,
>>> sex or gender) is a subclass of P31
>>> (https://www.wikidata.org/wiki/Property:P31, instance of).
>>> Properties aren't subclasses in general.
>>>
>>> Perhaps you meant to talk about
>>> https://www.wikidata.org/wiki/Property:P21 (sex or gender) being
>>> related via https://www.wikidata.org/wiki/Property:P31 (instance of)
>>> to https://www.wikidata.org/wiki/Q18608871 (Wikidata property for
>>> items about people). This indicates that the property should only be
>>> used on people, even though the description of the property itself
>>> talks about its use on animals.
>>>
>>> It appears that Wikidata is not very consistent internally.
>>>
>>> peter
>>>
>> Sorry, I'm not used to the Wikidata lingo.
>>
>> To further explain my point (to which I think you have already agreed):
>> If I were to produce code which makes assumptions based on such
>> relations, the code would come to the contradiction that a non-human
>> with a P21 relation is a human, if it were to travel recursively
>> through the hierarchy of declarations. P21 is declared with a
>> P31->Q18608871 and Q18608871 is in turn declared P1269->Q5. Unless
>> special precautions were taken, anyone trying to generate an
>> exhaustive list of all humans on Wikidata (without relying solely on
>> the direct declaration on each item) might find themselves with
>> non-humans on that list due to travelling backwards via such relations.
>>
>> In essence, it seems like P21 either wrongfully allows definitions of
>> genders of non-humans or that the property is too broad for a
>> declaration of P31->Q18608871.
>>
>> - Svavar Kjarrval

--
http://palnatoke.org * @palnatoke * +4522934588
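The "quest" of connecting the pedigrees could be checked mechanically once the father (P22) and mother (P25) statements exist. Below is a minimal sketch that uses only the three statements mentioned in this message; the dictionary layout and the function names are assumptions, and a real check would fetch the statements from Wikidata rather than hard-code them.

```python
from collections import deque

# Parent statements from the message: item -> {property: parent item}.
parents = {
    "Q12338810": {"P22": "Q12331109"},  # Tarok: father Pay Dirt
    "Q20872428": {"P25": "Q20873813"},  # B.B.S. Sugarlight: mother Sugarsweet Sid
    "Q20873813": {"P22": "Q12003911"},  # Sugarsweet Sid: father Sugarcane Hanover
}


def ancestors(qid, parents):
    """Collect all known ancestors of an item by following P22/P25 links."""
    seen = set()
    queue = deque([qid])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, {}).values():
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def common_ancestors(a, b, parents):
    """Ancestors shared by two items; empty if the pedigrees are not yet linked."""
    return ancestors(a, parents) & ancestors(b, parents)
```

With only these three statements, Tarok and B.B.S. Sugarlight share no recorded ancestor, so the pedigrees are not yet connected.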