Re: [Wikidata-l] Wikidata for Wiktionary
Daniel's answer fits exactly with the proposal (which is unsurprising, because he reviewed and certainly influenced it). To make it clear again: the proposal at https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05 is a proposal for the tasks that need to be performed. Your questions are mostly about the data model, which was discussed earlier in the following proposal: https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2013-08

Since I am not sure which questions remain open, I will try to address them here again, at the risk of repeating what has been said before. Unfortunately you seem not to use the terminology as defined in the second proposal linked above, which makes the discussion unnecessarily harder than it needs to be. If you prefer another terminology, I would be happy if you linked to a one-pager describing it, so that we can communicate effectively.

> How do we go from a spelled form of a lexeme at Wiktionary to an identifier on Wikidata?

If by "spelled form of a lexeme at Wiktionary" you mean a Form as per the proposal, then the answer is: Forms have statements, and statements may point to Items, Forms, Senses, Lexemes, etc. The exact properties to be used in these statements are up to the community. If by "spelled form of a lexeme at Wiktionary" you mean a Lexeme as per the proposal, then the answer is: Lexemes have statements, and statements may point to Items, Forms, Senses, Lexemes, etc. The exact properties to be used in these statements are up to the community. This is already stated in the second link above.

> And how do we go from one Sense to another synonym Sense?

A Sense has a set of statements, and statements may point to other Senses. The exact properties used are up to the community. So a statement with the property 'synonym' stated on a Sense could point to another Sense.

> Do we use statements?

Yes.

> But then only the L-identifiers can be used, so we will link them at the Lexeme level.

No.
As the second link above says, Senses and Forms also have statements. It is not only Lexemes that have statements.

> Wiktionary is organized around homonyms while Wikipedia is organized around synonyms, especially across languages, and I think this difference creates some of the problems.

Yes, that is why Tasks 1, 2, 9 and 10 in the proposal for the task breakdown, the first link above, deal with exactly this question. Since Gerard stated that his question was subsumed by the above list, I hope that his question is also answered?

I am afraid that I could not write a new proposal which is significantly clearer than the current one, but I can keep answering questions. All the questions you have asked seem to be explicitly answered in the two links given above. Since I know you are smart, I am wondering what is not working in the communication right now. Did you miss the first link? Without it, it is indeed hard to fully understand the second link (but the first link is already given in the second link). So, please, keep asking questions. And everyone else too. I would like to continue improving the proposals based on your questions and suggestions.

On Sat, May 16, 2015 at 3:46 PM John Erling Blad jeb...@gmail.com wrote: Your description is pretty far from what's in the proposal right now. The proposal is not clear at all, so I would say update it and resubmit it for a new discussion.

On Sat, May 16, 2015 at 12:21 PM, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: On 15.05.2015 at 01:11, John Erling Blad wrote:

> How do we go from a spelled form of a lexeme at Wiktionary to an identifier on Wikidata?

What do you mean by "go to"? And what do you mean by "identifier on Wikidata" - Items, Lexemes, Senses, or Forms? Generally, Wiktionary currently combines words with the same rendering from different languages on a single page. So a single Wiktionary page would correspond to several Lexeme entries on Wikidata, since Lexemes on Wikidata would be split per language.
I suppose a Lexeme entry could be linked back to the corresponding pages on the various Wiktionaries, but I don't really see the value of that, and sitelinks are currently not planned for Lexeme entries. It probably makes more sense for the Wiktionary pages to explicitly reference the Wikidata Lexeme that corresponds to each language section on the page.

> And how do we go from one Sense to another synonym Sense? Do we use statements? But then only the L-identifiers can be used, so we will link them at the Lexeme level.

Why can only L-identifiers be used? Senses (and Forms) are entities and have identifiers. They wouldn't have a wiki page of their own, but that's not a problem. The intention is that it's possible for one Sense to have a statement referring directly to another Sense (of the same or a different Lexeme).

> Wiktionary is organized around homonyms while Wikipedia is organized around
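The model discussed in this thread (every entity type - Lexeme, Form, Sense - carries its own statements, and a statement on a Sense can point directly at another Sense) can be sketched roughly as follows. Note this is a minimal illustrative sketch: the entity IDs such as "L1-S1" and the 'synonym' property name are assumptions, since no identifier scheme or property set had been fixed at the time of this thread.

```python
class Entity:
    """Any entity (Lexeme, Form, Sense, Item) carries its own statements."""

    def __init__(self, entity_id):
        self.id = entity_id
        self.statements = []  # list of (property, target entity) pairs

    def add_statement(self, prop, target):
        self.statements.append((prop, target))


# A Lexeme groups Forms and Senses; all three can hold statements.
lexeme = Entity("L1")          # e.g. the English lexeme "big"
form = Entity("L1-F1")         # a Form: e.g. the comparative "bigger"
sense = Entity("L1-S1")        # a Sense: e.g. "of great size"
other_sense = Entity("L2-S1")  # a Sense of a different Lexeme, e.g. "large"

# A statement with a (hypothetical) 'synonym' property on a Sense points
# directly at another Sense -- no detour via the Lexeme level is needed.
sense.add_statement("synonym", other_sense)

assert sense.statements[0][1].id == "L2-S1"
```

The point of the sketch is just that statements attach to Senses and Forms themselves, not only to Lexemes, which is what the "only L-identifiers can be used" objection assumed.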
Re: [Wikidata-l] Wikidata for Wiktionary
John, sorry, I guess I was too slow - as far as I understand you have now re-read the 13-08 proposal, which has made my last email redundant. https://www.wikidata.org/w/index.php?title=Wikidata_talk:Wiktionary/Development/Proposals/2015-05&diff=216035102&oldid=216029531

I hope that the model is clear now. Thanks for your engagement! Denny

On Sun, May 17, 2015 at 12:20 PM Denny Vrandečić vrande...@gmail.com wrote: Daniel's answer fits exactly with the proposal (which is unsurprising, because he reviewed and certainly influenced it). To make it clear again: the proposal at https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05 is a proposal for the tasks that need to be performed. Your questions are mostly about the data model, which was discussed earlier in the following proposal: https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2013-08 Since I am not sure which questions remain open, I will try to address them here again, at the risk of repeating what has been said before. Unfortunately you seem not to use the terminology as defined in the second proposal linked above, which makes the discussion unnecessarily harder than it needs to be. If you prefer another terminology, I would be happy if you linked to a one-pager describing it, so that we can communicate effectively.

> How do we go from a spelled form of a lexeme at Wiktionary to an identifier on Wikidata?

If by "spelled form of a lexeme at Wiktionary" you mean a Form as per the proposal, then the answer is: Forms have statements, and statements may point to Items, Forms, Senses, Lexemes, etc. The exact properties to be used in these statements are up to the community. If by "spelled form of a lexeme at Wiktionary" you mean a Lexeme as per the proposal, then the answer is: Lexemes have statements, and statements may point to Items, Forms, Senses, Lexemes, etc. The exact properties to be used in these statements are up to the community.
This is already stated in the second link above.

> And how do we go from one Sense to another synonym Sense?

A Sense has a set of statements, and statements may point to other Senses. The exact properties used are up to the community. So a statement with the property 'synonym' stated on a Sense could point to another Sense.

> Do we use statements?

Yes.

> But then only the L-identifiers can be used, so we will link them at the Lexeme level.

No. As the second link above says, Senses and Forms also have statements. It is not only Lexemes that have statements.

> Wiktionary is organized around homonyms while Wikipedia is organized around synonyms, especially across languages, and I think this difference creates some of the problems.

Yes, that is why Tasks 1, 2, 9 and 10 in the proposal for the task breakdown, the first link above, deal with exactly this question. Since Gerard stated that his question was subsumed by the above list, I hope that his question is also answered?

I am afraid that I could not write a new proposal which is significantly clearer than the current one, but I can keep answering questions. All the questions you have asked seem to be explicitly answered in the two links given above. Since I know you are smart, I am wondering what is not working in the communication right now. Did you miss the first link? Without it, it is indeed hard to fully understand the second link (but the first link is already given in the second link). So, please, keep asking questions. And everyone else too. I would like to continue improving the proposals based on your questions and suggestions.

On Sat, May 16, 2015 at 3:46 PM John Erling Blad jeb...@gmail.com wrote: Your description is pretty far from what's in the proposal right now. The proposal is not clear at all, so I would say update it and resubmit it for a new discussion.
On Sat, May 16, 2015 at 12:21 PM, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: On 15.05.2015 at 01:11, John Erling Blad wrote:

> How do we go from a spelled form of a lexeme at Wiktionary to an identifier on Wikidata?

What do you mean by "go to"? And what do you mean by "identifier on Wikidata" - Items, Lexemes, Senses, or Forms? Generally, Wiktionary currently combines words with the same rendering from different languages on a single page. So a single Wiktionary page would correspond to several Lexeme entries on Wikidata, since Lexemes on Wikidata would be split per language. I suppose a Lexeme entry could be linked back to the corresponding pages on the various Wiktionaries, but I don't really see the value of that, and sitelinks are currently not planned for Lexeme entries. It probably makes more sense for the Wiktionary pages to explicitly reference the Wikidata Lexeme that corresponds to each language section on the page.

> And how do we go from one Sense to another synonym Sense? Do we use statements? But then only
Re: [Wikidata-l] Wikidata for Wiktionary
I very much appreciate OmegaWiki - it has been a trailblazer for many of the ideas in Wikidata, and as you say, it is the granddaddy in many ways. OmegaWiki has been looked into extensively, and the results of that analysis have flowed directly into the current proposal. The write-up of that analysis can be found here: https://www.wikidata.org/wiki/Wikidata:Comparison_of_Projects_and_Proposals_for_Wiktionary

On Fri, May 8, 2015 at 11:46 AM Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Please do appreciate that OmegaWiki, originally WiktionaryZ, really wants to be considered in all this. It is the granddaddy of Wikidata and it does combine everything you would want as far as lexical data is concerned. Thanks, GerardM

On 8 May 2015 at 18:18, Denny Vrandečić vrande...@gmail.com wrote: I very much agree with Lydia and Nemo that there should not be a separate Wikibase instance for Wiktionary data. Having a single community in a single project - not having to vote for admins here and there, keep two different watchlists, repeat documentation, rediscuss policies, etc. - sounds like a smart move. Also, the Item data and the lexical data would be much more tightly connected than with any other project, and queries should be able to work seamlessly between them. The only reason Commons is proposed to have its own instance is because the actual multimedia files are there, and the community caring about those files is there and should work in one place. If there were only a single Wiktionary project, it might also be worth considering having the structured data there - but since there are more than 150 editions of Wiktionary, a centralized place makes more sense. And since we already have Wikidata for that, I don't see the advantage of splitting the potential communities.

On Fri, May 8, 2015 at 8:35 AM Luca Martinelli martinellil...@gmail.com wrote: 2015-05-08 15:33 GMT+02:00 Federico Leva (Nemo) nemow...@gmail.com: +1.
> The Wikimedia community has long been able to think of all the Wikimedia projects as an organic whole. Software, on the other hand, has too often forced unnatural divisions. Wiktionary, Wikipedia, Commons and Wikiquote (to name the main cases) link to each other all the time in a constructive division of labour. It makes no sense to make connections between them harder.

I start from here, since Nemo got the point IMHO: the fact that every project has its own scope doesn't imply that the whole of the community works on different scopes - we just decided to split up our duties among ourselves. But it's not just that.

TL;DR: Wikidata and Wiktionary deal with the same things (concepts), and are therefore well suited for each other, given some needed adaptations. Structured Data and Structured Wikiquote deal with different things (objects), and are therefore not good examples to follow.

Long version: In theory, one might just agree that a separate instance of Wikibase might be the best solution for Wiktionary, but Structured Data and Structured Wikiquote are different from a theoretical Structured Wiktionary, because they respectively deal with images, quotes and words. Images and quotes are describable *objects*, as the Wiki* articles/pages are, and there are billions and billions of those objects out there. This is the main, if not the only, reason why we *have* to put up a separate instance of Wikibase to deal with them: thinking that Wikidata might deal with such an infinite task is just nuts. Words, on the other hand, are describable *concepts*, not objects. They can be linked to one another by relation, they have synonyms and opposites, they can be regrouped or separated, etcetera, which is exactly what we're currently doing with Wikidata items.
I know, words are even more numerous than images and quotes, so it would be even more nuts to think of dealing with this just with Wikidata - but Wikidata is *already* structured for dealing with concepts, making it the best choice for integrating data from Wiktionary. In other words, Wikidata and Wiktionary both work with *concepts*, while all the other projects work with *objects*.

From a more practical point of view, why should I have a Wikidata item about, say, present tense[1] *AND* a completely similar item on Structured Wiktionary? It's the same concept; why should I have it in two different-yet-linked databases, belonging to and maintained by the very same community? Why can't we work something out to keep all information in just one database? This is why I think that setting up a separate Wikibase for Wiktionary might end up doubling our efforts and splitting our communities, which is exactly the opposite of what we need to do (halving the efforts and doubling the community).[2] Sorry for the long post. :)

[1] https://www.wikidata.org/wiki/Q192613
[2] Not sure if I have to remark this, but please, PLEASE, note this is just
Re: [Wikidata-l] Wikidata for Wiktionary
I am not sure I understand what you are saying. The lexical data in Wikidata does allow for statements on Lexemes and Forms, as the proposal states explicitly.

On Thu, May 7, 2015 at 9:25 PM Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Given the opposition to having statements on the level of the label, it does not make sense to have Wiktionary included in Wikidata. Thanks, GerardM

On 8 May 2015 at 06:19, Denny Vrandečić vrande...@gmail.com wrote: I would disagree with requiring the Wiktionary communities to change their ways. Instead we should adapt our plans to fit into the way they are set up. Even if the English Wiktionary community would change to have per-language pages instead of the current system, it would be rather unlikely that all other language editions of Wiktionary would follow in a timely manner. I would prefer to leave this decision to the autonomy of the projects, and instead adapt to them (which is, by the way, what the proposal does).

Yair, as Daniel said, the current Wiktionary pages would not be mapped to Q-Items. Since this was unclear, I tried to update the text to make it clearer. Let me know if it is still confusing.

I do not think a separate Wikibase instance would be needed to provide the data for Wiktionary. I think this can and should be done on Wikidata. But as said by Milos and pointed out by Gerard, lexical knowledge does indeed require a different data schema. This is why the proposal introduces new entity types for lexemes, forms, and senses. The data model is mostly based on lexical ontologies that we surveyed, like LEMON and others.

On Thu, May 7, 2015 at 2:26 PM Federico Leva (Nemo) nemow...@gmail.com wrote: Andy Mabbett, 07/05/2015 22:53:

> The Wiktionary communities tend to strongly disagree that splitting entries per language would be easier for either editors or readers. How many languages are currently used? How will this scale to ~300 languages?

Hm?
Last time I counted, the English Wiktionary alone used way more than 300 languages.

Nemo

_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Wikidata for Wiktionary
I mean, the lexical data in Wikidata according to the proposal would allow for statements on Lexemes and Forms. I slipped into the future for a moment ;)

On Thu, May 7, 2015 at 9:32 PM Denny Vrandečić vrande...@gmail.com wrote: I am not sure I understand what you are saying. The lexical data in Wikidata does allow for statements on Lexemes and Forms, as the proposal states explicitly.

On Thu, May 7, 2015 at 9:25 PM Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Given the opposition to having statements on the level of the label, it does not make sense to have Wiktionary included in Wikidata. Thanks, GerardM

On 8 May 2015 at 06:19, Denny Vrandečić vrande...@gmail.com wrote: I would disagree with requiring the Wiktionary communities to change their ways. Instead we should adapt our plans to fit into the way they are set up. Even if the English Wiktionary community would change to have per-language pages instead of the current system, it would be rather unlikely that all other language editions of Wiktionary would follow in a timely manner. I would prefer to leave this decision to the autonomy of the projects, and instead adapt to them (which is, by the way, what the proposal does).

Yair, as Daniel said, the current Wiktionary pages would not be mapped to Q-Items. Since this was unclear, I tried to update the text to make it clearer. Let me know if it is still confusing.

I do not think a separate Wikibase instance would be needed to provide the data for Wiktionary. I think this can and should be done on Wikidata. But as said by Milos and pointed out by Gerard, lexical knowledge does indeed require a different data schema. This is why the proposal introduces new entity types for lexemes, forms, and senses. The data model is mostly based on lexical ontologies that we surveyed, like LEMON and others.
On Thu, May 7, 2015 at 2:26 PM Federico Leva (Nemo) nemow...@gmail.com wrote: Andy Mabbett, 07/05/2015 22:53:

> The Wiktionary communities tend to strongly disagree that splitting entries per language would be easier for either editors or readers. How many languages are currently used? How will this scale to ~300 languages?

Hm? Last time I counted, the English Wiktionary alone used way more than 300 languages.

Nemo
Re: [Wikidata-l] Wikidata for Wiktionary
I would disagree with requiring the Wiktionary communities to change their ways. Instead we should adapt our plans to fit into the way they are set up. Even if the English Wiktionary community would change to have per-language pages instead of the current system, it would be rather unlikely that all other language editions of Wiktionary would follow in a timely manner. I would prefer to leave this decision to the autonomy of the projects, and instead adapt to them (which is, by the way, what the proposal does).

Yair, as Daniel said, the current Wiktionary pages would not be mapped to Q-Items. Since this was unclear, I tried to update the text to make it clearer. Let me know if it is still confusing.

I do not think a separate Wikibase instance would be needed to provide the data for Wiktionary. I think this can and should be done on Wikidata. But as said by Milos and pointed out by Gerard, lexical knowledge does indeed require a different data schema. This is why the proposal introduces new entity types for lexemes, forms, and senses. The data model is mostly based on lexical ontologies that we surveyed, like LEMON and others.

On Thu, May 7, 2015 at 2:26 PM Federico Leva (Nemo) nemow...@gmail.com wrote: Andy Mabbett, 07/05/2015 22:53:

> The Wiktionary communities tend to strongly disagree that splitting entries per language would be easier for either editors or readers. How many languages are currently used? How will this scale to ~300 languages?

Hm? Last time I counted, the English Wiktionary alone used way more than 300 languages.

Nemo
Re: [Wikidata-l] Wikidata for Wiktionary
The work on queries and arbitrary access is well on its way, and the new UI is also continually being developed and deployed. I don't think that it is too early to think about, and gather consensus on, what the steps for Wiktionary could look like. I am certainly not proposing to stop the current work on queries, but merely to create realistic tasks for the Wiktionary phase of Wikidata.

On Wed, May 6, 2015, 21:54 Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Would it not make sense to FIRST finish a few things... like Commons and Query? Thanks, GerardM

On 7 May 2015 at 04:54, Denny Vrandečić vrande...@gmail.com wrote: It is rather clear that everyone wants Wikidata to also support Wiktionary, and there have been plenty of proposals in the last few years. I think that the latest proposals are sufficiently similar to go for the next step: a breakdown of the tasks needed to get this done. Currently, the idea of having Wikidata support Wiktionary is stalled because it is regarded as one large monolithic task, and as such it is hard to plan and commit to. I tried to come up with a task breakdown, discussed it with Lydia and Daniel, and now, as said in the last office hour, here it is for discussion and community input: https://www.wikidata.org/wiki/Wikidata:Wiktionary/Development/Proposals/2015-05

I think it would be really awesome if we started moving in this direction. Wiktionary supported by Wikidata could quickly become one of the crucial pieces of infrastructure for the Web as a whole, and in particular for Wikipedia and its future development. Cheers, Denny
Re: [Wikidata-l] novalue in qualifiers or references
Actually, I think that having "no value" for the end date qualifier probably means that it has not ended yet. There is no other way to express whether this information is currently merely incomplete (i.e. it has ended, but no one bothered to fill it in) or not (i.e. it has not ended yet). This is pretty much the same use case as for normal claims. Another qualifier where an explicit "no value" would make sense is P678, I guess. In references it might make sense to state explicitly that the source does not have an issue number or an ISSN, etc., in order, for example, to allow cleanup of references and to distinguish the cases where a reference genuinely does not have a given value from those where it is merely incomplete. I don't have super-strong arguments, as you see (I would have much stronger arguments for "unknown value"), but I would prefer not to explicitly forbid "no value" in those cases, because it might be useful and it is already there.

[1] https://www.wikidata.org/wiki/Special:WhatLinksHere/Q18615010

On Thu, Apr 23, 2015 at 1:27 PM, Stas Malyshev smalys...@wikimedia.org wrote: Hi! I was lately looking into the use of "novalue" in Wikidata, specifically in qualifiers and references. While the use of "novalue" in property values is pretty clear to me, I am not sure it is as useful in qualifiers and refs. Example: https://www.wikidata.org/wiki/Q62#P6 As we can see, Edwin Mah Lee is the mayor of San Francisco, with the end date set to "novalue". I wonder how useful this is - most entries like this just omit the end date, and if we query this in SPARQL, for example, we would do something like FILTER NOT EXISTS { ?statement q:P582 ?enddate }. Inconsistently having "novalue" there makes it harder to process both visually (instead of just looking for entries with no end date, we need to look for either no end date or an end date with the specific "novalue") and automatically. And in the overwhelming majority of cases I feel that "novalue" and the absence of a value model exactly the same fact - it is a current event, etc.
Is there any useful case for using "novalue" there? Another example: https://www.wikidata.org/wiki/Q2866#P569 Here we have a reference with "stated in: no value". I don't think I understand what it means - not stated anywhere? How would we know to make such a claim? Is it a lie? Why would we keep confirmed lies in the data? Does it not have a confirmed source that we know of? Many things don't; why would we have "stated in" in this particular case? In summary, it is unclear to me that "novalue" in references is ever useful.

To quantify this: we do not have a lot of such things. In the partial dump I'm working with for WDQS (which contains at least half of the DB) there are 14 "novalue" refs and 13 properties using "novalue" as a qualifier, the leader being P582 with 200+ uses, and 422 uses overall. So volume-wise it's not a big deal, but I'd like to figure out what the right thing to do here is and establish some guidelines.

Thanks,
-- Stas Malyshev smalys...@wikimedia.org
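The three-way distinction this thread is debating - an explicit "novalue" (we know there is no end date: the event is ongoing), a plain value (it ended), and an absent qualifier (nobody filled it in) - can be made concrete with a small sketch. The representation below is purely illustrative and not the actual Wikibase data format or API.

```python
# Sentinel for an explicit "novalue" snak: an asserted absence,
# distinct from the qualifier simply not being present at all.
NOVALUE = object()

def end_date_state(qualifiers):
    """Classify a statement's P582 (end date) qualifier into the three
    states discussed in the thread. `qualifiers` maps property IDs to
    values; this dict shape is an illustrative assumption."""
    if "P582" not in qualifiers:
        return "incomplete"  # absent: maybe ended, maybe not -- unknown
    if qualifiers["P582"] is NOVALUE:
        return "ongoing"     # explicitly asserted: there is no end date
    return "ended"           # a concrete end date is recorded

assert end_date_state({}) == "incomplete"
assert end_date_state({"P582": NOVALUE}) == "ongoing"
assert end_date_state({"P582": "2011-01-11"}) == "ended"
```

This is Denny's argument in code form: if "novalue" were forbidden in qualifiers, the "ongoing" and "incomplete" cases would collapse into one, and the data could no longer say which is which.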
Re: [Wikidata-l] World's largest cities with a female mayor :-)
This is seriously awesome! Thank you!

On Mon, Apr 20, 2015 at 1:18 PM Markus Krötzsch mar...@semantic-mediawiki.org wrote: Hi all, For many years, Denny and I have been giving talks about why we need to improve the data management in Wikipedia. To explain and motivate this, we have often asked the simple question: What are the world's largest cities with a female mayor? The information to answer this is clearly in Wikipedia, but it would be painfully hard to get the result by reading articles. I recently had the occasion of actually phrasing this in SPARQL, so that an answer can now, finally, be given. The query to run at http://milenio.dcc.uchile.cl/sparql is as follows (with some explaining comments inline):

PREFIX : <http://www.wikidata.org/entity/>
SELECT DISTINCT ?city ?citylabel ?mayorlabel
WHERE {
  ?city :P31c/:P279c* :Q515 .  # find instances of subclasses of city
  ?city :P6s ?statement .      # with a P6 (head of government) statement
  ?statement :P6v ?mayor .     # ... that has the value ?mayor
  ?mayor :P21c :Q6581072 .     # ... where the ?mayor has P21 (sex or gender) female
  FILTER NOT EXISTS { ?statement :P582q ?x }  # ... but the statement has no P582 (end date) qualifier
  # Now select the population value of the ?city
  # (the number is reached through a chain of three properties)
  ?city :P1082s/:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .
  # Optionally, find English labels for city and mayor:
  OPTIONAL { ?city rdfs:label ?citylabel . FILTER ( LANG(?citylabel) = "en" ) }
  OPTIONAL { ?mayor rdfs:label ?mayorlabel . FILTER ( LANG(?mayorlabel) = "en" ) }
}
ORDER BY DESC(?population)
LIMIT 100

To see the results, just paste this into the box at http://milenio.dcc.uchile.cl/sparql and press "Run query". The query does not filter the most recent population but relies on Virtuoso to pick the biggest value for DESC sorting, and on the world to have (mostly) cities with increasing population numbers over time.
This is also the reason why the population is not printed (it would give you more than one match per city then, even with DISTINCT). Picking the current population will become easier once ranks are used more widely to mark it. There might also be some inaccuracies in cases where a past mayor does not have an end date set in Wikidata (Madrid has a suspiciously large number of current mayors ...), but a query can only ever be as good as its input data.

I hope this is inspiring to some of you. One could also look for the world's youngest or oldest current mayors with similar queries, for example. Cheers, Markus
[Wikidata-l] Initial release of the primary sources tool
I am happy to let you know about the initial release of the primary sources tool. More info is available here: https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool

The release is meant to facilitate your feedback. There are probably plenty of things that should be fixed before the tool gets widely used. Please report the issues: https://github.com/google/primarysources/issues Even better are pull requests!

A huge shoutout to Sebastian Schaffert (backend) and Thomas Steiner (frontend), who worked on the tool in their 20% time.
Re: [Wikidata-l] ViziData
Any time property, or the birth date property specifically?

On Tue Feb 24 2015 at 10:58:09 AM Maximilian Klein isa...@gmail.com wrote: Next research question: {q | ∃i: instance_of(i, q) ∧ has_time_property(i) ∧ has_geo_property(i)}. In this case we know humans (Q5) are things that have time properties and geo properties. What are all the types of things that have time and geo properties? Make a great day, Max Klein ‽ http://notconfusing.com/

On Mon, Feb 23, 2015 at 10:57 AM, Georg Wild georg.w...@mailbox.tu-dresden.de wrote: On 23.02.2015 19:26, Maximilian Klein wrote:

> Georg, Nice viz! In your example you show births and deaths, so is your dataset {items with date of birth} ∩ {items with place of birth}? In general, are you thinking about visualising items that have both a geo-coordinate and a time-coordinate?

Yes, that is correct - or to be more precise, all items that are humans (instance of Q5) and have both a time- and a geo-coordinate. So if, for example, an item had a birth date specified but no place of birth (or the other way around), it would not be included in the extracted births dataset. Oh, and something that I forgot to mention earlier: the application is quite performance-intensive and is best viewed in Chromium (or Chrome) if possible.

Make a great day, Max Klein ‽ http://notconfusing.com/

On Thu, Feb 19, 2015 at 8:59 AM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Thu, Feb 19, 2015 at 2:46 PM, Georg Wild georg.w...@mailbox.tu-dresden.de wrote: Hello Wikidatans, I'd like to quickly introduce you to ViziData [1], a data visualization app that I wrote as part of my bachelor's thesis last year and will work on improving in the coming months. It displays the geographical and temporal location of events (currently only births and deaths of humans are available in the prototype). The data is extracted from Wikidata with Wikidata Toolkit.
The tool is meant to show an interesting use of the data in Wikidata (especially larger amounts) and can also give an impression of the quality and completeness of the collected data on a larger scale. Planned improvements include: * other datasets to display * a more efficient and useful timeline widget * an embedded tile map for orientation * canvas rendering for performance * information about events (e.g. listing persons who were born at a selected point) * code quality :S The source is available on GitHub under the MIT license [2] and the corresponding paper can be read online [3] (in German only, though). If you have any questions or concerns about this project, feel free to contact me :] Very nice, Georg! Looking forward to more data sets to display. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. -- ☘ excellentiā excelsiōre ☘ ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
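Max's research question above — find the classes q such that instances i of q carry both a time property and a geo property — can be sketched in a few lines. This is a toy illustration only: the item records and the sets of "time" and "geo" property IDs are assumptions for the example, and a real answer would scan the Wikidata dumps (e.g. with Wikidata Toolkit) rather than an in-memory dict.

```python
# Toy data: a handful of items with their classes and the property IDs
# they use. The property groupings below are illustrative assumptions.
items = {
    "Q1339":  {"instance_of": ["Q5"],    "props": {"P569", "P19"}},   # a human: birth date + birthplace
    "Q64":    {"instance_of": ["Q515"],  "props": {"P571", "P625"}},  # a city: inception + coordinates
    "Q11500": {"instance_of": ["Q8502"], "props": {"P625"}},          # a mountain: geo only
}

TIME_PROPS = {"P569", "P570", "P571"}  # birth date, death date, inception
GEO_PROPS = {"P19", "P20", "P625"}     # birthplace, deathplace, coordinates

def classes_with_time_and_geo(items):
    """Collect the classes whose instances have at least one time and one geo property."""
    result = set()
    for item in items.values():
        if item["props"] & TIME_PROPS and item["props"] & GEO_PROPS:
            result.update(item["instance_of"])
    return result

print(classes_with_time_and_geo(items))  # the human and the city qualify; the mountain does not
```

On real data the same set-intersection logic applies, just streamed over millions of items instead of three.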
Re: [Wikidata-l] Call for development openness
Also, Gerard - you are one to quickly chide others for not being constructive in their criticism, and I very much appreciate you doing so. I would like to ask you to reconsider whether your contribution to this thread meets your own threshold for being constructive. Can we please stop being hurtful and dismissive of each other? We have a great project, riding an amazing wave, and there's too much for each one of us to do to afford to hurt each other and make this a place less nice than it could be. On Fri Feb 20 2015 at 1:44:53 PM Denny Vrandečić vrande...@google.com wrote: Regarding Paul's comment: I first heard about Wikidata at SemTech in San Francisco and I was told very directly that they were not interested in working with anybody who was experienced with putting data from generic database in front of users because they had worked so hard to get academic positions and get a grant from the Allen Institute and it is more cost-effective and more compatible with academic advancement to hire a bunch of young people who don't know anything but will follow orders. I am, frankly, baffled by this story. It very likely was me, presenting Wikidata at SemTech in SF, so it probably was me you have been talking with, but I have no recollection of a conversation going the way you describe it. If I remember the timing correctly, I didn't have an academic position at the time of SemTech. Actually, I gave up my academic position to move to Berlin and work on Wikidata. The donors on Wikidata never exercised any influence on the projects, beyond requiring reports on the progress. I cannot imagine that I would ever have said that we were not interested in working with anybody who was experienced with putting data from generic database in front of users, because, really, that would make no sense to say. I also do not remember having gotten an application from you. 
Regarding the team that we wanted and eventually did hire, I would sternly disagree with the description of a bunch of young people who don't know anything but will follow orders - from the applications we got we chose the most suitable team we could pull together. And considering the discussions we had in the following months, following orders was neither their strength nor the qualification they were chosen for. Nor did they consist only of young people. Instead, it turned out, they were exactly the kind of independent thinkers with dedication to the goal and quality that we were aiming for. Fortunately, for the project. Maybe the conversation went differently than you are remembering it. E.g. I would have insisted on building Wikidata on top of MediaWiki (for operational reasons). E.g. I would have insisted that everyone working on Wikidata move to Berlin (because I thought it would be the only possibility to get the project to an acceptable state in the original timeframe, so that we can ensure its future sustainability). E.g. I would have disagreed that RDF/SPARQL backends could, back then, be used out of the box as Wikidata's backend (but I would have been open for anyone showing me that I was wrong, and indeed very happy, because, seriously, I have an unreasonable fondness for SPARQL and RDF). E.g. I would have disagreed that our job as Wikimedia is to spend too many resources on pretty frontends (because that is something the community can do, and as we see, is doing very well - I think Wikimedia should really concentrate on those pieces of work that cannot and are not being done by the community). E.g. I would have insisted on not outsourcing any major part of the development effort to an external service provider. E.g. it could be that we already had all positions filled, and simply no money for more people (really depends on the timing).
So there are plenty of points we might have disagreed on, and which, maybe misunderstood, maybe subtly altered by the passage of time in a fallible memory, have led to the recollection of our conversation that you presented, but, for the reasons mentioned above, I think that your recollection is incorrect. On Fri Feb 20 2015 at 12:42:44 PM Daniel Kinzler daniel.kinz...@wikimedia.de wrote: Hi Paul! I understand your frustration, but let me put a few things into perspective. For reference: I'm employed by WMDE and work on wikibase/wikidata. I have been working on MediaWiki since 2005, and have been paid for it since 2008. On 20.02.2015 at 19:14, Paul Houle wrote: I am not an academic. The people behind Wikidata are. To the extent that most of us have some college degree. The only full academic involved is Markus Krötzsch, who together with Denny Vrandecic developed many of the concepts behind Wikidata. He acts as an advisor to the Wikidata project, but doesn't have any formal position. Oh, we also have a group of students working on their bachelor project with us. I first heard about Wikidata
Re: [Wikidata-l] Call for development openness
Regarding Paul's comment: I first heard about Wikidata at SemTech in San Francisco and I was told very directly that they were not interested in working with anybody who was experienced with putting data from generic database in front of users because they had worked so hard to get academic positions and get a grant from the Allen Institute and it is more cost-effective and more compatible with academic advancement to hire a bunch of young people who don't know anything but will follow orders. I am, frankly, baffled by this story. It very likely was me, presenting Wikidata at SemTech in SF, so it probably was me you have been talking with, but I have no recollection of a conversation going the way you describe it. If I remember the timing correctly, I didn't have an academic position at the time of SemTech. Actually, I gave up my academic position to move to Berlin and work on Wikidata. The donors to Wikidata never exercised any influence on the project, beyond requiring reports on the progress. I cannot imagine that I would ever have said that we were not interested in working with anybody who was experienced with putting data from a generic database in front of users, because, really, that would make no sense to say. I also do not remember having gotten an application from you. Regarding the team that we wanted and eventually did hire, I would sternly disagree with the description of a bunch of young people who don't know anything but will follow orders - from the applications we got we chose the most suitable team we could pull together. And considering the discussions we had in the following months, following orders was neither their strength nor the qualification they were chosen for. Nor did they consist only of young people. Instead, it turned out, they were exactly the kind of independent thinkers with dedication to the goal and quality that we were aiming for. Fortunately, for the project.
Maybe the conversation went differently than you are remembering it. E.g. I would have insisted on building Wikidata on top of MediaWiki (for operational reasons). E.g. I would have insisted that everyone working on Wikidata move to Berlin (because I thought it would be the only possibility to get the project to an acceptable state in the original timeframe, so that we can ensure its future sustainability). E.g. I would have disagreed that RDF/SPARQL backends could, back then, be used out of the box as Wikidata's backend (but I would have been open for anyone showing me that I was wrong, and indeed very happy, because, seriously, I have an unreasonable fondness for SPARQL and RDF). E.g. I would have disagreed that our job as Wikimedia is to spend too many resources on pretty frontends (because that is something the community can do, and as we see, is doing very well - I think Wikimedia should really concentrate on those pieces of work that cannot and are not being done by the community). E.g. I would have insisted on not outsourcing any major part of the development effort to an external service provider. E.g. it could be that we already had all positions filled, and simply no money for more people (really depends on the timing). So there are plenty of points we might have disagreed on, and which, maybe misunderstood, maybe subtly altered by the passage of time in a fallible memory, have led to the recollection of our conversation that you presented, but, for the reasons mentioned above, I think that your recollection is incorrect. On Fri Feb 20 2015 at 12:42:44 PM Daniel Kinzler daniel.kinz...@wikimedia.de wrote: Hi Paul! I understand your frustration, but let me put a few things into perspective. For reference: I'm employed by WMDE and work on wikibase/wikidata. I have been working on MediaWiki since 2005, and have been paid for it since 2008. On 20.02.2015 at 19:14, Paul Houle wrote: I am not an academic. The people behind Wikidata are.
To the extent that most of us have some college degree. The only full academic involved is Markus Krötzsch, who together with Denny Vrandecic developed many of the concepts behind Wikidata. He acts as an advisor to the Wikidata project, but doesn't have any formal position. Oh, we also have a group of students working on their bachelor project with us. I first heard about Wikidata at SemTech in San Francisco and I was told very directly that they were not interested in working with anybody who was experienced with putting data from generic database in front of users because they had worked so hard to get academic positions and get a grant from the Allen Institute and it is more cost-effective and more compatible with academic advancement to hire a bunch of young people who don't know anything but will follow orders. Ouch. Working with such people would be a drag. Luckily, we have an awesome team of full-blooded programmers. Not that we get everything right, or done in time... RDF* and SPARQL* do not
Re: [Wikidata-l] Call for development openness
Also, the problem most SPARQL backend developers worried about was not Wikidata's size, but its dynamicity. Not the number of triples, but the frequency of edits. And we did talk to many of those people. On Thu, Feb 19, 2015, 07:05 Markus Krötzsch mar...@semantic-mediawiki.org wrote: Hi Paul, Re RDF*/SPARQL*: could you send a link? Someone has really made an effort to find the least googleable terminology here ;-) Re relying on standards: I think this argument is missing the point. If you look at what developers in Wikidata are concerned with, it is +90% interface and internal data workflow. This would be exactly the same no matter which data standard you would use. All the challenges of providing a usable UI and a stable API would remain the same, since a data encoding standard does not help with any of this. If you have followed some of the recent discussion on the DBpedia mailing list about the UIs they have there, you can see that Wikidata is already in a very good position in comparison when it comes to exposing data to humans (thanks to Magnus, of course ;-). RDF is great but there are many problems that it does not even try to solve (rightly so). These problems seem to be dominant in the Wikidata world right now. This said, we are in a great position to adopt new standards as they come along. I agree with you on the obvious relationships between Wikidata statements and the property graph model. We are well aware of this. Graph databases are being considered for providing query solutions to Wikidata, and we are considering setting up a SPARQL endpoint for our existing RDF as well. Overall, I don't see a reason why we should not embrace all of these technologies as they suit our purpose, even if they were not available yet when Wikidata was first conceived. Re It is also exciting that vendors are getting on board with this and we are going to be seeing some stuff that is crazy scalable (way past 10^12 facts on commodity hardware) very soon. [which vendors?]
[citation needed] ;-) We would be very interested in learning about such technologies. After the recent end of Titan, the discussion of query answering backends is still ongoing. Cheers, Markus On 18.02.2015 21:25, Paul Houle wrote: What bugs me about it is that Wikidata has gone down the same road as Freebase and Neo4J in the sense of developing something ad hoc that is not well understood. I understand the motivations that lead there, because there are requirements to meet that standards don't necessarily satisfy, plus Wikidata really is doing ambitious things in the sense of capturing provenance information. Perhaps it has come a little too late to help with Wikidata, but it seems to me that RDF* and SPARQL* have a lot to offer for data wikis in that you can view data as plain ordinary RDF and query with SPARQL, but you can also attach provenance and other metadata in a sane way, with sweet syntax for writing it in Turtle or querying it in other ways. Another way of thinking about it is that RDF* is formalizing the property graph model, which has always been ad hoc in products like Neo4J. I can say that knowing what the algebra is you are implementing helps a lot in getting the tools to work right. So you not only have SPARQL queries as a possibility but also languages like Gremlin and Cypher, and this is all pretty exciting. It is also exciting that vendors are getting on board with this and we are going to be seeing some stuff that is crazy scalable (way past 10^12 facts on commodity hardware) very soon. On Tue, Feb 17, 2015 at 12:20 PM, Jeroen De Dauw jeroended...@gmail.com wrote: Hey, As Lydia mentioned, we obviously do not actively discourage outside contributions, and will gladly listen to suggestions on how we can do better. That being said, we are actively taking steps to make it easier for developers not already part of the community to start contributing.
For instance, we created a website about our software itself [0], which lists the MediaWiki extensions and the different libraries [1] we created. For most of our libraries, you can just clone the code and run composer install, and then you're all set. You can make changes, run the tests and submit them back. A different workflow than what you as a MediaWiki developer are used to, perhaps, though quite a bit simpler. Furthermore, we've been quite progressive in adopting practices and tools from the wider PHP community. I definitely do not disagree with you that some things could, and should, be improved. Like you, I'd like to see the Wikibase git repository and the naming of the extensions be aligned more, since it indeed is confusing. Increased API stability, especially of the JavaScript one, is something else on my wish-list, amongst a lot of other
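Paul's point in the thread above — that a Wikidata statement is not a bare triple but a triple plus qualifiers and references, which is exactly what RDF* and the property graph model capture — can be made concrete with a tiny data structure. This is a sketch with illustrative values only (the reference URL is made up), not the actual Wikibase data model.

```python
# A Wikidata-style statement as an edge with attached metadata,
# in the spirit of the property graph model Paul describes.

def make_statement(subject, predicate, obj, qualifiers=None, references=None):
    """Bundle a triple with its qualifiers and references."""
    return {
        "triple": (subject, predicate, obj),
        "qualifiers": qualifiers or {},
        "references": references or [],
    }

# "Douglas Adams (Q42) was educated at (P69) St John's College (Q691283)",
# qualified with an end date and backed by a (made-up) source URL.
stmt = make_statement(
    "Q42", "P69", "Q691283",
    qualifiers={"P582": "1974"},                          # end time
    references=[{"P854": "http://example.org/source"}],   # reference URL, illustrative
)

# In RDF* the same information nests the triple as a subject, roughly:
#   << wd:Q42 wdt:P69 wd:Q691283 >> pq:P582 "1974" .
# whereas plain RDF would need a reification node for the statement.
print(stmt["triple"], stmt["qualifiers"])
```

The point of the comparison: the metadata hangs off the edge itself, not off either endpoint, which is what plain triples cannot express directly.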
Re: [Wikidata-l] Wikidata CACM article (Was: Conflict of Interest policy for Wikidata)
Yes, CC-BY is great. On Thu Jan 08 2015 at 7:01:12 AM Markus Krötzsch mar...@semantic-mediawiki.org wrote: On 08.01.2015 15:10, ja...@j1w.xyz wrote: Prior to viewing Markus Krötzsch's Wikidata page, I was unaware of the Wikidata: A Free Collaborative Knowledgebase article [1] written by Denny Vrandečić and Markus Krötzsch. This is a very helpful article that in my opinion should be featured on the Wikidata main page. Glad you liked it. Checking the Wikidata item, I notice that it is actually Open Access and not all rights reserved. It is available for free (forever) from the ACM [1], but it seems they do not define any license. However, as we have retained all the rights, we can do what we like there. Denny, shall we use CC-BY? Markus [1] http://cacm.acm.org/magazines/2014/10/178785-wikidata/fulltext Regards, James Weaver On Wed, Jan 7, 2015, at 05:14 PM, Markus Krötzsch wrote: Irrespective of the general policy discussion, I have now been bold and changed my item and user page to record that relationship as per my earlier suggestion (as copied below): https://www.wikidata.org/wiki/Q18618630 I was wondering if, given that we have single sign-on, 'website account on' should point to Wikidata or to Wikimedia or something else. But besides this minor point, this seems to be a nice way to have COI declarations in the data (it would also be interesting to know which living people have official Wikimedia accounts). Cheers, Markus On 07.01.2015 15:25, Markus Krötzsch wrote: ... In addition, there should be a template that one can use on one's user page to disclose that one is the person described in a certain item. Conversely, we should also use our 'website account on' property (P553) to connect living people to their Wikidata user account, so the COI is recorded in the data.
One could further disclose other COIs on one's user page in some standard format, but maybe with Wikidata we could actually derive such COIs automatically (your family members, the companies you founded, the university you graduated from, etc. can all be specified in data). Cheers, Markus ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Conflict of Interest policy for Wikidata
<completely-self-serving>Yay! I would love to see it featured on the Wikidata main page! Let's slashdot ACM :)</completely-self-serving> On Thu Jan 08 2015 at 6:11:57 AM ja...@j1w.xyz wrote: Prior to viewing Markus Krötzsch's Wikidata page, I was unaware of the Wikidata: A Free Collaborative Knowledgebase article [1] written by Denny Vrandečić and Markus Krötzsch. This is a very helpful article that in my opinion should be featured on the Wikidata main page. [1] http://cacm.acm.org/magazines/2014/10/178785-wikidata/fulltext Regards, James Weaver On Wed, Jan 7, 2015, at 05:14 PM, Markus Krötzsch wrote: Irrespective of the general policy discussion, I have now been bold and changed my item and user page to record that relationship as per my earlier suggestion (as copied below): https://www.wikidata.org/wiki/Q18618630 I was wondering if, given that we have single sign-on, 'website account on' should point to Wikidata or to Wikimedia or something else. But besides this minor point, this seems to be a nice way to have COI declarations in the data (it would also be interesting to know which living people have official Wikimedia accounts). Cheers, Markus On 07.01.2015 15:25, Markus Krötzsch wrote: ... In addition, there should be a template that one can use on one's user page to disclose that one is the person described in a certain item. Conversely, we should also use our 'website account on' property (P553) to connect living people to their Wikidata user account, so the COI is recorded in the data. One could further disclose other COIs on one's user page in some standard format, but maybe with Wikidata we could actually derive such COIs automatically (your family members, the companies you founded, the university you graduated from, etc. can all be specified in data).
Cheers, Markus ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] [Mediawiki-api] Freebase like API with an OUTPUT feature ?
Actually, since Wikidata now allows properties on properties, one might easily create an item 'Disambiguating property' and then make a claim instance of - Disambiguating property on the relevant property. There is no need for any extra implementation work. On Wed Jan 07 2015 at 9:48:32 AM Thad Guidry thadgui...@gmail.com wrote: Hi Lydia, It's more than that. I can get labels just fine with props=labels. Ideally there would be a number 3, a reconcile service, or an API that can be USED as a reconcile service. Given a search string of Paris, let's say... 1. Return some disambiguating properties and their labels and values. For reconciling purposes, you don't want to deal with codes like P12345 but instead a human-understandable description of the property. a. Allow the output of the information returned to be expanded or reduced by some parameter values that I mentioned as OUTPUT. b. Allow the use of a (disambiguator) parameter to output only the disambiguating properties. (Disambiguating properties are those that are most important when comparing A = B and given a type.) In the Freebase API, we had the option of this as shown here: http://freebase-search.freebaseapps.com/?query=Texas&output=(disambiguator)&limit=100&scoring=entity&lang=en The current disambiguator with Wikidata is actually the descriptions. Wikidata does not flag or mark properties like P856 (official site) as a disambiguating property, an important property. Freebase does, however. It would be nice for Wikidata to begin work on having a disambiguating property flag (boolean Y/N) like Freebase does. The closest starting point for a Reconcile API with the current API structure that I can see is hacking a bit on this one: https://www.wikidata.org/w/api.php?action=wbgetentities&sites=enwiki&titles=Paris&languages=en&props=descriptions|claims Btw, that closest starting point only outputs 1 entity for Paris in the enwiki... where's Paris, Texas?
Thad +ThadGuidry https://www.google.com/+ThadGuidry ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
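The reconcile service Thad describes — rank candidate entities for a search string using "disambiguating" properties plus caller context — can be sketched with toy data. Everything here is an assumption for illustration: the candidate records, the set of flagged properties, and the scoring are made up, since (as Thad notes) Wikidata has no such flag today.

```python
# Hypothetical set of properties flagged as "disambiguating":
# P17 = country, P856 = official website.
DISAMBIGUATING = {"P856", "P17"}

# Toy candidates for the search string "Paris".
candidates = [
    {"id": "Q90",    "label": "Paris", "claims": {"P17": "Q142", "P856": "paris.fr"}},
    {"id": "Q16559", "label": "Paris", "claims": {"P17": "Q30"}},  # Paris, Texas
]

def reconcile(query, context, candidates):
    """Score label matches, then break ties on disambiguating-property matches."""
    scored = []
    for c in candidates:
        score = 1.0 if c["label"].lower() == query.lower() else 0.0
        for prop in DISAMBIGUATING:
            if c["claims"].get(prop) == context.get(prop):
                score += 1.0
        scored.append((score, c["id"]))
    return [cid for _, cid in sorted(scored, reverse=True)]

# A caller who knows the entity is in the USA (Q30) gets Paris, Texas first.
print(reconcile("Paris", {"P17": "Q30"}, candidates))
```

The point of the sketch is the role of the flag: without a machine-readable notion of which properties disambiguate, the service would have to fall back on free-text descriptions, which is exactly the limitation being discussed.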
[Wikidata-l] Conflict of Interest policy for Wikidata
I found out the other day that there's an item about myself, and I wanted to edit it, and got a weird feeling about it. So I raised the question on the project chat https://www.wikidata.org/wiki/Wikidata:Project_chat#COI_and_editing and got told that an RFC would be a good idea. So I tried one. I don't think it has caused problems yet, though - but it might be easier to discuss these things before they cause problems. https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Conflict_of_Interest Input is highly appreciated. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] How to declare a property is transitive, etc.
In OWL this is done through instance of (i.e. rdf:type) pointing at a Transitive Property class (owl:TransitiveProperty). So the most similar representation of that in Wikidata would be to have an item for transitive property, and make an instance of: transitive property statement on the respective property. Obvious caveat: for now this is just a syntactic marker, the system does not do anything special with it. But a SPARQL endpoint with an OWL regime, or an OWL reasoner, could make the inferences if this statement is appropriately translated. Hope that helps, Denny On Thu Dec 18 2014 at 7:45:43 PM Emw emw.w...@gmail.com wrote: Hi all, Could those knowledgeable about OWL or intending to use Wikidata's RDF / OWL exports please weigh in at https://www.wikidata.org/wiki/Wikidata:Property_proposal/Property_metadata#How_should_we_declare_that_a_property_is_transitive ? [1] Being able to declare certain properties of properties is an essential building block for querying and inference. However, the way to declare that a property is, say, transitive in OWL does not have a clear analog in Wikidata syntax. We could certainly shoehorn such a statement into our existing model (and it looks like we'll need to), but it is important to do so in a way that complicates things as little as possible for downstream users, e.g. outside researchers or developers using the RDF exports and assuming standard OWL semantics. Please make any comments on this on-wiki at the location linked above. That way we can keep the discussion centralized. Other discussions on that page could also benefit from input by people knowledgeable about Semantic Web vocabulary. Thanks, Eric https://www.wikidata.org/wiki/User:Emw 1.
Discussion permalink: https://www.wikidata.org/w/index.php?title=Wikidata:Property_proposal/Property_metadata&oldid=182088235#How_should_we_declare_that_a_property_is_transitive ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
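Denny's caveat above — the "instance of: transitive property" statement is only a syntactic marker until a reasoner acts on it — can be illustrated by the inference such a reasoner would draw. Below is a minimal transitive-closure sketch; the "part of" facts are illustrative, not real Wikidata statements.

```python
def transitive_closure(pairs):
    """Infer every (a, c) entailed by chains a->b->c under a transitive property."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# "Mitte part-of Berlin" and "Berlin part-of Germany"
# entail "Mitte part-of Germany" -- the triple an OWL reasoner would add.
facts = {("Mitte", "Berlin"), ("Berlin", "Germany")}
print(transitive_closure(facts))
```

An OWL reasoner does this (and much more) generically for any property typed as owl:TransitiveProperty, which is why the translation of the Wikidata statement into that vocabulary matters for downstream users.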
Re: [Wikidata-l] Tool for adding references and data to Wikidata
Hi Gerard, I very much agree. It would be very good to have a discussion on which kind of data can be integrated in which way. One way or the other, one of the most frequent criticisms of Wikidata is a lack of references, which this tool will tackle along the way as well. And at the same time it will allow for a human curation step, which I think is crucial for the Wikidata community to gain ownership of the data. Just dumping everything into Wikidata is, in my opinion, not a sustainable solution. But since the data will be released freely, well, the community can decide to do it otherwise, obviously. Cheers, Denny On Wed Dec 17 2014 at 8:40:25 AM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Wed, Dec 17, 2014 at 4:04 PM, Lane Rasberry l...@bluerasberry.com wrote: Hello, Where is the appropriate place on Wikidata to discuss this? This is big enough for its own WikiProject. Does it already have one somewhere? Should I make one? Actually I just did. https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase I hardly know what the implications are of this, but it seems big enough to have a dedicated place for discussion. Thanks to Denny for whatever role you had in getting access to well-developed data collected by another project. I do not understand what is happening here, but it seems like really good news, and I hope someone explains it more. Thanks for starting the project, Lane. Will you announce it on the Project chat? That way most people on-wiki will see it and can jump in. Once it has a bit more content we can announce it more widely. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz.
Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] sneak peek time - checking against 3rd party databases
Woohoo, that's pretty awesome! Congrats. Are they going to use the soon-to-be-available property mapping properties? On Tue Dec 02 2014 at 1:33:38 PM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: Hey folks :) The student team working on data quality and trust is hard at work and just showed me a first demo. I wanted to share that with you as well. One part of the team is working on checking Wikidata's data against other databases. Attached is a screenshot of their first demo showing checking of our data against MusicBrainz. In the end this will be nicely integrated on Wikidata, probably as part of the constraint violation reports. This is already working way better than I expected it ever would. So hats off to the students. I'm sure they'll kick ass over the next months. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Various questions
On Tue Nov 11 2014 at 1:51:08 PM Denny Vrandečić vrande...@google.com wrote: +1 for removing the blacklist from the code. On Tue Nov 11 2014 at 12:28:05 AM John Erling Blad jeb...@gmail.com wrote: What did I say, etc, etc, etc... It feels good to be right. I was right. Me. I and myself. Some stuff always bites you, even if it was quite fun! ;) On Tue, Nov 11, 2014 at 9:09 AM, Jeroen De Dauw jeroended...@gmail.com wrote: Hey, I was looking through the configuration trying to debug my issues from my last email and noticed the list of blacklisted IDs. They appear to be numbers with special meaning. I was curious about two things: why are they blacklisted, and what is the meaning of the remaining number? * 1: I imagine that this just refers to #1 * 23: Probably refers to the 23 enigma * 42: Life, the universe and everything * 1337: leet * 9001: ISO 9001, which deals with quality assurance * 31337: Elite I guess we probably ought to delete those default values. They were added for something easter-egg-like in the Wikidata project, and might well get in the way for third-party users. This is also not the list of actual IDs that got blacklisted on Wikidata.org, which was a bit more extensive, and for instance had Q2013, the year in which Wikidata launched. I submitted a removal of these blacklisted IDs from the default config in https://gerrit.wikimedia.org/r/#/c/172504/ The only number that left me lost was 720101010. I couldn't figure this one out. 720101010 is 1337 for trolololo :) Cheers -- Jeroen De Dauw - http://www.bn2vs.com Software craftsmanship advocate Evil software architect at Wikimedia Germany ~=[,,_,,]:3 ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
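The effect of the blacklist config being discussed can be sketched as follows. Note this is an assumption about the mechanism, not the actual Wikibase implementation: the idea is that reserved numbers are skipped during automatic ID allocation so items like Q42 can be assigned deliberately.

```python
# The blacklisted numbers mentioned in the thread.
BLACKLISTED_IDS = {1, 23, 42, 1337, 9001, 31337, 720101010}

def next_entity_id(last_id, blacklist=BLACKLISTED_IDS):
    """Return the next usable numeric entity ID, skipping reserved numbers.

    Hypothetical helper for illustration; Wikibase's real allocator
    lives in the ID-generation code, not in a function like this.
    """
    candidate = last_id + 1
    while candidate in blacklist:
        candidate += 1
    return candidate

print(next_entity_id(41))   # 42 is reserved, so 43
print(next_entity_id(100))  # 101, nothing to skip
```

This also shows why shipping the list as a default value is awkward for third-party Wikibase installs: they inherit someone else's easter eggs unless they override the config, which is exactly the motivation for the removal patch linked above.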
[Wikidata-l] Birthday gift: Missing Wikipedia links (was Re: Wikidata turns two!)
Folks, as you know, many Googlers are huge fans of Wikipedia. So here’s a little gift for Wikidata’s second birthday. Some of my smart colleagues at Google have run a few heuristics and algorithms in order to discover Wikipedia articles in different languages about the same topic which are missing language links between the articles. The results contain more than 35,000 missing links that these algorithms consider high-confidence. We estimate a precision of about 92+% (i.e. we assume that less than 8% of those are wrong, based on our evaluation). The dataset covers 60 Wikipedia language editions. Here are the missing links, available for download from the WMF labs servers: https://tools.wmflabs.org/yichengtry/merge_candidate.20141028.csv The data is published under CC-0. What can you do with the data? Since it is CC-0, you can do anything you want, obviously, but here are a few suggestions: There’s a small tool on WMF labs that you can use to verify the links (it displays the articles side by side for a language pair you select, and then you can confirm or contradict the merge): https://tools.wmflabs.org/yichengtry The tool does not make the change on Wikidata itself, though (we thought it would be too invasive if we did that). Instead, the results of the human evaluation are saved on WMF labs. You are welcome to take the tool and extend it with the possibility to upload the change directly to Wikidata, if you so wish, or, once the data is verified, to upload the results. Also, Magnus Manske is already busy uploading the data to the Wikidata game, so you can very soon also play the merge game on the data directly. He is also creating the missing items on Wikidata. Thanks, Magnus, for a very pleasant cooperation! I want to call out to my colleagues at Google who created the dataset - Jiang Bian and Si Li - and to Yicheng Huang, the intern who developed the tool on labs. 
I hope that this small data release can help a little with further improving the quality of Wikidata and Wikipedia! Thank you all, you are awesome! Cheers, Denny On Wed Oct 29 2014 at 10:52:05 AM Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: Hey folks :) Today Wikidata is turning two. It amazes me what we've achieved in just 2 years. We've built an incredible project that is set out to change the world. Thank you everyone who has been a part of this so far. We've put together some notes and opinions. And there are presents as well! Check them out and leave your birthday wishes: https://www.wikidata.org/wiki/Wikidata:Second_Birthday Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Re: [Wikidata-l] Birthday gift: Missing Wikipedia links (was Re: Wikidata turns two!)
Sure, you can keep all your todos with Google ;) https://www.gmail.com/mail/help/tasks/ Cheers, Denny On Wed Oct 29 2014 at 2:58:03 PM Jeroen De Dauw jeroended...@gmail.com wrote: Hey, Does this mean we can also shoot a TODO list in the direction of Google? :) Cheers -- Jeroen De Dauw - http://www.bn2vs.com Software craftsmanship advocate Evil software architect at Wikimedia Germany ~=[,,_,,]:3
Re: [Wikidata-l] Super Lachaise, a mobile app based on Wikidata
That's a great idea! Just curious, for such a specific use case, why did you go for an app instead of a website? On Tue Oct 28 2014 at 7:29:22 AM Sjoerd de Bruin sjoerddebr...@me.com wrote: Not available in the Dutch iTunes Store... On 28 Oct 2014 at 15:26, Pierre-Yves Beaudouin pierre.beaudo...@gmail.com wrote: I'm happy to announce the release of Super Lachaise on the App Store. It's a free mobile app that helps you during a visit to the Père Lachaise cemetery. This is probably one of the first mobile apps to use Wikidata ;) http://www.superlachaise.fr/ https://itunes.apple.com/fr/app/super-lachaise/id918263934 Pyb
Re: [Wikidata-l] Open Data Awards
Yay! Congratulations! On Mon Oct 27 2014 at 4:55:51 PM John Lewis johnflewi...@gmail.com wrote: Hi everyone, Some exciting news here. The Open Data Awards' finalist lists were recently published on their website. Wikidata has been listed as a finalist in two different categories, the Open Data Innovation Award and the Open Data Publisher Award. Lydia http://www.wikidata.org/wiki/User:Lydia_Pintscher_(WMDE) and Magnus http://www.wikidata.org/wiki/User:Magnus_Manske will be representing Wikidata at the gala dinner where the winner of each category will be announced live. I will be standing in as a backup should Lydia be unable to attend the award dinner, but let's wish Lydia and Magnus a good time and keep our fingers crossed that Wikidata will win at least one of the two categories we've been nominated for. As Lydia would say - the entire community is awesome for working to help build Wikidata to where it is, and this is as much all of our work as it is the development team's for helping build and innovate the way free knowledge is shared within the mission of the Wikimedia Foundation. Thanks, John Lewis -- John Lewis
Re: [Wikidata-l] all human genes are now wikidata items
Wow! That's pretty cool work! Do you have any plans to keep the data fresh? On Mon Oct 06 2014 at 1:22:12 PM Benjamin Good ben.mcgee.g...@gmail.com wrote: I thought folks might like to know that every human gene (according to the United States National Center for Biotechnology Information) now has a representative entity on Wikidata. I hope that these are the seeds for some amazing applications in biology and medicine. Well done Andra and ProteinBoxBot! For example: Here is one (of approximately 40,000) called spinocerebellar ataxia 37 https://www.wikidata.org/wiki/Q18081265 -Ben
Re: [Wikidata-l] How can I increase the throughput of ProteinBoxBot?
That's very cool! To get an idea, how big is your dataset? On Tue Sep 30 2014 at 12:06:56 PM Daniel Kinzler daniel.kinz...@wikimedia.de wrote: What makes it so slow? Note that you can use wbeditentity to perform complex edits with a single API call. It's not as straightforward to use as, say, wbcreateclaim, but much more powerful and efficient. -- daniel On 30.09.2014 at 19:00, Andra Waagmeester wrote: Hi All, I have joined the development team of the ProteinBoxBot (https://www.wikidata.org/wiki/User:ProteinBoxBot). Our goal is to make Wikidata the canonical resource for referencing and translating identifiers for genes and proteins from different species. Currently, adding all genes from the human genome and their related identifiers to Wikidata takes more than a month to complete. With the objective of adding other species, as well as having frequent updates for each of the genomes, it would be convenient if we could increase this throughput. Would it be accepted if we increase the throughput by running multiple instances of ProteinBoxBot in parallel? If so, what would be an accepted number of parallel instances of a bot to run? We can run multiple instances from different geographical locations if necessary. Kind regards, Andra -- Daniel Kinzler Senior Software Developer Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V.
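Daniel's suggestion can be sketched as follows: a minimal Python sketch (no actual HTTP request is made here; the property IDs and values are illustrative, and a real edit would also need authentication and an edit token) of how wbeditentity lets a bot bundle many claims into a single API call instead of one call per statement.

```python
import json

def make_string_claim(prop_id, value):
    """Build the JSON structure of one string-valued claim."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": prop_id,
            "datavalue": {"value": value, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }

def build_edit_params(item_id, claims):
    """Parameters for a single wbeditentity call that adds every
    claim in one edit (token and login omitted in this sketch)."""
    return {
        "action": "wbeditentity",
        "id": item_id,
        "data": json.dumps({"claims": claims}),
        "format": "json",
    }

# Two hypothetical identifier claims bundled into one edit, instead
# of one wbcreateclaim request per statement:
claims = [make_string_claim("P351", "6331"),    # illustrative gene ID
          make_string_claim("P353", "SCN5A")]   # illustrative gene symbol
params = build_edit_params("Q18081265", claims)
```

Batching like this trades API round-trips for a larger single request, which is usually the cheaper side of that trade for a bot importing tens of thousands of items.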
Re: [Wikidata-l] Item both subclass and instance?
Fully agree with Markus' beautifully written explanation, although I am not completely convinced of the level theory - but it seems to work in the given examples, and a few other examples I was thinking through. Note that Porsche 356 could very much be an instance of car model - but not of car. All the rules that Markus has mentioned would stay intact in this case. We often don't make the difference between car and car model in our day to day speech, which is a common source of confusion (i.e. the Porsche 356 is a beautiful car vs the Porsche 356 is a beautiful car model - both would be acceptable in natural language, but alas, not in Wikidata). On Thu, Sep 25, 2014 at 3:53 PM, Markus Krötzsch mar...@semantic-mediawiki.org wrote: Hi, I fully agree with Thomas and the other replies given here. Let me give some other views on these topics (partly overlapping with what was said before). It's important to understand these things to get the subclass of/instance of thing right -- and it would be extremely useful if we could get this right in our data :-) What is a class and what is an item is often a matter of perspective, and it is certainly accepted in the ontology modelling community that one thing may need to be both. The important thing is that subclass of is a relation between *similar* things (usually of the same type): * sports car subclass of car * Porsche Carrera subclass of sports car * Porsche 356 subclass of Porsche Carrera Use A subclass of B if it makes sense to say all A's are also B's as in all Porsche Carreras are sports cars. In contrast, instance of is between things that are very *different* in nature: * Douglas Adams instance of human * human instance of species Subclass naturally forms chains, like in my example. 
You can leave out some part of the chain and the result is still meaningful: * Porsche Carrera subclass of car [makes sense] For instance of, this does not work: * Douglas Adams instance of species [bogus] So if you want to organise things in a hierarchy (specific to general), then you need subclass of. If you just describe the type of one thing, then you need instance of. It is perfectly possible that one thing participates in both types of relationships. In addition to these general guidelines, I would say that a well-modelled ontology should be organised in levels: whenever you use instance of, you go to a higher level; if you use subclass of, you stay on your current level. Each thing should belong to only one level. Here is an example where this is violated: * Porsche Carrera subclass of sports car * Porsche 356 subclass of Porsche Carrera * Porsche 356 instance of sports car Each of these makes sense individually, but the combination is weird. We should make up our mind whether we want to treat Porsche 356 as a class (on the same level as sports car) or as an instance (on a lower level than sports car), but not do both at the same time. I think subclass of usually should be preferred in such a case (because if it is possible to use subclass of, then it is usually also quite likely that more specific items occur later [Porsche 356 v1 or whatever], and we really will need subclass of to build a hierarchy then). Cheers, Markus On 25.09.2014 20:10, Thomas Douillard wrote: Hi, this is a long discussion :) This is allowed by the OWL 2 notion called punning. The rationale is that hydrogen is a chemical element, and that chemical element is not a subclass of atom. Rather, a chemical element is a type of atom, so chemical element is a metaclass: a class of classes of atoms. 
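Markus' level rule can be made concrete with a small sketch (the property IDs P31/P279 and the helper are illustrative, not anything from Wikibase itself): following subclass-of keeps the level, following instance-of raises it by one, and an item that ends up on two different levels violates the rule.

```python
# Illustrative property IDs: P31 = instance of, P279 = subclass of.
INSTANCE_OF, SUBCLASS_OF = "P31", "P279"

def infer_levels(edges, root, root_level=0):
    """Assign a level to every node reachable from root, following the
    rule: subclass-of keeps the level, instance-of raises it by one.
    Returns None if any node would get two different levels."""
    levels = {root: root_level}
    changed = True
    while changed:
        changed = False
        for child, relation, parent in edges:
            if child in levels:
                target = levels[child] + (1 if relation == INSTANCE_OF else 0)
                if parent not in levels:
                    levels[parent] = target
                    changed = True
                elif levels[parent] != target:
                    return None  # inconsistent: node sits on two levels
    return levels

# Markus' problematic combination:
edges = [
    ("Porsche Carrera", SUBCLASS_OF, "sports car"),
    ("Porsche 356", SUBCLASS_OF, "Porsche Carrera"),
    ("Porsche 356", INSTANCE_OF, "sports car"),  # the violating edge
]
print(infer_levels(edges, "Porsche 356"))  # → None (level conflict)
```

Dropping the last edge makes the hierarchy consistent again, with all three items on one level, which matches the advice to prefer subclass of in such cases.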
Re: [Wikidata-l] policy toward using non-CC0 licensed external databases a
On Sep 13, 2014 3:20 PM, P. Blissenbach pu...@web.de wrote: Regarding purely factual data comprising a less than significant portion of a database - which is certainly true for all ISBNs in Google's database Btw., if a statement about an ISBN is sourced, among others, with Source: Google, that does not imply having it from Google. It only states the fact: Google has it, too. Purodha That's also why it is actually called reference and not source.
Re: [Wikidata-l] Language mappings between different wikipedia pages
Hey Marieke, You can either use the Wikidata Toolkit by Markus Krötzsch, if you want to work on the dump, or the Wikidata web API, if you only need a few such mappings at a time. On Jul 17, 2014 9:24 AM, Erp, M.G.J. van marieke.van@vu.nl wrote: Hi there, I was wondering how to get the language mappings between different Wikipedia pages. This information seems to be available on Wikidata, as I can find it through browsing different pages on Wikidata such as http://www.wikidata.org/wiki/Q213710, and https://www.mediawiki.org/wiki/Manual:Langlinks_table mentions a langlinks table, but I can't figure out how to get a dump. The Wiki interlanguage link records at http://dumps.wikimedia.org/wikidatawiki/20140705/ looked promising, but that seems to contain user information if I'm not mistaken. For example, select count(*), ll_title from langlinks group by 2 order by 1 desc limit 20; results in:
+----------+------------------------+
| count(*) | ll_title               |
+----------+------------------------+
|      284 | User:تفکر              |
|      272 | user:OffsBlink         |
|      215 | User:YourEyesOnly      |
|      179 | User:MoiraMoira        |
|       65 | User:AvocatoBot        |
|       35 | User:Shikai shaw       |
|       35 | user:Shuaib-bot        |
|       33 | user:לערי ריינהארט     |
|       33 | User:Leyo              |
|       27 | user:Лобачев Владимир  |
|       20 | User:Wagino 20100516   |
|       18 | user:Gangleri          |
|       17 | user:I18n              |
|       16 | user:Meursault2004     |
|       12 | User:Labant            |
|       11 | User:Stryn             |
|       11 | User:angelia2041       |
|       10 | user:Kelvin            |
|       10 | User:JCIV              |
|        9 | Template:Mbox          |
+----------+------------------------+
I checked out the #mediawiki IRC channel and someone recommended the interwiki link tracking records, but those seem to also contain all sorts of other links, and I don't see a way to filter out the in other languages links. It would be great if you could help me out. Thanks! 
Marieke van Erp -- Computational Lexicology Terminology Lab (CLTL) The Network Institute, VU University Amsterdam De Boelelaan 1105 1081 HV Amsterdam, The Netherlands http://www.mariekevanerp.com http://www.newsreader-project.eu
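For the web API route mentioned above, here is a minimal Python sketch (no network access; the sample response is hypothetical and heavily truncated, though it follows the real wbgetentities JSON shape) of fetching an item's sitelinks, which carry exactly the interlanguage mappings being asked about.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def sitelinks_url(qid):
    """URL of a wbgetentities request returning all sitelinks
    (i.e. the interlanguage links) of one item."""
    return API + "?" + urlencode({"action": "wbgetentities", "ids": qid,
                                  "props": "sitelinks", "format": "json"})

def extract_langlinks(response, qid):
    """Map site codes like 'enwiki' to article titles from a parsed
    wbgetentities response."""
    sitelinks = response["entities"][qid]["sitelinks"]
    return {site: link["title"] for site, link in sitelinks.items()}

# Hypothetical, heavily truncated response illustrating the structure:
sample = {"entities": {"Q213710": {"sitelinks": {
    "enwiki": {"site": "enwiki", "title": "Example"},
    "nlwiki": {"site": "nlwiki", "title": "Voorbeeld"}}}}}
links = extract_langlinks(sample, "Q213710")
```

Fetching `sitelinks_url("Q213710")` and feeding the parsed JSON to `extract_langlinks` would give the live mapping; for all items at once, the dump plus Wikidata Toolkit is the better fit.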
Re: [Wikidata-l] Wikidata Toolkit 0.1.0 released
Hi Markus, On Wed Apr 09 2014 at 4:18:50 AM, Markus Krötzsch mar...@semantic-mediawiki.org wrote: Change to the directory of the example module (wdtk-examples), then run: mvn exec:java -Dexec.mainClass=org.wikidata.wdtk.examples.DumpProcessingExample Thanks, that is exactly what I needed! :) I understand that WDTK is a library to be used in your own applications, but I am often not patient enough to actually go and code up a whole app myself in a new dev environment before I actually see that the thing is running. So being able to actually start and run the example application is super useful for my motivation, because now I can go ahead and tinker with it while it is running, and iteratively change it into what I want. Thanks again for the prompt and useful answer! It works like a charm now! Cheers, Denny
[Wikidata-l] Wikidata Toolkit 0.1.0 released
I was trying to use this, but my Java is a bit rusty. How do I run the DumpProcessingExample? I did the following steps: git clone https://github.com/Wikidata/Wikidata-Toolkit cd Wikidata-Toolkit mvn install mvn test Now, how do I start DumpProcessingExample? Sorry for being a bit dense here. Cheers, Denny On Mon Mar 31 2014 at 6:47:21 AM, Markus Krötzsch mar...@semantic-mediawiki.org wrote: Dear all, I am happy to announce the very first release of Wikidata Toolkit [1], the Java library for programming with Wikidata and Wikibase. This initial release can download and parse Wikidata dump files for you, so as to process all Wikidata content in a streaming fashion. An example program is provided [2]. The library can also be used with MediaWiki dumps generated by other Wikibase installations (if you happen to work in EAGLE ;-). Maven users can get the library directly from Maven Central (see [1]); this is the preferred method of installation. There is also an all-in-one JAR at github [3] and of course the sources [4]. Version 0.1.0 is of course alpha, but the code that we have is already well-tested and well-documented. Improvements that are planned for the next release include: * Faster and more robust loading of Wikibase dumps * Support for various serialization formats, such as JSON and RDF * Initial support for Wikibase API access Nevertheless, you can already give it a try now. In later releases, it is also planned to support more advanced processing after loading, especially for storing and querying the data. Feedback is welcome. Developers are also invited to contribute via github. 
Cheers, Markus [1] https://www.mediawiki.org/wiki/Wikidata_Toolkit [2] https://github.com/Wikidata/Wikidata-Toolkit/blob/v0.1.0/wdtk-examples/src/main/java/org/wikidata/wdtk/examples/DumpProcessingExample.java [3] https://github.com/Wikidata/Wikidata-Toolkit/releases (you'll also need to install the third-party dependencies manually when using this) [4] https://github.com/Wikidata/Wikidata-Toolkit/
Re: [Wikidata-l] qLabel
That's a toughie. Looking forward to seeing that one resolved :) On Wed, Apr 2, 2014 at 2:14 AM, Andy Mabbett a...@pigsonthewing.org.uk wrote: On 1 April 2014 20:01, Denny Vrandečić vrande...@google.com wrote: a bug on the github project I've raised another, about the use of adjectives and adverbs: https://github.com/googleknowledge/qlabel/issues/2 -- Andy Mabbett @pigsonthewing http://pigsonthewing.org.uk
Re: [Wikidata-l] qLabel
Yes. That is why qLabel has a mechanism to implement your own loaders. The Wikidata and Freebase loaders are much more efficient than the generic RDF loader. Using SPARQL, RDF loading can be made more effective, but the LOD protocols break down with regard to that. Such a basic thing as labels on the Web of Data is a largely unsolved problem, and there is plenty of space for improvement. I hope that qLabel will incite the small number of changes required to change this situation. Cheers, Denny On Wed, Apr 2, 2014 at 8:44 AM, Paul Houle ontolo...@gmail.com wrote: I've been thinking about this kind of problem in my own systems. Name and link generation from entities is a cross-cutting concern that's best separated from other queries in your application. With SPARQL and multiple languages, each with multiple rdf:label, it is awkward to write queries that bring labels back with identifiers, particularly if you want to apply rules that amount to: if a ?lang label doesn't exist for a topic, show a label from a language that uses the same alphabet as ?lang in preference to any others. Another issue too is that the design and business people might have some desire for certain kinds of labels, and it's good to be able to change that without changing your queries. Anyway, a lot of people live on the other end of internet connections with 50ms, 2000ms or more latency to the network core, plus sometimes the network has a really bad day or even a bad few seconds. For every hundred or so TCP packets you send across the modern internet, you lose one. The fewer packets you send per interaction, the less likely the user is going to experience this. If 20 names are looked up sequentially and somebody is on 3G cellular with 300ms latency, the user needs to wait six seconds for this data to load on top of the actual time moving the data and waiting for the server to get out of its own way. 
This is using jQuery, so it's very likely the page has other JavaScript geegaws that work OK for the developer who lives in Kansas City, but ordinary folks in Peoria might not have the patience to wait until your page is fully loaded. Batch queries give users performance they can feel, even if they demand more of your server. In my system I am looking at having a name lookup server that is stupidly simple and looks up precomputed names in a key-value store, everything really stripped down and efficient with no factors of two left on the floor. I'm looking at putting a pretty ordinary servlet that writes HTML in front of it, but a key thing is that the front of the back end runs queries in parallel to fight latency, which is the scourge of our times. (It's the difference between GitHub and Atlassian.) On Wed, Apr 2, 2014 at 4:36 AM, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: Hey Denny! Awesome tool! It's so awesome, we are already wondering about how to handle the load this may generate. As far as I can see, qLabel uses the wbgetentities API module. This has the advantage of allowing the labels for all relevant entities to be fetched with a single query, but it has the disadvantage of not being cacheable. If qLabel used the .../entity/Q12345.json URLs to get entity data, that would be covered by the web caches (squid/varnish). But it would mean one request per entity, and would also return the full entity data, not just the labels in one language. So, a lot more traffic. If this becomes big, we should probably offer a dedicated web interface for fetching labels of many entities in a given language, using nice, cacheable URLs. This would mean a new cache entry per language per combination of entities - potentially, a large number. However, the combination of entities requested is determined by the page being localized - that is, all visitors of a given page in a given language would hit the same cache entry. That seems workable. 
Anyway, we are not there quite yet, just something to ponder :) -- daniel On 01.04.2014 at 20:14, Denny Vrandečić wrote: I just published qLabel, an Open Source jQuery plugin that allows you to annotate HTML elements with Wikidata Q-IDs (or Freebase IDs, or, technically, any other Semantic Web / Linked Data URI), and then grabs the labels and displays them in the selected language of the user. Put differently, it allows for the easy creation of multilingual structured websites. And it is one more way in which Wikidata data can be used, by anyone. Contributors and users are more than welcome! http://google-opensource.blogspot.com/2014/04/qlabel-multilingual-content-without.html -- Daniel Kinzler Senior Software Developer Wikimedia Deutschland Gesellschaft
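The trade-off Daniel describes can be sketched in Python (URL shapes only, no requests are made; Special:EntityData stands in here for the .../entity/Q12345.json form, as an assumption about the cacheable endpoint meant): one uncacheable batched call versus one cacheable URL per entity, where, at the 300 ms latency Paul mentions, twenty sequential per-entity fetches would cost the six seconds he computes.

```python
from urllib.parse import urlencode

def batched_labels_url(qids, lang):
    """One API call fetching only the labels of many items; compact,
    but the URL varies per page, so web caches can't reuse it well."""
    return ("https://www.wikidata.org/w/api.php?" +
            urlencode({"action": "wbgetentities", "ids": "|".join(qids),
                       "props": "labels", "languages": lang,
                       "format": "json"}))

def cacheable_entity_url(qid):
    """One squid/varnish-cacheable URL per entity; returns the full
    entity data (labels included), at the cost of one request each."""
    return "https://www.wikidata.org/wiki/Special:EntityData/%s.json" % qid
```

The dedicated labels interface Daniel proposes would sit between these two: one cacheable URL per (language, entity set) pair, keeping both the request count and the payload small.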
[Wikidata-l] Wikidata for organizing 1000s of extensions in mediawiki.org?
I would very strongly recommend using Semantic MediaWiki for this use case. It is more powerful, we use SMW in other WMF contexts already, and keeping the data inside Meta (instead of inside Wikidata and then transcluding it) also allows us to generate workflows in Meta involving local user accounts, etc., and reduces complexity, since the data is saved in one place and you don't have to switch between Meta and Wikidata to update the data for an extension. It also frees Wikidata from having to extend its policies to support this specific use case (would MediaWiki extension developers all get their own item per the notability policy, etc.?). Also, SMW already supports the use cases you are asking about right now. I understand that SMW was already suggested in Bugzilla. I understand Wikidata looks more sexy right now, but I think it is not the most appropriate tool for this use case. Just my 2 cents. On Wed Mar 19 2014 at 10:09:34 PM, Quim Gil q...@wikimedia.org wrote: Organize MediaWiki's catalog of 1000s of extensions using Wikidata. Is this a sensible idea? Reality checks and other opinions are welcome here or at https://bugzilla.wikimedia.org/show_bug.cgi?id=46704#c33 Pasting the relevant part for convenience: Has anybody discussed the possibility of creating Wikidata items for extensions, after defining a set of properties to describe them? Linking those Wikidata items to mediawiki.org extension pages, and then playing with templates and whatnot to keep the semantic data up to date (version number, last release, dependencies, compatible with MediaWiki releases...)? Then play with templates, queries and visualizations to create all kinds of useful output, from structured extension pages to a proper and robust map of extensions. 
-- Quim Gil Engineering Community Manager @ Wikimedia Foundation http://www.mediawiki.org/wiki/User:Qgil
Re: [Wikidata-l] Queries - can they be stored as statements in Category/List items?
Micru, thank you for the explanation. I understand better now what you mean. I still disagree - let me explain why. I think that trying to express a query definition in a single statement is very hard. Having a specific Query namespace allows us to create a completely new UI for them, allows us to use a different data model for Queries than for Items, and allows us to treat Query pages very differently (e.g. for caching) than, e.g., Item pages. For example, the different data model would allow us to restrict the number of queries on a page. If they were just a statement, what would stop a contributor from creating several such statements on one page? What happens when someone removes the same as query statement? What happens if someone adds it to the page for USA (e.g. same as query instance of-country, continent-North America, population-300M)? Would this page suddenly be treated differently? Also, you already show in your mock-up that the same as query statement requires plenty of special code (e.g. for the different visualizations, etc.). One option would be to have them as Item pages, but then treat them differently throughout. This would mean more and more exceptions and special casing in the code. I think that Queries and Items are sufficiently different to deserve their own treatment. In my personal opinion, this provides sufficient reason for Query pages and Item pages being distinct. What would be the advantage of having Queries expressed in the Items? Fewer entities? Less confusion about what these list of- and category-items mean? I don't find either reason sufficiently enticing to change my opinion on this. 
Cheers, Denny On Fri Mar 07 2014 at 5:00:19 AM, David Cuenca dacu...@gmail.com wrote: Denny, sorry for the confusion, it is a complex topic, or it could also be that I am terribly bad at explaining :) Based on that item page I have made a mock-up which perhaps makes things easier: http://i.imgur.com/1dSfrqx.png The reasoning for this being: 1) there is a well-defined set of queries that are equivalent to categories/lists, so there is no need to have independent query pages 2) if a wikipedia wants to include query results on a page, it is quite probable that the query already exists as a list/category 3) and if it doesn't, then it will be *very* specific to that language wikipedia. In that case there is no need to define a query page on wikidata, but on the wikipedia page itself as an inclusion syntax command or another similar module You are right that it might be a bit preliminary, as there are not even simple queries yet, but since this kind of decision might have an impact on later design, I think it is worth starting to present the concepts/options now. Besides, ideas and a common understanding take time to develop, and the RFC was started, so I thought it was worth giving it some attention. Cheers, Micru On Fri, Mar 7, 2014 at 12:18 AM, Denny Vrandečić vrande...@gmail.com wrote: Since I am obviously bad at guessing what you mean, can you please explicate what you mean with replicate that functionality on Wikidata? Sorry, I am too dense to understand it. What do you want to happen, explicitly? I go to http://www.wikidata.org/wiki/Q6573995 - how should it be different from what it displays today? Do you want the item pages to have the feature to directly embed query results, instead of having a one-click distance to the actual query page and its results? Or is there more to it? On Thu Mar 06 2014 at 3:10:44 PM, David Cuenca dacu...@gmail.com wrote: I'm not saying that the results yielded by Category:Books by Jean-Paul Sartre or Category:Books by J.R.R. 
Tolkien are or should be the same as the result yielded by a corresponding Wikidata query, but the concepts they represent are the same. Ditto for lists. (As a further clarification, I didn't mention anything about changing Wikipedia categories or Wikipedia lists either.) My question was regarding the functionality of WD items associated with Wikipedia categories and Wikipedia lists. Conceptually those items represent (or can represent) queries. WDQ, the tool by Magnus, can already interpret certain statements as queries [1]. Would it make sense to replicate that functionality on Wikidata? Cheers, Micru [1] http://tools.wmflabs.org/reasonator/?q=6573995 On Thu, Mar 6, 2014 at 11:16 PM, Denny Vrandečić vrande...@gmail.com wrote: But that's simply not the case. The Category:Books by Jean-Paul Sartre [1] or Category:Books by J.R.R. Tolkien [2] neither are a complete list of books by those authors (e.g. Sartre's fictional books are missing, Tolkien's non-fictional *and* Middle-earth books are missing), nor do they include only books by those authors (e.g. they also include templates and other categories, which are likely not written by Sartre or Tolkien). If the plan is to change the way categories are used in Wikipedia and the other Wikimedia wikis
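The mock-up's "same as query: instance of-country, continent-North America" idea can be made concrete with a small sketch. This is an editor's illustration of how a category-item's statements could be interpreted as a query filter, in the spirit of what WDQ does; the property names and the flattened data layout are invented for the example and are not Wikidata's actual API.

```python
# Hypothetical sketch: treat a category-item's "same as query" conditions as
# a filter over items. Property names and the claim layout are illustrative.

def matches(claims, conditions):
    """Return True if an item's claims satisfy every (property, value) pair."""
    return all(claims.get(prop) == value for prop, value in conditions.items())

def run_query(items, conditions):
    """Return the IDs of all items matching the category-item's conditions."""
    return [qid for qid, claims in items.items() if matches(claims, conditions)]

# Toy data: claims reduced to plain property -> value pairs.
items = {
    "Q30":  {"instance of": "country", "continent": "North America"},
    "Q145": {"instance of": "country", "continent": "Europe"},
    "Q60":  {"instance of": "city", "continent": "North America"},
}

# The mock-up's example: instance of-country, continent-North America.
conditions = {"instance of": "country", "continent": "North America"}
print(run_query(items, conditions))  # -> ['Q30']
```

The sketch also makes Denny's objection visible: nothing in this shape prevents two conflicting condition sets from being attached to the same page, which is part of the argument for a dedicated Query namespace.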
Re: [Wikidata-l] rank related changes
Wikidata labels are simple. This is due to the necessities of the project. We need one single label to display. Having Wikidata labels with ranks, qualifiers, sources, etc. simply would not work in the UI. Labels and names in reality are indeed extremely complex. But as already pointed out, this kind of information can be expressed with Statements, and we already have properties to do so and will probably get more such properties when the multi- and monolingual text properties get developed. So, yes, Gerard, Daniel would be wrong if he said that labels are simple in the world. But that is not what he said. He was simply referring to labels as they are already implemented in Wikidata, and that serve a very specific purpose - and for these, he is absolutely right to say that ranks do not apply to them. The only purpose of labels and descriptions is to provide identifying information and to provide something to display for an item. The only purpose of aliases is to increase recall for search. I would consider having an alias containing a frequent typo absolutely OK, if it helps people find that item. They don't have to be right. They don't have to be sourced. They have to be useful. Statements, on the other hand, contain the actual content of Wikidata. And those have ranks, qualifiers, sources, etc. Statements can contain historical names of cities, and say from when to when they were used. Queries can then some day use this information and display it within the context of a specific query. But that is not what Wikidata labels are there for. I hope that makes sense. On Fri Mar 07 2014 at 12:01:32 AM, Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, The name was Batavia at that time in any language. The issue is that when you fudge information in this way, you cannot have proper queries. This is why Daniel is wrong and the notion that labels are simple needs to be revisited. It is not rare at all and it exists in many domains. 
This is why it is wrong, wrong, wrong. Thanks, GerardM On 6 March 2014 19:31, Joe Filceolaire filceola...@gmail.com wrote: Use 'Birth name (P513)' (string datatype) for Cassius Clay or 'Official name' (proposed property with monolingual text datatype) for Batavia - with date qualifiers. Joe On Thu, Mar 6, 2014 at 4:12 PM, Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, So how do I indicate that up to a particular date Jakarta was called Batavia? Muhammad Ali was called Cassius Clay? There is no discussion about it. All there is is a (potentially perceived) inability to use appropriate labels at will. Labels are not simple. Thanks, Gerard On 6 March 2014 17:07, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: Am 06.03.2014 16:27, schrieb Gerard Meijssen: Hoi, I hope this will be revisited. Many items change their name and, depending on a date, are called differently. If the name is something that is changed, debated, or otherwise a subject of discussion, create a statement using an appropriate property. The point of having labels is precisely that they are simple. -- daniel -- Daniel Kinzler Senior Software Developer Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
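The statement-based approach Joe and Denny describe - official names carried as statements with start/end date qualifiers, while the label stays simple - can be sketched in a few lines. This is an editor's illustration, not Wikidata code; the tuple layout stands in for qualifiers, and the Batavia/Jakarta dates are approximate, for illustration only.

```python
# Sketch: a query picking the official name valid at a given date from
# date-qualified statements, leaving the plain label untouched.
from datetime import date

# (name, start, end) - end None means "still current". Dates illustrative.
official_names = [
    ("Batavia", date(1619, 1, 1), date(1942, 3, 8)),
    ("Jakarta", date(1942, 3, 8), None),
]

def name_at(statements, when):
    """Return the name whose qualifier interval contains the given date."""
    for name, start, end in statements:
        if start <= when and (end is None or when < end):
            return name
    return None

print(name_at(official_names, date(1800, 6, 1)))  # -> Batavia
print(name_at(official_names, date(2014, 3, 7)))  # -> Jakarta
```

This is exactly the division of labour in the thread: the label answers "what do we display?", while the statements answer "what was it called in 1800?".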
[Wikidata-l] Status PropertySuggester
Welcome to Wikidata! I am very much looking forward to seeing the results of your work. The demo looks very promising, and the results are already so much better than what we currently have. The answers are also very fast, which is promising. Awesome work, and welcome! On Thu Jan 09 2014 at 8:42:45 AM, Weidhaas, Virginia virginia.weidh...@student.hpi.uni-potsdam.de wrote: Hi, first of all we would like to introduce ourselves. We are students from the wikidata.lib bachelor project at the chair of Prof. Dr. Naumann (Information Systems). We will work on Wikidata from October 2013 until June 2014. Our mentor is Anja Jentzsch. Every member is a student of the Hasso Plattner Institute in the fifth semester of their Bachelor's. We aim to provide an extension for Wikidata that simplifies adding statements by suggesting properties which fit the item. Bugzilla ticket we relate to: https://bugzilla.wikimedia.org/show_bug.cgi?id=46555 Our project documentation: https://github.com/Wikidata-lib/Wikidata.lib/wiki/Intelligent-Forms Status: At the moment we have API functionality to return property suggestions ranked by correlation with an item or a set of property ids. You can try it out yourself here: http://suggester.wmflabs.org/wiki/index.php/Spezial:PropertySuggester Alternatively you can use the underlying API module wbsgetsuggestions. The next step will be to integrate that functionality with the entityselector input field when adding statements to an item. We are looking forward to your feedback and ideas. Moritz, Christian, Virginia, Felix
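The announcement names the API module (wbsgetsuggestions) without showing a request. As an editor's sketch, here is how a request URL for such a MediaWiki API module could be assembled; the parameter names other than the module name follow general MediaWiki API conventions and are assumptions, not documented parameters of this module.

```python
# Sketch: building a request URL for the suggester's API module mentioned
# above. Only "wbsgetsuggestions" comes from the announcement; "entity" and
# "format" are assumed, MediaWiki-style parameter names.
from urllib.parse import urlencode

def suggestion_url(base, entity, fmt="json"):
    """Assemble an api.php URL asking for property suggestions for an item."""
    params = {"action": "wbsgetsuggestions", "entity": entity, "format": fmt}
    return base + "?" + urlencode(params)

url = suggestion_url("https://www.wikidata.org/w/api.php", "Q42")
print(url)
```

Fetching this URL (e.g. with urllib.request) would then return the ranked property suggestions for the item, if the module is deployed under these parameters.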
Re: [Wikidata-l] How are queries doing?
The main reason why Queries are not done yet is that in the beginning of 2013 I deprioritized them compared to the original plan. Only a single developer kept working on them, instead of a major part of the team, as was originally planned. I made this decision because it became clear to me that we would likely be able to continue the Wikidata development beyond the original 12-month plan (as was indeed the case) and that, in the medium run, rushing this functionality would only hurt the project. I thus decided to increase the priority of tasks which had a higher short-term benefit and were more immediate: many smaller things, but also more datatypes, ranks, and clean-ups, as well as reactions to the roll-outs which had begun back then. This made us highly responsive to the current needs of the community, and led to sustained growth of Wikidata. If it were needed, queries could be rushed. But that would have a negative impact on the long-term sustainability of the project. If it were deemed a higher priority, the development of queries could be sped up. But this comes at a sacrifice regarding other functionality. Thus yes, more resources would lead to a faster development of queries (if it were decided that this would be the appropriate priority). The latter especially means that a sustained contribution from external developers can also lead to a faster development of the query functionality. We have seen with the sustained support of Benestar for the Badges functionality that this is feasible and possible. So instead of simply expressing complaints about features not being developed fast enough, how about actually helping to make them real? It is Open Source, after all. Or at least simply make a case for the importance of this functionality? The development team keeps listening to the community like no other that I know of, and prioritizes its effort with respect to that. So, in short, blame me. 
Cheers, Denny On Tue, Jan 7, 2014 at 2:08 PM, Jan Kučera kozuc...@gmail.com wrote: Hm, nice to read all the reasoning why queries are still not possible, but I think we live in 2014 and not 1914 actually... seems like the problem is too small a budget or bad management... cannot really think of another reason. How much do you think it would cost to make queries a reality for production at Wikidata? Regards, Jan 2013/11/29 Gerard Meijssen gerard.meijs...@gmail.com Hoi, Please understand that providing functionality like query is something that has to fit into a continuously live environment. This is an environment where the Wikidata functionality is used all the time and where some of the underlying functionality is changed as well. The Wikidata development is not happening in a vacuum. Given that we hope to get a new type of property next week, it should be obvious that Wikidata is not feature complete. When you add extra functionality like a query engine, you add extra complications while the work is ongoing to get to the stage where Wikidata is feature complete for the data types. Another aspect is that it is NOT for the Wikidata team to decide what goes into production on Wikipedia projects. The Ask 1.0 functionality, for instance, is at its release level. It is now for other people to determine if they want to include it. They have their own road maps, and it is not obvious for an observer what the rationales are. NB Ask 1.0 is also used in Semantic MediaWiki and it provides a query kind of functionality. Query does require some performance <grin>and what is too much</grin>. So in one aspect there is query functionality to be used in Wikipedia e.a. What the query functionality that is still being built will deliver is not clear to me. On another note, there are other projects that have lingered before they were implemented. Nothing new here. There have been other projects that had to change because of external pressures. Nothing new here. 
If you want query functionality on the existing data now, there is a hack that works quite nicely. It makes use of data replicated to the Labs environment. The replication is broken and, given the holidays, it has not been picked up for three days now, so the data is three days old. Thanks, GerardM On 29 November 2013 14:03, Martynas Jusevičius marty...@graphity.org wrote: Jan, my suspicion is that my predictions from last year hold true: it is a far more complex task to design a scalable and performant data model, query language and/or query engine solely for Wikidata than the designers of this project anticipated - unless they did anticipate, and now knowingly fail to deliver. You can check some threads from December last year, and they relate to even older ones: http://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg01415.html Martynas On Fri, Nov 29, 2013 at 1:47 PM, Jan Kučera kozuc...@gmail.com wrote: Ok. One is a bit disappointed seeing
Re: [Wikidata-l] [Wikisource-l] DNB 11M bibliographic records as CC0
Thanks for reviving this thread, Luiz. I also wanted to ask whether we should be updating parts of DNB and similar data. Maybe not create new entries, but for those that we already have, add some of the available data and point to the DNB dataset? On Fri, Dec 6, 2013 at 3:24 PM, Luiz Augusto lugu...@gmail.com wrote: Just found this thread while browsing my email archives (I'm/was inactive on Wikimedia for at least 2 years). IMHO it would be very helpful if a central place hosting metadata from digitized works were created. In my past experience, I've found lots of PD-old books in languages like French, Spanish and English in repositories from Brazil and Portugal, with the UI mostly in Portuguese (i.e., with a very low probability of being found by volunteers from the subdomains of those languages), for example. I particularly love validating metadata more than proofreading books. Perhaps a tool/place like this creates new ways to contribute to Wikisource and helps with user retention (based on some Wikipedians that have fun making good articles but also sometimes love to simply make trivial changes in their spare time)? I know that the thread was focused on general metadata from all kinds and ages of books, but I had this idea while reading it. [[:m:User:555]] On Mon, Aug 26, 2013 at 10:42 AM, Thomas Douillard thomas.douill...@gmail.com wrote: I know, I started a discussion about porting the bot to Wikidata in the scientific journal WikiProject. One answer I got: the bot owner had other things to do in his life than running the bot and was not around very often any more. Having everything in Wikidata already will be a lot more reliable and lazier: no tool that works one day but not the other, no effort to tell the newbies that they should go to another website, no significant problem. 
Maybe one objection would be that the data could be vandalised easily, but maybe we should find a way to deal with imported sourced data which has no real reason to be modified, just marked deprecated or updated by another import from the same source. 2013/8/26 David Cuenca dacu...@gmail.com If the problem is to automate bibliographic data importing, one solution is what you propose, to import everything. Another one is to have an import tool to automatically import the data for the item that needs it. In WP they do that: there is a tool to import book/journal info by ISBN/DOI. The same can be done in WD. Micru On Mon, Aug 26, 2013 at 9:23 AM, Thomas Douillard thomas.douill...@gmail.com wrote: If Wikidata has the ambition to be a really reliable database, we should do everything we can to make it easy for users to use any source they want. In this perspective, if we get data with guaranteed high quality, it makes it easy for Wikidatians to find and use these references. Entering a reference in the database seems to me a highly tedious, boring, and easily automated task. With that in mind, any reference that the user will not have to enter by hand is something good, and importing high-quality source data should pass every Wikidata community barrier easily. If there is no problem for the software to handle that much information, I say we really have no reason not to do the imports. 
Tom -- Etiamsi omnes, ego non
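Micru mentions importing book metadata by ISBN. As an editor's sketch of the kind of sanity check such an import tool would run before fetching anything, here is the standard ISBN-13 check-digit validation; this is the public algorithm, not code from any particular Wikidata or Wikipedia tool.

```python
# Sketch: validate an ISBN-13 before using it to fetch bibliographic data.
# Standard algorithm: alternating weights 1 and 3 over the 13 digits; the
# weighted sum must be divisible by 10.

def isbn13_valid(isbn):
    """Return True if the hyphen-insensitive string is a valid ISBN-13."""
    digits = [int(c) for c in isbn.replace("-", "") if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0

print(isbn13_valid("978-3-16-148410-0"))  # -> True  (the classic example ISBN)
print(isbn13_valid("978-3-16-148410-5"))  # -> False (wrong check digit)
```

Rejecting malformed identifiers before the import keeps obviously broken references out of the database without restricting what sources contributors may use.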
Re: [Wikidata-l] quantities datatype available for testing
It is either obvious that they should be entering only integers or positive numbers, in which case such feedback isn't helpful, or it might end up being too restrictive again. Who is to say that a system like this won't get used to force cities to have a population of an integer bigger than 10,000? I understand the wish and desire to restrict user input, but I would like to remind everyone that Wikidata comes from the wiki side, which adheres more to the 'let's gather input and then verify it' side than the 'let's make everyone give us correct input in the first place' side. On Fri, Nov 22, 2013 at 11:24 AM, Helder . helder.w...@gmail.com wrote: On Fri, Nov 22, 2013 at 4:11 PM, Lukas Benedix bene...@zedat.fu-berlin.de wrote: The problem I see with this practice is that a user doesn't get any feedback that he is entering 'invalid' values. +1 Helder
Re: [Wikidata-l] quantities datatype available for testing
So instead it is better to limit your freedom to express yourself in the first place? I'd take the bot. At least in the history of the article it is recorded that someone tried to enter 123.45 for a population, and we can later figure out what was happening. Why not wait and see if this is really a problem? I wonder how many such mistakes will ever be entered, besides jokes and vandalism. And the latter is easier to catch if we don't require the pranksters to use data that sounds correct. Do we have any indication that contributors are being supported by a system that doesn't let them enter negative numbers for populations? On Fri, Nov 22, 2013 at 1:46 PM, Lukas Benedix bene...@zedat.fu-berlin.de wrote: I don't want to feel like John Connor... hunted by a bot that comes after my edits and reverts them only because I entered 123.45 for a property that should be an integer. Am Fr 22.11.2013 21:56, schrieb Denny Vrandečić: It is either obvious that they should be entering only integers or positive numbers, in which case such feedback isn't helpful, or it might end up being too restrictive again. Who tells me that a system like this won't get used in order to force cities to have a population of an integer bigger than 10,000? I understand the wish and desire to restrict user input, but I would like to remind everyone that Wikidata comes from the wiki side, which adheres more to the 'let's gather input and then verify it' than the 'let's make everyone give us correct input in the first place' side. On Fri, Nov 22, 2013 at 11:24 AM, Helder . helder.w...@gmail.com wrote: On Fri, Nov 22, 2013 at 4:11 PM, Lukas Benedix bene...@zedat.fu-berlin.de wrote: The problem I see with this practice is that a user doesn't get any feedback that he is entering 'invalid' values. 
+1 Helder
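The middle ground between Lukas's wish for feedback and Denny's "gather input, then verify" stance can be sketched as soft validation: accept the edit always, but show the contributor a warning. This is an editor's illustration of the idea, not Wikidata code; the constraint table and function names are hypothetical.

```python
# Sketch: warn-but-accept validation. The edit is never rejected, so the
# 123.45 population still lands in the history, but the contributor gets
# immediate feedback. CONSTRAINTS is a hypothetical per-property rule table.

CONSTRAINTS = {"population": {"integer": True, "min": 0}}

def save_quantity(prop, value):
    """Store the value unconditionally; return it with any warnings raised."""
    warnings = []
    rules = CONSTRAINTS.get(prop, {})
    if rules.get("integer") and value != int(value):
        warnings.append(f"{prop} is usually an integer, got {value}")
    if "min" in rules and value < rules["min"]:
        warnings.append(f"{prop} is usually >= {rules['min']}, got {value}")
    return value, warnings

value, warns = save_quantity("population", 123.45)
print(value)   # the edit is stored regardless
print(warns)   # but the contributor sees the feedback Lukas asked for
```

Because nothing is blocked, the community constraint cannot be abused to enforce, say, populations above 10,000: it can only annotate, never reject.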
Re: [Wikidata-l] Questions about statement qualifiers
Hello Antoine, just to add to what was already said: a qualifier in Wikidata is not a statement about a statement. In RDF terms, the pattern that we follow is not the reification of the triple and then making triples with the reified triple as a subject, as per http://www.w3.org/TR/rdf-mt/#ReifAndCont , but rather the pattern of n-ary relations per http://www.w3.org/TR/swbp-n-aryRelations/ . The use cases there visualize very nicely how Wikidata maps to RDF: http://www.w3.org/TR/swbp-n-aryRelations/#useCase1 This is also what Wikidata's mapping-to-RDF document explains and motivates: https://meta.wikimedia.org/wiki/Wikidata/Development/RDF I hope this helps, Denny On Oct 31, 2013 3:40 AM, Antoine Zimmermann antoine.zimmerm...@emse.fr wrote: Hello, I have a few questions about how statement qualifiers should be used. First, my understanding of qualifiers is that they define statements about statements. So, if I have the statement: Q17 (Japan) P6 (head of government) Q132345 (Shinzō Abe) with the qualifier: P39 (office held) Q274948 (Prime Minister of Japan), it means that the statement holds an office, right? It seems to me that this is incorrect and that this qualifier should in fact be a statement about Shinzō Abe. Can you confirm this? Second, concerning temporal qualifiers: what does it mean that the start or end is 'no value'? I can imagine two interpretations: 1. the statement is true forever (a person is a dead person from the moment of their death till the end of the universe); 2. (for end date) the statement is still true, and we cannot predict when it is going to end. For me, case number 2 should rather be marked as 'unknown value' rather than 'no value'. But again, what does 'unknown value' mean in comparison to having no indicated value? Third, what if a statement is temporarily true (say, X held office from T1 to T2), then becomes false, and becomes true again (like X held the same office from T3 to T4 with T3 > T2)? 
The situation exists for Q35171 (Grover Cleveland), who has the following statement: Q35171 P39 (position held) Q11696 (President of the United States of America) with qualifiers, and a second occurrence of the same statement with different qualifiers. The Wikidata user interface makes it clear that there are two occurrences of the statement with different qualifiers, but how does the Wikidata data model allow me to distinguish between these two occurrences? How do I know that: P580 (start date) March 4 1885 only applies to the first occurrence of the statement, while: P580 (start date) March 4 1893 only applies to the second occurrence of the statement? I could have a heuristic that says if two start dates are given, then assume that they are the starting points of two disjoint intervals. But can I always guarantee this? Best, AZ -- Antoine Zimmermann ISCOD / LSTI - Institut Henri Fayol École Nationale Supérieure des Mines de Saint-Étienne 158 cours Fauriel 42023 Saint-Étienne Cedex 2 France Tél: +33(0)4 77 42 66 03 Fax: +33(0)4 77 42 66 66 http://zimmer.aprilfoolsreview.com/
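Denny's n-ary-relations answer resolves Antoine's Grover Cleveland puzzle: in the data model each statement is a separate object carrying its own qualifier set, so a start date can never float free of its occurrence. The following is an editor's sketch of that shape; the statement IDs and the simplified layout are illustrative, not the exact Wikibase JSON.

```python
# Sketch: two occurrences of the same (property, value) claim are two
# distinct statement objects, each with its own qualifiers - the n-ary
# relation pattern, not RDF reification. IDs here are illustrative.

statements = [
    {"id": "Q35171$term-1", "property": "P39", "value": "Q11696",
     "qualifiers": {"P580": "1885-03-04", "P582": "1889-03-04"}},
    {"id": "Q35171$term-2", "property": "P39", "value": "Q11696",
     "qualifiers": {"P580": "1893-03-04", "P582": "1897-03-04"}},
]

# Each start date is attached to exactly one statement object, so no
# heuristic about disjoint intervals is ever needed.
starts = {s["id"]: s["qualifiers"]["P580"] for s in statements}
print(starts)
```

Under reification, both start dates would hang off reifications of the same triple and become indistinguishable; with one node per statement occurrence, the association is unambiguous by construction.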
Re: [Wikidata-l] creator template, wikimedia commons
This is completely up to the community: whether they want this data and the necessary structures for it really depends on the scope of the dataset. But here it is the same: there is no way to use this data in the short term for the metadata in Commons. This will be possible in a few months, if you are willing to wait that long. The best way is probably to describe the dataset (in scope, i.e. what it covers, and depth, i.e. what it says about the covered items) on Wikidata or here, and see if there are opinions about adding this data. Cheers, Denny 2013/9/25 Antoine Isaac ais...@few.vu.nl Hello Denny, I think we in Europeana had the same problem in the GLAMwiki toolset project [1]. We wanted to submit the metadata we had for Europeana objects to be uploaded to Commons, but that was not fully possible... So we'd have to think of an alternative. Do you think it could happen via Wikidata? Best, Antoine [1] It is a bit early for that question right now. In the long run we plan to have metadata about Commons media files (the current state of the discussion is here: https://commons.wikimedia.org/wiki/Commons:Wikidata_for_media_info) - but this is planned for 2014. For now, the Creator template in Commons cannot be replaced with Wikidata. There is no way to integrate data from Wikidata in an arbitrary Commons page (which is what you would need in order to replace the Creator template). We have a bug for this (https://bugzilla.wikimedia.org/show_bug.cgi?id=47930) and aim for this to be completed this fall / early winter. My assumption would be to, for the long term, create the Creators as items in Wikidata (if they do not exist) and add data to them. But this will not yield any short-term visible results, which is frustrating. 
So you might want to have a double strategy: add them to Wikidata and add them in the Creator namespace, just as you have done so far. Does anyone have another opinion on this? Sorry for not having better news right now, Denny 2013/9/21 rupert THURNER rupert.thur...@gmail.com hi, we are currently experimenting to have, after ZB Zürich earlier in the year [1], a second museum from Switzerland uploading full-quality images (i.e. TIF format) [2]. i was wondering what is the most wikidata-compatible way of adding creator information to an image like this one: https://commons.wikimedia.org/wiki/File:Soleure_Aa_0012.tif the painter winterlin is in the category, has a template including personal information, just like the wikipedia article, and the wikidata entry. rupert. [1] https://commons.wikimedia.org/wiki/Commons:Zentralbibliothek_Z%C3%BCrich [2] https://commons.wikimedia.org/wiki/Commons:Zentralbibliothek_Solothurn -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. 
Re: [Wikidata-l] Counting sitelinks of subclasses.
I would be surprised if that theory held true. I expect that both very abstract (fruit) and extremely specific (golden delicious) items would have a lower sitelink count than the golden layer of most useful terms (apple) in the hierarchy (I am reminded of the theory of word length and term frequency in linguistics). But I would assume that the subclass hierarchy that Wikidata will eventually exhibit would indeed have such a golden layer (and that these terms are not randomly distributed over the hierarchy). Would be fun to examine :) Cheers, Denny 2013/9/24 Klein, Max kle...@oclc.org Hello All, It struck me that one interesting way to see if subclasses are useful was to test this hypothesis. Let QID_a and QID_b be two Wikidata items. Conjecture: if QID_b is a subclass of QID_a, then count_sitelinks(QID_b) <= count_sitelinks(QID_a). Has anyone investigated this problem, or can think of an efficient way to test it? Or can tell me why it ought not to be true? Maximilian Klein Wikipedian in Residence, OCLC +17074787023
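Max's conjecture is mechanically checkable: walk the subclass edges and look for pairs where the subclass has more sitelinks than its superclass. Here is an editor's toy harness with invented counts chosen to illustrate Denny's "golden layer" objection; a real test would read the counts and `subclass of` (P279) relations from Wikidata dumps.

```python
# Sketch: search for counterexamples to the conjecture
#   subclass_of(b, a)  =>  count_sitelinks(b) <= count_sitelinks(a)
# over toy data. The counts below are invented for illustration.

sitelinks = {"fruit": 120, "apple": 180, "golden delicious": 15}
subclass_of = [("apple", "fruit"), ("golden delicious", "apple")]

def counterexamples(edges, counts):
    """Return the (sub, super) pairs where the subclass has MORE sitelinks."""
    return [(b, a) for b, a in edges if counts[b] > counts[a]]

print(counterexamples(subclass_of, sitelinks))
```

With these numbers the mid-level term violates the conjecture against its abstract superclass (apple > fruit) while the very specific term does not, which is exactly the non-monotonic "golden layer" shape Denny predicts.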
Re: [Wikidata-l] test.wikidata.org offers URL datatype now
in the case of MediaWiki wikis, I guess http://en.wikipedia.org/wiki/Technical_University_of_Denmark should be preferred, but that is beside the point. Yes, spaces are not escaped. That was intentional, as they are also not escaped when entering the URL in wiki syntax. Should we behave differently? Thank you for the tests! 2013/9/6 Finn Årup Nielsen f...@imm.dtu.dk It is apparently not possible to enter a URL with spaces and have it automatically escaped, e.g., http://en.wikipedia.org/wiki/Technical University of Denmark should be entered as: http://en.wikipedia.org/wiki/Technical%20University%20of%20Denmark On the other hand, http://en.wikipedia.org/wiki/København works OK for http://en.wikipedia.org/wiki/K%C3%B8benhavn Also http://københavn.dk (http://xn--kbenhavn-54a.dk) works. See https://test.wikidata.org/wiki/Q132 and https://test.wikidata.org/wiki/Q133 cheers Finn Årup Nielsen On 09/06/2013 12:10 PM, Denny Vrandečić wrote: Hello all, in preparation of next week's deployment to Wikidata.org, test.wikidata.org now has the new datatype URL deployed. If you have the time, we would appreciate it if you tested it and let us know about errors and problems. The URL datatype should be a big step in allowing us to introduce better sourcing and reliability of the content of Wikidata. Cheers, Denny
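Finn's two observations correspond to two standard transformations, which this editor's sketch reproduces with the Python standard library: percent-encoding spaces in a URL path, and IDNA-encoding a non-ASCII hostname. It shows what "automatic escaping" would produce for his examples; it is not Wikidata's actual URL handling.

```python
# Sketch: the two escapings behind Finn's test cases.
# 1) percent-encoding spaces in a path segment
# 2) IDNA (punycode) encoding of a non-ASCII hostname
from urllib.parse import quote

path = quote("Technical University of Denmark")
print(path)  # Technical%20University%20of%20Denmark

host = "københavn.dk".encode("idna").decode("ascii")
print(host)  # xn--kbenhavn-54a.dk
```

Note the asymmetry Finn observed: the space case needs escaping in the path, while the København case works because browsers apply the IDNA transformation to the hostname themselves.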
Re: [Wikidata-l] The Day the Knowledge Graph Exploded
Just a few corrections to the historical dates given by Tom. 2013/8/23 Tom Morris tfmor...@gmail.com In a word, no. Google acquired Metaweb, the company that built Freebase, which forms the core of the Knowledge Graph, in 2010. Metaweb was founded in 2005 (interesting Google search: Metaweb founding) and started extracting information from Wikipedia into Freebase in 2006. https://www.freebase.com/m/0gw0?links&lang=en&historical=true The first DBpedia release was in 2007. Semantic information nets go back to the 60s. TBL coined the term semantic web in 2006. TBL coined the term semantic web at the latest in 1994, probably even before (I don't have Weaving the Web at hand, but here are TBL's slides from the WWW conference in 1994: http://www.w3.org/Talks/WWW94Tim/) Wikidata is a great project, but this progress has been building, excruciatingly slowly, over decades. One could even make the argument that Wikidata is the result of the Knowledge Graph and its antecedents rather than the other way around. Wikidata is influenced by RDF (1999), OWL (2004), Semantic MediaWiki (2005), Freebase (2006), DBpedia (2007), Semantic Forms (2007), and many, many other technologies that are less visible or don't have such a strong brand (and Michael is very aware of that history; he's been around for years, working on the technologies these are based on). I understand Michael's question to be much more concrete: does the progress in Wikidata have anything to do with the changes in the Knowledge Graph's visibility in Google's searches that happened last month? Cheers, Denny Tom
Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] The Day the Knowledge Graph Exploded
Oh, that's a clear and loud "I have no idea" :) 2013/8/23 Tom Morris tfmor...@gmail.com: On Fri, Aug 23, 2013 at 10:10 AM, Denny Vrandečić denny.vrande...@wikimedia.de wrote: I understand Michael's question to be much more concrete: does the progress in Wikidata have anything to do with the changes in the Knowledge Graph's visibility in Google's searches that happened last month? So, what's your opinion? Tom
Re: [Wikidata-l] Make Commons a wikidata client
Hi Maarten, thanks. That's the best proposal I have seen so far for how to proceed with Phase 1 on Commons. I had usually pushed Commons support further to the back, but with this I think we would indeed create some real value with a small change. I will bounce Commons Phase 1 client support up on my list. I guess we should disallow sitelinks to the File: namespace, in order to avoid people trying to add metadata about the media files themselves? Cheers, Denny 2013/8/10 Maarten Dammers maar...@mdammers.nl: Hi everyone, At Wikimania we had several discussions about the future of Wikidata and Commons. Some broader feedback would be nice. Now we have a property Commons category (https://www.wikidata.org/wiki/Property:P373). This is a string and an intermediate solution. In the long run Commons should probably be a wikibase instance in its own right (structured metadata stored at Commons) integrated with Wikidata.org; see https://www.wikidata.org/wiki/Wikidata:Wikimedia_Commons for more info. In the meantime we should make Commons a Wikidata client like Wikipedia and Wikivoyage. How would that work? We have an item (https://www.wikidata.org/wiki/Q9920) for the city Haarlem. It links to the Wikipedia article Haarlem and the Wikivoyage article Haarlem. It should link to the Commons gallery Haarlem (https://commons.wikimedia.org/wiki/Haarlem). We have an item (https://www.wikidata.org/wiki/Q7427769) for the category Haarlem. It links to the Wikipedia category Haarlem. It should link to the Commons category Haarlem (https://commons.wikimedia.org/wiki/Category:Haarlem). 
The category item (Q7427769) links to the article item (Q9920) using the property category's main topic (https://www.wikidata.org/wiki/Property:P301). We would need to make an inverse property of P301 to make the backlink. Some reasons why this is helpful: * Wikidata takes care of a lot of things like page moves, deletions, etc. Now with P373 (Commons category) it's all manual. * Having Wikidata on Commons means that you can automatically get backlinks to Wikipedia, have intros for categories, etc. * It's a step in the right direction. It makes it easier to do the next steps. Small change, lots of benefits! Maarten
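Maarten's proposal amounts to adding Commons sitelinks next to the existing Wikipedia and Wikivoyage ones. Here is a minimal sketch of what the sitelink data on the two Haarlem items could then look like; the structure mirrors the sitelinks Wikibase already exposes for Wikipedia clients, but the commonswiki entries are hypothetical, since Commons is not a client yet:

```python
# Sketch: sitelinks of the two Haarlem items if Commons became a Wikidata
# client. The "commonswiki" entries are hypothetical; the structure
# mirrors the existing Wikipedia/Wikivoyage sitelinks.
article_item = {
    "id": "Q9920",  # the city Haarlem
    "sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Haarlem"},
        "enwikivoyage": {"site": "enwikivoyage", "title": "Haarlem"},
        "commonswiki": {"site": "commonswiki", "title": "Haarlem"},  # gallery page
    },
}
category_item = {
    "id": "Q7427769",  # the category Haarlem
    "sitelinks": {
        "enwiki": {"site": "enwiki", "title": "Category:Haarlem"},
        "commonswiki": {"site": "commonswiki", "title": "Category:Haarlem"},
    },
}
```

Since sitelinks are unique per site and per item, the gallery and the category on Commons naturally land on the two different items, just as the article and category pages on Wikipedia do today.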
Re: [Wikidata-l] Help on adding a not exact date
This doesn't really work yet in the UI. Basically, you could only enter something like 6th c. BC, which in this case would not be correct. 1st Millennium BC would be possible and correct, but it is a bit too wide. That's the only thing supported right now. We will be working on improving this situation. Cheers, Denny 2013/8/22 Mathieu Stumpf psychosl...@culture-libre.org: Hello, I'm beginning with Wikidata and didn't find how to add a date which is not exact. In fact I found [1], which would gain clarity with an example, but the given format is refused when I try to use it. For example, I want to add birth and death dates for Pittacus of Mytilene [2], for which I know no accurate value, but it is something like -650 to -570, with something like a quarter-century accuracy. Actually, looking at the English Wikipedia, -640 to -568 is given, but without sources, unfortunately. Well, for the sake of the example let's ignore that; what input should I provide to the birth date property to match my previous description? [1] https://meta.wikimedia.org/wiki/Wikidata/Data_model#Dates_and_times [2] https://www.wikidata.org/wiki/Q311835 -- Association Culture-Libre http://www.culture-libre.org/
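Under the hood, the data model linked above represents imprecise dates as a timestamp plus a precision level (6 = millennium, 7 = century, 8 = decade, 9 = year, and so on). A minimal sketch of how a circa-650 BC birth date could be encoded as a Wikidata-style time value; the field names follow the published data model, but the helper function itself is hypothetical:

```python
# Sketch: building a Wikidata-style time value for an imprecise date.
# Precision codes from the Wikidata data model: 6 = millennium,
# 7 = century, 8 = decade, 9 = year. The helper is illustrative only.
PRECISION = {"millennium": 6, "century": 7, "decade": 8, "year": 9}

def time_value(year, precision):
    """Encode a (possibly negative) year with a given precision level."""
    sign = "-" if year < 0 else "+"
    return {
        "time": f"{sign}{abs(year):04d}-00-00T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": PRECISION[precision],
        "calendarmodel": "http://www.wikidata.org/entity/Q1985727",  # proleptic Gregorian
    }

# Pittacus of Mytilene: born around 650 BC, known only to about a century.
birth = time_value(-650, "century")
```

A century-level precision (7) matches "somewhere around -650" much better than the millennium-level value that the current UI would force.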
Re: [Wikidata-l] Phase #3 deadline
Hi Jan, we currently assume that we will have a first querying capability available this fall. The implementation has progressed very well in the last few months and weeks, including special pages to access it, API modules, etc. Indeed, querying will be available later than originally anticipated, since we had reprioritized it and therefore had far fewer people working on this functionality (for a while, only one person was working on it), while other tasks were moved to a higher priority, such as more data types, better history support, allowing arbitrary access to items in the clients, support for other sister projects, etc. By the way, we mostly dropped the idea of speaking about development goals in terms of phases, as it doesn't really fit the current development plan, but that's just a naming issue. So expect some simple querying capability (give me all items with a specific value on this property) to be deployed within the next month or three, but don't be mad if we slip by a few weeks due to some unexpected deployment issue. Cheers, Denny 2013/8/21 Jan Kučera kozuc...@gmail.com: Hi there, how is the development of phase #3 (lists) going? Is it due soon? Sub-question: I guess a sorting feature in lists will be implemented by default, as lists without sorting would be a bad idea? Thanks for the answer. Cheers, Kozuch
Re: [Wikidata-l] [Wikimedia-l] Meeting about the support of Wiktionary in Wikidata
[Sorry for cross-posting] Yes, I agree that the OmegaWiki community should be involved in the discussions, and I pointed GerardM to our proposals and discussions, using him as a liaison. We also looked and keep looking at the OmegaWiki data model to see what we are missing. Our latest proposal is different from OmegaWiki in two major points: * Our primary goal is to provide support for structured data in the Wiktionaries. We do not plan to be the main resource ourselves, where readers come in order to look up something; we merely provide structured data that a Wiktionary may or may not use. This parallels the role Wikidata has with regard to Wikipedia. This also highlights the difference between Wikidata and OmegaWiki, since OmegaWiki's goal is to create a dictionary of all words of all languages, including lexical, terminological and ontological information. * A smaller difference is the data model. Wikidata's latest proposal to support Wiktionary is centered around lexemes, and we do not assume that there is such a thing as a language-independent defined meaning. But no matter what model we end up with, it is important to ensure that the bulk of the data can flow freely between the projects, and even though we might disagree on this issue in the modeling, it is ensured that the exchange of data is widely possible. We tried to keep notes on the discussion we had today: http://epl.wikimedia.org/p/WiktionaryAndWikidata My major take-home messages are: * the proposal needs more visual elements, especially a mock-up or sketch of what it would look like and how it could be used on the Wiktionaries * there is no generally accepted place for a discussion that involves all Wiktionary projects. Still, my initial decision to have the discussion on the Wikidata wiki was not a good one, and it should and will be moved to Meta. 
Having said that, the current proposal for the data model of how to support Wiktionary with Wikidata seems to have garnered a lot of support so far. So this is what I will continue building upon. Further comments are extremely welcome. You can find it here: http://www.wikidata.org/wiki/Wikidata:Wiktionary As said, it will be moved to Meta as soon as the requested mockups and extensions are done. Cheers, Denny 2013/8/10 Samuel Klein meta...@gmail.com: Hello, On Fri, Aug 9, 2013 at 6:13 PM, JP Béland lebo.bel...@gmail.com wrote: I agree. We also need to include the Omegawiki community. Agreed. On Fri, Aug 9, 2013 at 12:22 PM, Laura Hale la...@fanhistory.com wrote: Why? The question of moving them into the WMF fold was pretty much no, because the project has an overlapping purpose with Wiktionary. This is not actually the case. There was overwhelming community support for adopting OmegaWiki - at least simply providing hosting. It stalled because the code needed a security and style review, and Kip (the lead developer) was going to put some time into that. The OW editors and dev were very interested in finding a way forward that involved Wikidata and led to a combined project with a single repository of terms, meanings, definitions and translations. Recap: The page describing the OmegaWiki project satisfies all of the criteria for requesting WMF adoption. * It is well-defined on Meta: http://meta.wikimedia.org/wiki/Omegawiki * It describes an interesting idea clearly aligned with expanding the scope of free knowledge. * It is not a 'competing' project to the Wiktionaries; it is an idea that grew out of the Wiktionary community, has been developed for years alongside it, and shares many active contributors and linguaphiles. * It started an RfC which garnered 85% support for adoption. 
http://meta.wikimedia.org/wiki/Requests_for_comment/Adopt_OmegaWiki Even if the current OW code is not used at all for a future Wiktionary update -- and this idea was proposed and taken seriously by the OW devs -- their community of contributors should be part of discussions about how to solve the Wiktionary problem that they were the first to dedicate themselves to. Regards, Sam. ___ Wikimedia-l mailing list wikimedi...@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe
[Wikidata-l] Some Wiktionary data in Wikidata - Updated proposal
Following numerous discussions and after input from many people, we are happy to present the new version of the proposal that would lead to Wikidata supporting structured data for the Wiktionaries. http://www.wikidata.org/wiki/Wikidata:Wiktionary I am very thankful to all those who provided input, and am also happy to be able to send this out before Wikimania, and thus potentially have a good discussion there. But obviously, everyone should feel free to chime in beyond that. I would be glad if I could again ask the community to spread the word to the Wiktionary communities, in order to gain as much feedback from them as possible, and hopefully their support. Cheers, Denny
Re: [Wikidata-l] All interwikis from nl.wikivoyage have been moved to Wikidata
That's amazing! And so fast! Any idea how many links there were? (Just curious.) Thanks for reporting! Denny 2013/7/29 Romaine Wiki romaine_w...@yahoo.com: Hello all, I am happy to announce that all interwikis from all articles, templates, and project pages (except some archive pages) have been moved to Wikidata. This includes the removal of all local interwikis. While doing this, I roughly checked all pages to see whether they are connected to the right article on Wikidata. I solved a lot of interwiki conflicts, often with disambiguation pages. I also made sure that every article has an item on Wikidata. The Dutch Wikivoyage is the first Wikivoyage that has fully switched to Wikidata. Greetings, Romaine
Re: [Wikidata-l] Wikidata Map Interface
Hi Jacobo, I hope you don't mind that I share the answer with the list; I think the answer to this question might be of general interest. The JavaScript creating the visualization in the browser is here: https://dl.dropboxusercontent.com/u/172199972/map/map.js As you can see, it is just a simple usage of the HTML5 canvas. It requires two data files such as these (careful, large): https://dl.dropboxusercontent.com/u/172199972/map/wdlabel.js https://dl.dropboxusercontent.com/u/172199972/map/graph.js The first contains all items, their latitude/longitude, and their label. The second contains the graph, i.e. the way items are connected to each other. These two files are created by the following Python scripts, in two steps. First, you need to create the knowledge base. This can be done with the scripts here: https://github.com/mkroetzsch/wda There, use the script https://github.com/mkroetzsch/wda/blob/master/wda-analyze-edits-and-write-kb.py Be careful when you run it: it will download all Wikidata dumps. This might need a few gigabytes of free space and a decent internet connection. Now, you should have the file kb.txt.gz, containing the knowledge base. 
By the way, you can also download the knowledge base as it is created nightly by us here: https://dl.dropboxusercontent.com/u/172199972/kb.txt.gz Finally, you will need a few scripts from here: https://github.com/vrandezo/wikidata-analytics Run them in the following order: geolabel.py - extracts a list of all locations and their label from the knowledge base https://github.com/vrandezo/wikidata-analytics/blob/master/geolabel.py geolabel2wdlabel.py - transforms the list to JavaScript for ready consumption by the Wikidata Map Interface https://github.com/vrandezo/wikidata-analytics/blob/master/geolabel2wdlabel.py geo.py - extracts a list of all locations from the knowledge base https://github.com/vrandezo/wikidata-analytics/blob/master/geo.py graph.py - extracts the simple knowledge graph from the knowledge base https://github.com/vrandezo/wikidata-analytics/blob/master/graph.py geograph.py - extracts the part of the simple knowledge graph that connects geographical items with each other (needs geo and graph) https://github.com/vrandezo/wikidata-analytics/blob/master/geograph.py geograph2geojs.py - transforms the geograph to JavaScript for ready consumption by the Wikidata Map Interface https://github.com/vrandezo/wikidata-analytics/blob/master/geograph2geojs.py This should give you the two files wdlabel.js and graph.js, which will be called by the Wikidata Map Interface (see its HTML source in order to see how). This process is run nightly on a machine we have standing here in the office. I am planning to set this up on Labs, but didn't find the time yet. I hope this helps, Denny 2013/7/29 Jacobo Nájera jac...@metahumano.org: Hi Denny, I am interested in the Wikidata Map Interface. Where can I see and download the code? I want to experiment with it and document it. Thanks, Jacobo -- Wikimedia México
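The nightly build described above can be summarized as a small pipeline. This is a sketch only: the script names are the ones listed in the mail, run in the stated order, but the interpreter invocation and working directory are assumptions.

```python
# Sketch: the nightly map-data pipeline described above, in the stated
# order. Script names come from the mail; how they are invoked (plain
# "python script.py" in the checkout directory) is an assumption.
import subprocess

STEPS = [
    "geolabel.py",          # locations + labels from the knowledge base
    "geolabel2wdlabel.py",  # -> wdlabel.js for the map interface
    "geo.py",               # list of all locations
    "graph.py",             # the simple knowledge graph
    "geograph.py",          # graph restricted to geo items (needs geo + graph)
    "geograph2geojs.py",    # -> graph.js for the map interface
]

def run_pipeline():
    """Run each step, aborting the pipeline if any script fails."""
    for script in STEPS:
        subprocess.run(["python", script], check=True)
```

The two label steps and the four graph steps are independent chains, so in principle they could run in parallel; the sequential order above is simply the one given in the mail.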
[Wikidata-l] Watson receives Feigenbaum Prize - Money donated to Wikimedia and Wikidata
The AAAI has awarded the Feigenbaum Prize to the Watson team, which decided to donate the prize money to the Wikimedia Foundation, explicitly listing Wikidata as a reason. When asked for a comment, Wikidata said: Q2013 P3 Q12253. Congratulations to the Watson team and their stunning results! More info: http://blog.wikimedia.org/2013/07/16/ibm-research-watson-aaai-prize-wikimedia-foundation/ Deutsch: http://blog.wikimedia.de/2013/07/16/ibm-research-spendet-preisgeld-des-aaai-feigenbaum-preises-fur-watson-an-die-wikimedia-foundation/
[Wikidata-l] 239 Million language links removed from the Wikipedias
In June 2012 I ran an analysis to discover how many language links there were on Wikipedia. Last week, I reran the analysis - and the results are stunning. Of the 240 million language links, 239.2 million have been removed so far. This is an amazing result by the community. Congratulations! Last year, 4.9 GB of text was required to represent the language links. These have almost completely gone. And whereas last year, for the smaller Wikipedias, the language links made up a substantial part of their content, they have now almost completely disappeared. Congratulations! Let's get ready for having the same positive effect on Wikivoyage, starting next week! (Note that the deployment might happen on a Tuesday for a change, as Monday will be blocked for a few other deployments.) Here is the full data: 2013 analysis: http://simia.net/languagelinks/2013.html 2012 analysis: http://simia.net/languagelinks/index.html Addshore is currently working on getting some actionable analytics out of the dumps, in order to deal with the last remaining language links. Cheers, Denny
[Wikidata-l] Propagation of changes to the Wikipedias currently lagging
copied from http://www.wikidata.org/wiki/Wikidata:Project_chat#Propagation_of_changes_to_the_Wikipedias_currently_lagging Changes to Wikidata are currently propagated to the Wikipedias with a lag of several hours, but this should be fixed during the next few hours. The Dispatcher, which is responsible for pushing the edits from Wikidata to the individual Wikipedias, choked yesterday on some edit. We did not notice until the morning (thanks to the community for reporting in various places). We got the Dispatcher running again. The backlog then was about 19 hours, and it is now going down again, seemingly at a rate of about two hours per hour, so it should have caught up in about half a day. You can see the [[Special:DispatchStats|current status on wiki]]. We currently do not know why the Dispatcher stalled, nor on which edit exactly. We simply skipped a few edits, and it started working again. We will continue investigating. Because of that, it might happen again at any time. We keep watching the stats. A detailed description of our current status can be found on the [http://article.gmane.org/gmane.org.wikimedia.wikidata.technical/117 Wikidata-tech mailing list]. --[[User:Denny|Denny]] ([[User talk:Denny|talk]]) 11:32, 13 July 2013 (UTC)
[Wikidata-l] A personal note, and a secret
I am truly and deeply amazed by the Wikidata community. A bit more than a year ago, I moved to Berlin and assembled a fantastic team of people to help realize a vision. Today, we have collected millions of statements, geographical locations, points in time, persons and their connections, creative works, and species - and every single minute, hundreds of edits are improving and changing this knowledge base that anyone can edit and that anyone can use for free. So much more is left to do, and the further we go, the more opportunities open up. More datatypes - links are on the horizon, and quantities will be a major step. I can hardly wait to see Wikidata answer queries. And there are so many questions unanswered - what does the community need in order to maintain Wikidata best? Which tools, reports, and special pages are needed? What is the right balance between automation and flexibility? Besides Wikipedia, Wikidata can be used in many other places. We just started the conversations about sister projects, but external projects are also expected to become smarter thanks to Wikidata. I expect tools and libraries and patterns for these types of uses will emerge in the next few months, and applications will become more intelligent and act more informed, powered by Wikidata. A project like Wikidata needs in its early days a strong, sometimes stubborn leader in order to accelerate its growth. But at some point a project gathers sufficient momentum, and the community moves faster than any single leader could lead; suddenly the leader might become a bottleneck, and instead of accelerating the project they might be stalling it. Wikidata has reached the point where it is time for me to step down. The Wikidata development team in Berlin will, in the upcoming weeks and months, set up processes that allow the community, which I have learned to trust even more during this year, to take over the reins. 
I will stay with the team until the end of September, and then become again what I have been for the last decade - a normal and proud member of the Wikimedia communities. I would also like to use this chance to reveal a secret. Wikidata items are identified by a Q followed by a number, Wikidata properties by a P followed by a number. Whereas it is obvious that the P stands for property, some of you have asked - why Q? My answer was that Q not only looks cool, but also makes for great identifiers, and hopefully a certain set of people will some day associate a number like Q9036 with something they can look up in Wikidata. But the true reason is that Q is the first letter of the name of the woman I love. We married last year, amid all that Wikidata craziness, and I am thankful to her for the patience she had while I was discussing whether to show wiki identifiers or language keys, which bugs to prioritize when, and which calendar systems were used in Sweden. I will continue to be a community member of Wikidata. My new day job, though, will be at Google, and from there I hope to continue to effectively further our goals towards a world where everyone has access to the sum of all knowledge. Sincerely, Denny Vrandečić
Re: [Wikidata-l] Wikidata Properties Search
Hi Hady, use the MediaWiki API, like this: http://www.wikidata.org/w/api.php?action=query&list=allpages&format=json&apnamespace=120&aplimit=10 You can page through all the results using http://www.wikidata.org/w/api.php?action=query&list=allpages&format=json&apnamespace=120&aplimit=10&apcontinue=P110 etc. 2013/7/10 Hady elsahar hadyelsa...@gmail.com: Hi, I'm playing around with creating some mappings between DBpedia properties and Wikidata ones. In order to visualize the properties and make an easy search, I wanted to create a simple file containing the property URIs and their labels in English. I tried to scrape those URIs from 1 to 200, for example http://www.wikidata.org/wiki/Special:EntityData/P164.nt I noticed that Wikidata properties are not in numerical order like the entities; lots of properties are not there. Is there a better practice, or should I just ignore empty properties? Which number intervals should I use? Thanks. Regards - Hady El-Sahar Research Assistant Center of Informatics Sciences | Nile University http://nileuniversity.edu.eg/ email: hadyelsa...@gmail.com Phone: +2-01220887311 http://hadyelsahar.me/ http://www.linkedin.com/in/hadyelsahar
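The continuation scheme Denny describes can be sketched in a few lines of Python. The `action=query&list=allpages` module and its `apnamespace`/`aplimit`/`apcontinue` parameters are the ones from the mail; the helper function name is illustrative.

```python
# Sketch: paging through all property pages (namespace 120) on Wikidata
# with the MediaWiki allpages API, as described above. The helper name
# is illustrative; the API parameters are the ones from the mail.
from urllib.parse import urlencode

API = "http://www.wikidata.org/w/api.php"

def allpages_url(apcontinue=None, limit=10):
    """Build the query URL; pass the previous batch's continue value to page on."""
    params = {
        "action": "query",
        "list": "allpages",
        "format": "json",
        "apnamespace": 120,  # the Property: namespace
        "aplimit": limit,
    }
    if apcontinue:
        params["apcontinue"] = apcontinue
    return API + "?" + urlencode(params)

first = allpages_url()                    # first batch
second = allpages_url(apcontinue="P110")  # next batch, as in the mail
```

Fetching each URL with any HTTP client returns a JSON batch of existing property pages; repeating with the continue value from each response walks the whole namespace and naturally skips deleted or never-created property numbers, which answers the "empty properties" question.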
Re: [Wikidata-l] Update nl-wiki request for bot
I just wanted to say thank you! That's truly amazing work. As far as I can tell, more than 200 million lines of wikitext have so far been removed from the Wikipedias. That's 200 million lines that do not have to be maintained anymore. (I have not run the actual analysis yet; I have been waiting for the bots to finish their job, but maybe I should, as it is pretty much exactly a year since I ran the analysis on the pre-Wikidata-age Wikipedia dumps.) You are amazing! Cheers, Denny 2013/7/8 addshorewiki addshorew...@gmail.com: For the bot removing interwiki links that are redirects etc., my new code should be ready by this weekend (I hope), and this should give the lists I have a big clear-out! :) Addshore On 8 Jul 2013 04:32, Romaine Wiki romaine_w...@yahoo.com wrote: Today we reached at nl-wiki the situation that more than 64% of the interwiki conflicts have been solved. A lot of this work has been done by the Dutch community, but a lot of work is also done by users from other projects - thank you very much for the help! I have checked the complete template namespace and category namespace for local interwikis and all are removed from these pages, so these namespaces are now clean on nl-wiki. If users from especially the smaller Wikipedias want to know on what pages of their wiki local interwikis are left, you can use AWB, download the latest database dump and run a query on that dump. If you want to know what query you need exactly, e-mail me personally, as the string of the query is a bit long. But it is easy to do, even for newcomers to bots and code. (I can also do it for you.) While doing all this solving of interwiki conflicts, we came across several things: * A lot of biological conflicts are in our list of interwiki conflicts. Certain genera have only one species under them, which makes some Wikipedias combine the two into one article, while others want two articles, as they are two layers in the taxonomical tree. 
One article on the English Wikipedia that created hundreds of interwiki conflicts was a list to which many redirects were linking, and these redirects were used for interwikis. All have been removed with a bot. * Another thing we noticed is that a lot of renamings of articles to make room for a disambiguation page haven't been properly executed, as on Wikidata, in an item of a group of articles, one of the links was to a disambiguation page. (It would be nice if a bot could check for disambiguation pages (based on the presence of a template from [[MediaWiki:Disambiguationspage]] on that wiki) so that we know where we need to fix this.) * Another thing we see is that a lot of interwikis are still local because the local interwiki links to a page that is a redirect because the page was renamed, while this wasn't changed by a bot. Most interwiki bots do not recognize that the redirect is the same page as the one added to Wikidata. So we need a bot to remove all interwikis that link to a redirect pointing to a page that is in the same item as the page where the local interwikis are. Let's clean this mess up! Romaine --- http://www.wikidata.org/wiki/User:Romaine
Re: [Wikidata-l] Some Wiktionary data in Wikidata
Done this change.

2013/6/20 Denny Vrandečić denny.vrande...@wikimedia.de wrote: Thinking about it again, and discussing it internally, maybe we should replace "word" with "expression" and "meaning" with "sense"? Any +1s or differing opinions?
Re: [Wikidata-l] Some Wiktionary data in Wikidata
It was never intended to create a Wiktionary database separate from Wikidata, but to have it be a part of Wikidata.

2013/6/21 Gerard Meijssen gerard.meijs...@gmail.com wrote: Hoi, Denny, when you look at the data currently in Wikidata, you find what is in essence more than a basis for a translation dictionary. The notion that we need something separate is a notion you should reassess. What we need is some clean-up of the labels currently in use. What we also need are more definitions. We do not need another Wikidata for Wiktionary. Thanks, GerardM
Re: [Wikidata-l] Some Wiktionary data in Wikidata
Thank you, Sundar!

2013/6/20 BalaSundaraRaman sundarbe...@yahoo.com wrote: Hi Denny, I've left a message at the Tamil Wiktionary Village Pump. http://ta.wiktionary.org/w/index.php?title=%E0%AE%B5%E0%AE%BF%E0%AE%95%E0%AF%8D%E0%AE%9A%E0%AE%A9%E0%AE%B0%E0%AE%BF:%E0%AE%86%E0%AE%B2%E0%AE%AE%E0%AE%B0%E0%AE%A4%E0%AF%8D%E0%AE%A4%E0%AE%9F%E0%AE%BF&diff=1194066&oldid=1194039 Cheers, Sundar

"That language is an instrument of human reason, and not merely a medium for the expression of thought, is a truth generally admitted." - George Boole, quoted in Iverson's Turing Award Lecture
Re: [Wikidata-l] Some Wiktionary data in Wikidata
Thanks, I did, and did so now again. As far as I can tell, it seems compatible (and would even be compatible with the simpler current Wikidata model, actually). Cheers, Denny

2013/6/19 Tom Morris tfmor...@gmail.com wrote: If you haven't already, it might be worth looking at the Freebase schema for WordNet, especially how it connects synsets to Freebase topics: https://www.freebase.com/base/wordnet/synset?schema= Tom
Re: [Wikidata-l] Some Wiktionary data in Wikidata
The current proposal does not cover grammar rules explicitly. If at all, I would regard that as a later extension once the lexical information is in place. Also, my limited understanding of the topic does not even allow for coming up with a data model to cover grammar rules, or for knowing whether there is a sufficiently widely accepted model for representing grammar, or whether there are still discussions about whether Chomsky or Systemic Functional Grammars or whatever else would make the cut...

Regarding "word" vs. "expression": I do not care much about the actual term, and both seem valid. With the suggested change from "meaning" to "word sense", though, it might make more sense to keep "word" here. But as said, no strong opinion. I definitely see that saying that http://en.wiktionary.org/wiki/carry_coals_to_Newcastle is a "word" is kind of weird; "expression" would fix that. Any further opinions? Cheers, Denny

2013/6/19 David Cuenca dacu...@gmail.com wrote: Hi Denny, Thank you very much for this fantastic update about the intentions of supporting a semantic dictionary in Wikidata :) Just a minor correction: I think instead of "word" it should be "expression", because some languages don't follow the same logic. On the other hand, do you think it would be possible to accommodate grammar rules too? I have added some people from Apertium who might have some insights about it. Cheers, Micru -- Etiamsi omnes, ego non
Re: [Wikidata-l] Some Wiktionary data in Wikidata
Thinking about it again, and discussing it internally, maybe we should replace "word" with "expression" and "meaning" with "sense"? Any +1s or differing opinions?
[Wikidata-l] Some Wiktionary data in Wikidata
Hello, I would like everyone interested in the interaction of Wikidata and Wiktionary to take a look at the following proposal. It tries to serve all use cases mentioned so far while still remaining fairly simple to implement. http://www.wikidata.org/wiki/Wikidata:Wiktionary

To the best of our knowledge, we have checked all discussions on this topic, as well as related work like OmegaWiki, WordNet, etc., and are building on top of that. I would greatly appreciate it if some liaison editors could reach out to the Wiktionaries in order to get a wider discussion base. We are currently reading more of the related work and trying to improve the proposal. It would be great if we could keep the discussion on the discussion page on the wiki, so as to bundle it a bit, or at least have pointers there. http://www.wikidata.org/wiki/Wikidata_talk:Wiktionary

Note that we are giving this proposal early. Implementation has not started yet (obviously, otherwise the discussion would be a bit moot), and this is more of a mid-term commitment (i.e., if the discussion goes smoothly, it might be implemented and deployed by the end of the year or so, although this obviously depends on the results of the discussion). Cheers, Denny

-- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Visualisations of The Most Unique Wikipedias According to Wikidata
Can I have a statement about how much easier it would have been with Wikidata? :)

2013/6/13 Brent Hecht bhe...@cs.umn.edu Hi all, In my (recently finished) thesis, I looked at a lot of different properties (e.g. topic, centrality, popularity via pageviews) of common and unique concepts across multilingual Wikipedia. It's all in Chapter 3: http://www-users.cs.umn.edu/~bhecht/publications/bhecht_thesis_final.pdf. A lot of these questions were addressed in the pre-Wikidata era :-) - Brent

Brent Hecht, Ph.D. Assistant Professor Department of Computer Science and Engineering University of Minnesota e: bhe...@cs.umn.edu t: @bhecht w: http://www-users.cs.umn.edu/~bhecht/

On Jun 13, 2013, at 12:33 PM, Klein,Max kle...@oclc.org wrote: That's an excellent recommendation. I will attempt to research the common properties of the least unique Wikidata items. Maximilian Klein Wikipedian in Residence, OCLC +17074787023

From: wikidata-l-boun...@lists.wikimedia.org on behalf of Paul A. Houle Sent: Thursday, June 13, 2013 6:57 AM To: Discussion list for the Wikidata project. Subject: Re: [Wikidata-l] Visualisations of The Most Unique Wikipedias According to Wikidata

I think Poland may do better than average because Polish people, out of national pride, have made a special effort to be well documented in the English Wikipedia and to represent a Polish point of view on topics like the city of Gdansk. One fascinating thing about Wikidata is that it provides access to all of the wonderful concepts shared in the Wikiverse, so now sites like Ookaboo can collect pictures of many beautiful places that don't exist in en Wikipedia. On the other hand, I'm also interested in the other end of the curve: those elite concepts which are represented widely across the Wikipedias. Surely this is connected with subjective importance, with some flavor of global appeal, whatever that would turn out to mean. Any chance you could run a report on those?
-Original Message- From: Mathieu Stumpf Sent: Thursday, June 13, 2013 4:51 AM To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Visualisations of The Most Unique Wikipedias According to Wikidata

On 2013-06-12 22:22, Klein,Max wrote: Hello Wikidatians, I made a few visualizations of the distributions of language links in Wikidata items. You can also use these stats to see which items represent Wikipedia articles that are unique to a language, and compare the uniqueness of all languages. I also investigate all the items with just two language links, to look at Wikipedia pairs. See the full analysis: http://notconfusing.com/the-most-unique-wikipedias-according-to-wikidata/ [1]

Interesting! Could you also create that kind of visualisation by topic: how much uniqueness comes from biographies of local football people, compared with historical events or abstract concepts? Also, on a completely unrelated topic, you may explain to me in private what you mean by "Create a communal house to live in", which is on your public todo list; it sounds interesting. :P -- Association Culture-Libre http://www.culture-libre.org/

___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Re: [Wikidata-l] Some Wiktionary data in Wikidata
It's not about the actual content, but rather about the data model.

2013/6/19 Neil Harris n...@tonal.clara.co.uk On 19/06/13 15:03, Tom Morris wrote: If you haven't already, it might be worth looking at the Freebase schema for WordNet, especially how it connects synsets to Freebase topics: https://www.freebase.com/base/wordnet/synset?schema= Tom

WordNet does not seem to be under a free license -- see http://wordnet.princeton.edu/wordnet/license/ Since Wikidata's CC0 licensing allows commercial use, surely integrating any kind of data from WordNet risks conflict with WordNet's license? Neil
[Wikidata-l] Usage of Wikidata: the brilliance of Wikipedians
I am completely amazed by a particularly brilliant way that Wikipedia uses Wikidata. Instead of simply displaying the data from Wikidata and removing the local data, a template and workflow is proposed which:

* grabs the relevant data from Wikidata
* compares it with the data given locally in the Wikipedia
* displays the Wikipedia data
* adds a maintenance category in case the data is different

This allows both communities to check the maintenance category, provides a safety net against vandal changes, still notices if some data has changed, etc., and lets the local data be phased out over time when the community gets comfortable and if it wants to. It is a balance of maintenance effort and data quality. I am not saying this is the right solution for every use case, every topic, or every language. But it is a perfect example of how the community will surprise us by coming up with ingenious solutions if they get enough flexibility, powerful tools, and enough trust. Yay, Wikipedia!

The workflow is described here: http://en.wikipedia.org/wiki/Template_talk:Commons_category#Edit_request_on_24_April_2013:_Check_Wikidata_errors

There is an RFC currently going on about whether and how to use Wikidata data in the English Wikipedia, coming out of the discussion that was here a few days ago. If you are an English Wikipedian, you might be interested: http://en.wikipedia.org/wiki/Wikipedia:Requests_for_comment/Wikidata_Phase_2

-- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
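The four-step workflow boils down to a small piece of compare-and-flag logic. Here is a minimal sketch in Python; the actual template is written in wikitext, and the function and category names below are illustrative, not the ones used on enwiki.

```python
def render_with_wikidata_check(local_value, wikidata_value):
    """Sketch of the compare-and-flag workflow: display the local value,
    but flag a maintenance category when Wikidata disagrees.
    Names are illustrative, not the actual enwiki template."""
    maintenance_categories = []
    if local_value is not None:
        display = local_value  # the Wikipedia data wins for display
        if wikidata_value is not None and local_value != wikidata_value:
            # differing data is flagged for human review, never overwritten
            maintenance_categories.append(
                "Category:Commons category differs from Wikidata")
    else:
        display = wikidata_value  # no local data: fall back to Wikidata
    return display, maintenance_categories
```

The key design choice is that a mismatch never silently changes what readers see; it only queues the page for a human to reconcile, which is exactly the safety net described above.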
Re: [Wikidata-l] How does wikidata handle topic redirect/merge/split
2013/4/7 Jianyong Zhang zhjy...@gmail.com On Tue, Apr 2, 2013 at 9:54 PM, Denny Vrandečić denny.vrande...@wikimedia.de wrote: 2013/4/1 Jianyong Zhang zhjy...@gmail.com 1) It becomes a redirect to another article; will Qx be changed in this scenario? I expect that if a Wikipedia article gets moved, this will be updated on the Wikidata item manually. Otherwise the language links that were displayed on the original article would not show up. If an article gets turned into a redirect to an already existing article, this would be a merge (see Question 4). Thanks for the detailed reply. I still have some questions on redirects. See, if the article A redirects to the article B, will we have 2 items, or only one for the final target and all its redirects? From my point of view, if a redirect talks about the same topic as its final target, then it makes sense to only have 1 item. Such as http://en.wikipedia.org/wiki/Obama and http://en.wikipedia.org/wiki/Barack_Obama. But in many cases, a redirect talks about a related but different topic from its final target. Such as http://en.wikipedia.org/wiki/Social_activist and http://en.wikipedia.org/wiki/Activism. How will wikidata handle such redirects? And back to the original question: if an article becomes a redirect, for the above 2 different scenarios, what will we do? In general, you can't point to a redirect from Wikidata. When entering one, Wikidata tries to resolve it to the target article and tries to save that - and if there is already an item linked to that article, the save will fail. If someone turns an article into a redirect, we don't notice that automatically. I hope that bots will clean that up over time, but I would consider that an issue with the data. Any data on that item cannot be used on the Wikipedia article, since it is a redirect. I hope that helps, Cheers, Denny -- Project director Wikidata Wikimedia Deutschland e.V.
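The resolve-then-fail behavior Denny describes can be sketched like this; the data structures are hypothetical toy stand-ins, not the actual Wikibase implementation.

```python
# Sketch of the sitelink-saving behavior described above: redirects are
# resolved to their target, and a conflicting link makes the save fail.
# All data structures here are illustrative, not the real Wikibase ones.
def save_sitelink(items, item_id, title, redirects):
    """Attach a Wikipedia article title to an item."""
    title = redirects.get(title, title)  # resolve a redirect to its target
    for other_id, links in items.items():
        if other_id != item_id and title in links:
            # Another item already links to this article: the save fails.
            raise ValueError(f"{title!r} is already linked from {other_id}")
    items.setdefault(item_id, set()).add(title)
    return title

items = {"Q64": {"Berlin"}}
# A redirect is resolved to its target article before saving.
assert save_sitelink(items, "Q76", "Obama", {"Obama": "Barack Obama"}) == "Barack Obama"
# Linking the same article from a second item fails.
try:
    save_sitelink(items, "Q99", "Berlin", {})
    raise AssertionError("expected a conflict")
except ValueError:
    pass
```

Note that, as the message says, this check only runs at save time: an article turned into a redirect *after* the link was saved is not detected automatically.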
[Wikidata-l] Our log table is exploding
Hey all, I just got a warning from Ops that our log table is growing extremely fast. A write-up of this is here: https://bugzilla.wikimedia.org/show_bug.cgi?id=47415 Basically, the vast majority of edits on Wikidata are written to the log table as they are autopatrolled. And since we have a lot of edits, this makes the table grow very quickly. We would like to:
* stop logging so many edits
* drop those logs that are already there about patrolling
We want to understand how that influences your workflows and what we can do about that. Please speak up if this change would be an issue. Cheers, Denny -- Project director Wikidata Wikimedia Deutschland e.V.
Re: [Wikidata-l] Qualifiers, bug fixes, improved search - all in one night!
Hey Dario, there is one simple fix we want to apply sooner rather than later, which is to use the number of language links for ranking. This should work rather well. The thing is that this is kinda hard to implement in MySQL, I figured, and that we would need to use something Lucene-based (probably Solr) for that. The Solr extension is quite far along, but we currently are not working on getting it deployed. In short, it's all in the pipes, and it just takes a bit... Cheers, Denny 2013/4/19 Dario Taraborelli dtarabore...@wikimedia.org Hi Lydia and all, great to hear about this deployment, I am particularly excited about qualifier support (as per my previous post). Since you also mention improvements to search, I was wondering whether you had specific plans for work on search functionality. Unless I use the Items by title page, if I type Berlin into a regular search form the item I am actually looking for (Q64) is ranked #34 in the search results (i.e. three clicks away on the more link). I'd be curious to hear the team's thoughts on how to make search more effective and user-friendly. Dario On Apr 18, 2013, at 2:26 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: Heya folks :) We have just deployed qualifiers ( http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer#Qualifiers ) and bug fixes on wikidata.org. Qualifiers! The bug fixes are especially for Internet Explorer 8. Please let me know how it is working for you now if you're using IE8 and if there are still any major problems when using Wikidata with it. In addition, the script we ran on the database to make search case-insensitive has finished running. This should be another huge step towards a nice search here. (This change also affects the autocomplete for items and properties.) As usual, please let me know what you think and tell me if there are any issues. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata Wikimedia Deutschland e.V. Obentrautstr. 72 | 10963 Berlin | www.wikimedia.de
Re: [Wikidata-l] Page history and properties
This is in my opinion an upstream issue for MediaWiki proper. I do not think that templates and images from Commons are that different. Take this image for example: https://en.wikipedia.org/wiki/File:Treaty_of_Accession_2011_Ratification_Map.svg It always reflects the current state of ratification. Take the templates that display the conservation status of species in Wikipedia. They encode a whole lot of knowledge about different preservation status systems, and if they change, this is also not preserved anywhere in the history. https://en.wikipedia.org/wiki/Wikipedia:Conservation_status I agree that this is an issue. But our solution is consistent with the way it is done in other parts of Wikipedia, and a solution should not address Wikidata in isolation but Wikipedia as a whole. One way would be to create HTML dumps of Wikipedia at regular intervals, as, e.g., the Internet Archive does. A much more thorough discussion of this issue can be found in a RENDER deliverable I co-authored in 2010: http://render-project.eu/wp-content/uploads/2010/05/D1.1.2.pdf Cheers, Denny 2013/4/4 Gregor Hagedorn g.m.haged...@gmail.com when templates (or, in the case of wikidata, properties) get deleted or renamed. Nobody has come up with a good solution yet. I think we did discuss a simple, working solution: saving the value together with the Wikipedia page. The major argument against that was: it is a waste of storage to create a new Wikipedia page (perhaps daily) when property values included in a page are changed in Wikidata. I personally value trust and documentation of change much higher than disk storage, but even then, there are ways to balance this. So perhaps a modified proposal that matches the current development stage: if an editor saves a page with {{#property:population}}, the parser looks up the current value and changes this to {{#property:population|current value=2348732}}, and stores this wikitext version in the Wikipedia.
The same would apply to updating: saving {{#property:population|current value=2348732}} may result in {{#property:population|current value=2348700}} being saved. This would mean no additional waste of storage for articles that are regularly changed. For those that are not, one could imagine a bot-based monthly update check to make past knowledge transparent. I realize that this would require a pattern where the Wikidata-derived values would remain editable on the topic/article pages, i.e. the property function would have to be inserted in the template call, rather than in the template definition. Those wikidata properties automatically called inside templates, with a dynamic item decided by the current template call, would not be preserved. However, both editing patterns would be available, and it would be up to the community of each Wikipedia to choose the preferred one. (As I said previously: although similar to the issue of Commons images and templates, the issue at stake for Wikidata is different. Because of the problems in preserving a transparent editing history, updates to Commons images are generally restricted to truly minor improvements (contrast, cropping, better resolution, etc.). I am not aware of cases where Commons images are regularly replaced with updated content that is different in substance and thus automatically changes all Wikipedia pages, representing different knowledge. I don't want to exclude this, but even for changing company logos the usual solution is to create a new name, preserving the old logo. Similarly, templates may fail to work in old versions (big problem!), but I am not aware that a template would render out-of-time information when viewing a past revision. Thus, the problem of Wikidata with respect to endangering the trust basis of Wikipedia, the version system, is related, but different.)
Gregor
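Gregor's proposed parser transformation could be sketched as below. The `current value=` parameter syntax comes from his message; the regex and the lookup function are illustrative only, and a real implementation would live in the MediaWiki parser, not in Python.

```python
import re

# Sketch of the proposal above: on every save, the parser caches the
# current Wikidata value inside the property invocation itself, so the
# stored wikitext revision documents what the reader actually saw.
def cache_property_values(wikitext, lookup):
    def repl(match):
        name = match.group(1)
        return "{{#property:%s|current value=%s}}" % (name, lookup(name))
    # Matches both bare invocations and previously cached ones.
    pattern = r"\{\{#property:(\w+)(?:\|current value=[^}]*)?\}\}"
    return re.sub(pattern, repl, wikitext)

text = "Berlin has {{#property:population}} inhabitants."
cached = cache_property_values(text, lambda name: "3500000")
assert cached == "Berlin has {{#property:population|current value=3500000}} inhabitants."
# A later save refreshes the cached value in place instead of appending:
assert cache_property_values(cached, lambda name: "3501000").count("3501000") == 1
```

Because an unchanged value produces identical wikitext, this only costs a new revision when the Wikidata value actually differs, which is the storage trade-off discussed in the message.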
Re: [Wikidata-l] [Wikitech-l] Wikidata queries
2013/3/28 Petr Onderka gsv...@gmail.com How will the queries be formatted? Do I understand it correctly that a QueryConcept is a JSON object? Not decided yet. Probably it will be a JSON object, though, and edited through a UI. Have you considered using something more in line with the format of action=query queries? Yes, but it didn't seem a good fit, especially because the query module works on the page metadata and the queries we discuss here work on the item data. Though I guess what you need is much more complicated and trying to fit it into the action=query model wouldn't end well. I am not even sure it is much more complicated. But I am very worried it is too different. Cheers, Denny Petr Onderka [[en:User:Svick]] 2013/3/28 Denny Vrandečić denny.vrande...@wikimedia.de: We have a first write-up of how we plan to support queries in Wikidata. Comments on our errors and requests for clarifications are more than welcome. https://meta.wikimedia.org/wiki/Wikidata/Development/Queries Cheers, Denny P.S.: unfortunately, no easter eggs inside.
Re: [Wikidata-l] How does wikidata handle topic redirect/merge/split
Hi Alex, the current implementation of Wikidata supports the same level of history as MediaWiki itself, i.e. templates, images from Commons, and data from Wikidata have their own versioning schemes, and all information about their history is retained -- but when a page is rendered from a previous version, then the current templates, images, and data are displayed. While I know many people who share your opinion, I also think it is important to be consistent in this case. Cheers, Denny 2013/4/2 tanch...@mskcc.org Thank you Michael (and apologies to Denny for being addressed as Danny). We all know change is a constant, and we need to design information technology with that in mind. As you noted, population is a great example of something that is constantly changing. Including a date with population makes sense in certain situations. However, there may be cases when the data related to a specific article changes, where recent changes, the watchlist, and perhaps the history should also be updated. This can only happen when there is a two-way reference from the article to the data and back. Also, preserving the context of a page at that point in time can be valuable. Anyway, perhaps coding for the date as in the population example will be sufficient in most cases. I do somehow feel that we should make it easy for humans to create the articles and let the machine record the hard references, and allow humans some means to recognize that the associated data in the articles they are subscribed to has changed. Thanks again. Best Regards, Alex On Apr 2, 2013, at 12:15 PM, Michael Hale hale.michael...@live.com wrote: Well, you can still view the revision history of an item on Wikidata, as you'd expect. I view the information as being more tied to a specific reference than to a specific revision of the item. I don't think the notion of orphaned data is as big of a deal in a database as it is in an encyclopedia.
We can monitor the creation of new items the same way that new articles are monitored on the encyclopedia. Especially with historical data, it might not be currently included in any sites that we know, but it should still be there for when people want to make historical charts for reports, school projects, etc. The two methods we have under development to improve the situation are ranks and qualifiers. Ranks let you differentiate between multiple claims about a property as to which one is preferred (likely the one with the most reputable reference), and qualifiers are that extra bit of information that lets you differentiate multiple claims in a way that is appropriate for the property (perhaps a date for population values). Do you think these methods will be satisfactory for your concerns? -- From: tanch...@mskcc.org To: wikidata-l@lists.wikimedia.org Date: Tue, 2 Apr 2013 14:23:13 + Subject: Re: [Wikidata-l] How does wikidata handle topic redirect/merge/split Hi Danny, I've been on the distribution list since the development of wikidata started, and I think what everyone has set out to do and accomplished so far is amazing and will have a profound impact, just as Wikipedia has. I've been quietly on the sidelines absorbing some (I have to admit I can't follow all) of the intellectual discussions among the participants. I do have a thought about this issue of referential integrity and orphaned data that I'd like to share. Mediawiki has what links here for an article, at least for information residing on the same site. It also maintains what a page looks like at a point in time. Since data referenced on a specific edition/revision of an article can now reside outside of that article, the intent of the information in the article will be lost if it is not tied to the revision of the associated data when that information changes.
One way that this can probably be handled in some future implementation, if not already done, is to also carry within the reference the timestamp of the referenced data, as well as a backward reference from the data. It will be difficult and cumbersome for humans to do this, but as the link is stored in the mediawiki site, code can be added to make the reference. In that process, it can also inform the host of the data to add it to the what links here, so there is a backward reference. To prevent spam and other issues such as performance, only approved sites (such as wikipedia sites) can be added to what links here. Feel free to include back the distribution list in your reply if you see merit in this suggestion. Best Regards, Alex On Apr 2, 2013, at 9:54 AM, Denny Vrandečić denny.vrande...@wikimedia.de wrote: Hi Janyong, as Michael said, Wikidata does not automatically get updated in any case. We are planning to improve the experience with moving a page in the Wikipedias a bit, but it won't become automatic. Mostly because these issues are in general
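Michael's description of ranks and qualifiers earlier in this thread can be sketched with toy structures. These are illustrative only and not the actual Wikibase data model; the rank names are the three mentioned in the Wikidata design ("preferred", "normal", "deprecated").

```python
from dataclasses import dataclass, field

# Toy sketch of ranks and qualifiers as described in the thread above
# (illustrative structures, not the real Wikibase data model).
@dataclass
class Claim:
    prop: str
    value: str
    rank: str = "normal"  # "preferred", "normal", or "deprecated"
    qualifiers: dict = field(default_factory=dict)  # e.g. {"point in time": "2013"}

def best_claims(claims):
    """Prefer 'preferred'-ranked claims; otherwise fall back to 'normal' ones."""
    preferred = [c for c in claims if c.rank == "preferred"]
    return preferred or [c for c in claims if c.rank == "normal"]

claims = [
    Claim("population", "3400000", qualifiers={"point in time": "2010"}),
    Claim("population", "3500000", rank="preferred",
          qualifiers={"point in time": "2013"}),
]
assert [c.value for c in best_claims(claims)] == ["3500000"]
```

This shows why, as Dario argues later in this file, outdated values need not be deleted: the old population stays as a normal-ranked claim with its own date qualifier, while consumers pick the preferred one.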
[Wikidata-l] Wikidata queries
We have a first write up of how we plan to support queries in Wikidata. Comments on our errors and requests for clarifications are more than welcome. https://meta.wikimedia.org/wiki/Wikidata/Development/Queries Cheers, Denny P.S.: unfortunately, no easter eggs inside. -- Project director Wikidata Wikimedia Deutschland e.V.
Re: [Wikidata-l] A data model for Roman forts (castra)
Oh, I would ask you to please wait another week or two, for us to have qualifiers. Maybe they can deal with some of these cases. We just got them demoed today, and they really look neat, so I am very convinced they will be there with the next update. 2013/3/27 Michael Hale hale.michael...@live.com I think you can use either in other contexts. I see the tradeoff being that the inclusion syntax and templates for referring to properties of another item might be slightly longer. For the example of the children of Charles Dickens you definitely want to have each child be their own item, as it currently is: http://www.wikidata.org/wiki/Q5686. The question for construction phases of Roman forts is whether or not each phase has enough information to justify being a complete item. Although if you are ready to go right now, then it isn't really a question, because qualifiers aren't implemented yet. I suggest creating the extra properties you need and creating separate items for the construction phases, even if they only have a few properties each; then later on, if downgrading them to just qualified values of properties for the main fort item simplifies some of your queries, inclusion syntax usage, and template boxes, then you can certainly do that. -- Date: Wed, 27 Mar 2013 20:44:39 +0200 From: saturn...@gmx.com To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] A data model for Roman forts (castra) Thanks Denny and Michael, it really helps. It is indeed a major difference between string enumerations of XSD and values that should link to other items, resulting in a knowledge network. Choosing between lists of values and an own item as value, I would usually prefer an own item, because it could be used in other contexts, e.g. a sentence. A good example could be The children of Charles Dickens - Q were younger than On 03/27/2013 04:53 PM, Michael Hale wrote: Regarding the construction phases complex type that you want, you have a couple of options.
Properties support lists of values, so you could split the type into multiple properties and give a list of values for each. Then, when qualifiers are added, you could add a date range qualifier to each value to specify the phase, or just use string qualifiers that say phase 1, phase 2, etc., depending on how detailed the information is. You can also group the properties in their own item. So you could create an item called Potaissa phase 1 construction details and then have construction phase just be a list of those specific items, which themselves contain the information. -- Date: Wed, 27 Mar 2013 12:55:18 +0100 From: denny.vrande...@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] A data model for Roman forts (castra) I would say this is a good starting point. The Wikidata data model is described in full detail here [1], and an introductory primer is given here [2]. Qualifiers are not implemented yet, but will be there soon, followed by more datatypes (like time, geo, etc.). The major difference is that values like stone for material or opus-quadratum for technique should not be strings - this does not translate well. They should point to items, e.g. Q8063 instead of stone and Q2631941 instead of opus-quadratum. The other thing is that Wikidata does not really intend to enable constraints in the very strong sense that your schema chooses. So if someone wants to add a value for material that you did not preconceive, like Q40861 (Marble), Wikidata-as-a-software will not stop them from doing so (just as Wikipedia-as-a-software does not stop you from entering that, either) (see also [3]).
I hope this helps, Denny [1] http://meta.wikimedia.org/wiki/Wikidata/Data_model [2] http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer [3] http://blog.wikimedia.de/2013/02/22/restricting-the-world/ 2013/3/27 Flaviu fla...@gmx.com I fully agree with you that the XSD model cannot be precisely integrated into Wikidata, and I also know Wikidata development is in progress. I think I could deal with simple properties like material, but I'm not sure how to deal with complex properties like construction phases. Even if there is no implementation yet, how could these complex properties be defined? On 03/26/2013 11:53 PM, Michael Hale wrote: You can't integrate the XSD model precisely as it is defined, because Wikidata doesn't allow all of the constraints that XSD allows. Specifically, you'll notice that you can't force an item to have a specific property (like the document or epigraphic reference in your model), and enumerations aren't currently supported. Wikidata has a global collection of properties, and any item can use any arbitrary subset of them. The list is here: http://www.wikidata.org/wiki/Wikidata:List_of_properties. Some of the ones you want already exist, like
Re: [Wikidata-l] Expiration date for data
We do have strong types, but only a few of them: item, commons media, string, time, geo, URL. Government leader would not be a supported type. The exact list and details are here: http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Values Cheers, Denny 2013/3/21 Michael Hale hale.michael...@live.com That seems better, to constrain the overall type of a qualifier to any property. It still doesn't feel exactly right, but I'm not sure what would. Now that I think about it more, for the case of heads of government it doesn't seem appropriate to me to use a qualifier at all. It would just be a list of items, which are presumably people. Each of those items would then have a single date or list of dates for start of head of government and end of head of government. The qualifier would be redundant. It seems the downside to having everything be strongly typed, like in Freebase, is that you end up with really weird and specific entity types like government leadership timespan to try to capture all of the details that you want, and the downside to semi-weakly typed items in Wikidata is that you might end up with different items representing the same information with different properties or qualifiers. But I have faith that Wikidata will ultimately work and achieve stability and convergence for the most common types, just like how template boxes naturally emerged on Wikipedia. And I think the key advantage of Wikidata is that it will achieve growth, stability, and convergence without suffocating from having too many weird and specific item types to try to bridge and glue different types of information together. -- Date: Thu, 21 Mar 2013 15:40:39 +0100 From: denny.vrande...@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data We will have a time datatype, and every property is strongly typed. This is also true for properties used as qualifiers. Regarding the priority of qualifiers: very high.
They are the next major UI feature to be deployed, and as far as I can tell from the progress of the team it looks like they will be deployed in April. Cheers, Denny 2013/3/20 Dario Taraborelli dtarabore...@wikimedia.org I disagree, and fully concur with Tom: a generic string type for a datetime qualifier defies the purpose of making wikidata statements well-formed and machine-readable. I don't think we should enforce typing for *all* qualifiers and I second the general organic growth approach, but datetime qualifiers strike me as a fundamental exception. Would you represent geocoordinates as a generic string and wait for organic growth to determine the appropriate datatype? I appreciate the overheads of adding datatype support, but this decision will have a major impact on the shape of collaborative work on wikidata. Denny – on a related note, I wanted to ask you what is the priority of qualifier support relative to the other items you mentioned in your list. As I noted in my previous post, the only way for an editor to correct an outdated statement is to remove information (e.g. Lombardy: head of local government: -Roberto Formigoni +Roberto Maroni ): this information will then be lost forever in an item's revision history. The sooner we introduce basic support for qualifiers, the sooner we can avoid removing valuable information from wikidata entries just for the sake of keeping them up-to-date. Dario On Mar 15, 2013, at 10:09 AM, Michael Hale hale.michael...@live.com wrote: For most of the scenarios I can think of, parsing the dates out of strings that are in a standard format by convention will be much easier. The number of ways people will want to use qualifiers will increase like the number of properties and items. So the way I see it, we have to support string-based qualifiers at the minimum. Then I think we should only support strongly typed qualifiers if performance requires it. 
By setting an update polling frequency on templates that use the information I don't think we'll run into performance issues for most scenarios. Even with this example the qualifier type is a date range, not just a date. So do we want them to have to choose from a large, fixed list of qualifier types or just look at a similar example and set a string to something similar and then gradually enforce types on the most popular uses that we see. I think this type of organic growth as opposed to trying to guess the qualifier types in advance is exactly in the spirit of Wikipedia. -- Date: Fri, 15 Mar 2013 09:58:38 -0400 From: tfmor...@gmail.com To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data On Fri, Mar 15, 2013 at 1:49 AM, Michael Hale hale.michael...@live.com wrote: Yes, I think once qualifiers are enabled you would just have something like: ... Property(head of local government) ...
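The property-level typing Denny describes (every property has one of a fixed set of datatypes, and values are checked against it) can be illustrated with a toy validator. The property-to-datatype assignments and the validation rules here are hypothetical; the authoritative list is on the Data model page linked in Denny's message.

```python
# Toy sketch of strongly typed properties as described above.
# Datatype names follow Denny's list; the checks themselves are illustrative.
DATATYPES = {"item", "commons media", "string", "time", "geo", "url"}

# Hypothetical property definitions for the example.
PROPERTIES = {"head of government": "item", "motto": "string"}

def check_value(prop, value):
    """Reject a value that does not fit the property's declared datatype."""
    dtype = PROPERTIES[prop]
    assert dtype in DATATYPES, f"unknown datatype {dtype!r}"
    if dtype == "item" and not (value.startswith("Q") and value[1:].isdigit()):
        raise TypeError(f"{prop} expects an item ID like Q64, got {value!r}")
    return True

assert check_value("head of government", "Q567")
assert check_value("motto", "Einigkeit und Recht und Freiheit")
try:
    check_value("head of government", "Angela Merkel")  # a string, not an item
    raise AssertionError("expected a type error")
except TypeError:
    pass
```

This is the distinction made in the thread: the *datatype* of a property is enforced by the software, while *which* properties an item carries is left entirely to editors.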
Re: [Wikidata-l] Expiration date for data
It really depends on your definitions :) Items are strongly typed as items. Any item can have any property. And only items can have properties. Time or geocoordinates, e.g., cannot have properties. But yes, there is no forcing of properties onto any item, nor any restriction of the usage of any property. See also here: http://blog.wikimedia.de/2013/02/22/restricting-the-world/ Cheers, denny 2013/3/21 Michael Hale hale.michael...@live.com Yes, I just meant that items aren't forced to have a specific set of properties by the software, so they are essentially weakly typed, right? -- Date: Thu, 21 Mar 2013 16:09:58 +0100 From: denny.vrande...@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data We do have strong types, but only a few of them: item, commons media, string, time, geo, URL. Government leader would not be a supported type. The exact list and details are here: http://meta.wikimedia.org/wiki/Wikidata/Data_model#Datatypes_and_their_Values Cheers, Denny 2013/3/21 Michael Hale hale.michael...@live.com That seems better, to constrain the overall type of a qualifier to any property. It still doesn't feel exactly right, but I'm not sure what would. Now that I think about it more, for the case of heads of government it doesn't seem appropriate to me to use a qualifier at all. It would just be a list of items, which are presumably people. Each of those items would then have a single date or list of dates for start of head of government and end of head of government. The qualifier would be redundant. It seems the downside to having everything be strongly typed, like in Freebase, is that you end up with really weird and specific entity types like government leadership timespan to try to capture all of the details that you want, and the downside to semi-weakly typed items in Wikidata is that you might end up with different items representing the same information with different properties or qualifiers.
But I have faith that Wikidata will ultimately work and achieve stability and convergence for the most common types just like how template boxes naturally emerged on Wikipedia. And I think the key advantage of Wikidata is that it will achieve growth, stability, and convergence without suffocating from having too many weird and specific item types to try to bridge and glue different types of information together. -- Date: Thu, 21 Mar 2013 15:40:39 +0100 From: denny.vrande...@wikimedia.de To: wikidata-l@lists.wikimedia.org Subject: Re: [Wikidata-l] Expiration date for data We will have a time datatype, and every property is strongly typed. This is also true for properties used as qualifiers. Regarding the priority of qualifiers: very high. They are the next major UI feature to be deployed, and as far as I can tell from the progress of the team it looks like they will be deployed in April. Cheers, Denny 2013/3/20 Dario Taraborelli dtarabore...@wikimedia.org I disagree, and fully concur with Tom: a generic string type for a datetime qualifier defies the purpose of making wikidata statements well-formed and machine-readable. I don't think we should enforce typing for *all* qualifiers and I second the general organic growth approach, but datetime qualifiers strike me as a fundamental exception. Would you represent geocoordinates as a generic string and wait for organic growth to determine the appropriate datatype? I appreciate the overheads of adding datatype support, but this decision will have a major impact on the shape of collaborative work on wikidata. Denny – on a related note, I wanted to ask you what is the priority of qualifier support relative to the other items you mentioned in your list. As I noted in my previous post, the only way for an editor to correct an outdated statement is to remove information (e.g. Lombardy: head of local government: -Roberto Formigoni +Roberto Maroni ): this information will then be lost forever in an item's revision history. 
The sooner we introduce basic support for qualifiers, the sooner we can avoid removing valuable information from wikidata entries just for the sake of keeping them up-to-date. Dario On Mar 15, 2013, at 10:09 AM, Michael Hale hale.michael...@live.com wrote: For most of the scenarios I can think of, parsing the dates out of strings that are in a standard format by convention will be much easier. The number of ways people will want to use qualifiers will increase like the number of properties and items. So the way I see it, we have to support string-based qualifiers at the minimum. Then I think we should only support strongly typed qualifiers if performance requires it. By setting an update polling frequency on templates that use the information I don't think we'll run into performance issues for most scenarios. Even with this example the qualifier type is a
Re: [Wikidata-l] Expiration date for data
Hi Dario, two or three features are still missing to enable that (sorted in the order we are probably going to deploy them): * qualifiers * the time datatype * statement ranks As soon as they are available, this can be modeled in a way that it can be useful for projects accessing the data. So, progress, yes, but it's not there yet :) Cheers, Denny 2013/3/14 Dario Taraborelli dtarabore...@wikimedia.org Has there been any progress on time-based qualifiers since this thread? If so, can someone point me to relevant discussions/proposals? Thanks Dario On Oct 11, 2012, at 8:28 AM, Marco Fleckinger marco.fleckin...@gmail.com wrote: Hi, On 11.10.2012 16:12, Lydia Pintscher wrote: On Thu, Oct 11, 2012 at 11:13 AM, bene...@zedat.fu-berlin.de wrote: Is there something like VALID_FROM and VALID_TO in your Database? LB This is basically what the qualifiers do. http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer has more details. Hm, sorry I didn't remember this. Thank you for the reminder! Marco ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
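To make the qualifier idea from this thread concrete: instead of deleting an outdated value (and losing it to revision history), a statement can keep its old value with start/end time qualifiers and a rank. The sketch below is a toy dict model only — the qualifier names and date strings are invented for illustration, not the real data model serialization:

```python
# Toy model of one statement plus its successor: the outdated value is kept,
# bounded by time qualifiers, rather than deleted (illustrative shape only).
statement = {
    "property": "head of government",
    "value": "Roberto Formigoni",
    "qualifiers": {"start time": "2005-04-28", "end time": "2013-03-15"},
    "rank": "normal",
}
successor = {
    "property": "head of government",
    "value": "Roberto Maroni",
    "qualifiers": {"start time": "2013-03-15"},
    "rank": "preferred",
}

def current_value(statements):
    """Pick the statement with no 'end time' qualifier, i.e. still valid."""
    for s in statements:
        if "end time" not in s["qualifiers"]:
            return s["value"]

print(current_value([statement, successor]))  # prints: Roberto Maroni
```

This is exactly the Lombardy example from Dario's mail: both heads of government survive in the data, and consumers choose the current one by qualifier (or by rank).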
Re: [Wikidata-l] question about Inclusion policy discussion
That is a tough question. We are pretty sure that we technically scale quite well, and there is no reason that the community should restrict itself for technical reasons. If the number of items suddenly increases by one or two orders of magnitude, we would probably meet a few hiccups on the way, but the architecture should be able to deal with that. What I am much more worried about, though, is the scaling of the community. One of my statements from my Wikidata talks is: we do not want to become the biggest data heap out there, but rather aim for an organic community that is strong and resilient enough to maintain the data that is being collected. See also Wikidata requirement #6 http://meta.wikimedia.org/wiki/Wikidata/Notes/Requirements (a page worth re-reading). Sometimes it might make sense for Wikidata to bridge and connect to external data sources that have their own way of maintenance and curation. Should the dataset really be merged into Wikidata? Is the data wikilike? Is it used in the Wikimedia projects? Or could it also be provided as a linked open dataset, which is referenced from Wikidata? Just to give an example: sure, one could theoretically start to collect temperature data of a city in hourly measurements*, but it could make more sense to point to an external site that collects this data in a more efficient format, provide the mapping identifiers, and allow for a bot to go there and discover the data. Wikidata in turn could provide an aggregation of the data, which indeed would be used on e.g. Wikipedia and Wikivoyage, but leave the full dataset on the external site. (Which, by the way, would also be a viable solution for datasets which have incompatible licenses.) I hope this makes sense, Cheers, Denny * Actually, this kind of data would probably kill us faster than creating many items, as it would make a single item ginormous. We do not scale that well in that direction. 
2013/3/14 Benjamin Good ben.mcgee.g...@gmail.com I've been struggling to understand what should go into wikidata and what should not. I see that this is because it hasn't been decided yet ;) http://www.wikidata.org/wiki/Wikidata_talk:Notability In helping the community to make this decision I think it would be really helpful for the developers to weigh in on the technical capacity of the envisioned/realized wikidata infrastructure. If we know how big the system could realistically be and continue to work well technically, it might help discussions about how much and what kind of content we should put into it. If the plan is to cope with only a few tens of millions of subjects that is quite different than if the plan allows for the potential creation of billions of items. (Suggesting less inclusive versus more inclusive policies). ? -Ben ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] [Wikimedia-l] Are there plans for interactions between wikidata and wiktionaries ?
There are currently a number of things going on regarding the future of Wiktionary. There is, for example, the suggestion to adopt OmegaWiki, which could potentially complicate a Wikibase solution in the future (but then again, structured data is often rather easy to transform): http://meta.wikimedia.org/wiki/Requests_for_comment/Adopt_OmegaWiki There is this grant proposal for elaborating the future of Wiktionary, which I consider a potentially smarter first step: http://meta.wikimedia.org/wiki/Grants:IEG/Elaborate_Wikisource_strategic_vision There's this discussion on Wikidata itself: https://www.wikidata.org/wiki/Wikidata:Wiktionary And I know that Daniel K. is very interested in working in this direction. Personally, I regard Wiktionary as the third priority, following Wikipedia and Commons. A lot of the other projects -- like Wikivoyage or Wikisource -- can be served with only small changes to Wikidata as it is, but both Commons and Wiktionary would require a bit of thought (and here again, Commons much less than Wiktionary). I would appreciate a discussion with the Wiktionary communities, and also to make them more aware of the OmegaWiki proposal, the potential of Wikidata for Wiktionary, etc. Just to give a comparison: it took a few months to write the original Wikidata proposal, and it was up for discussion for several months before it was decided and acted upon. I would strongly advise to again choose slow and careful planning over hastened decisions. Cheers, Denny 2013/3/9 Mathieu Stumpf psychosl...@culture-libre.org Hello, First, congratulations for all the already achieved great work on the wikidata project. Now I would be interested to know more about future development, especially on interactions with wiktionaries. I think wikidata could help to improve wiktionaries drastically, by unifying not only interlang links, but also definitions and translations. 
More accurately, what I mean is that currently, attached to one wiki article, you usually have several definitions for each language where the word is used. But often when I seek a non-french word in the french wiktionary, looking at the native wiktionary will bring more definitions than what you can find on the french article. I saw that on the english wiktionary, the interface added a quick add feature, which asks the user to fill in a translation for each meaning. That's great and I wish it would be added in all chapters. And I think that we could add even more "hey, what about translating just this little thing" features across all dictionaries by centralizing entries, so that each word is associated with one or several meanings per language. Then all meanings could be redistributed to all wiktionaries, even when no translation is available for a given meaning in the local chapter. In this case we could have an information box that would say: this word has another meaning which hasn't yet been translated into ${local_language}; if you know one of the languages in which a translation is available, please help us to improve the wiktionary. What do you think about such a project, could it work with wikidata? kind regards, mathieu ___ Wikimedia-l mailing list wikimedi...@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
Re: [Wikidata-l] One entity per page question
Yes, every Wikidata page is about one and exactly one entity. There cannot be two entities on one page. Bonnie and Clyde is one entity, designating the pair of people. Bonnie and Clyde each might also be one entity, and there could be relevant connections between the three entities Bonnie, Clyde, and Bonnie and Clyde. 2013/3/6 Yuri Astrakhan yuriastrak...@gmail.com Sk!d, of course if you ask for multiple items, you get multiple items. My question is the difference between the MediaWiki concept of a page, and the wikidata concept of an entity, specifically relating to items and properties (not queries). Are these concepts interchangeable? Is one MediaWiki page the same as one wikidata item, with a one-to-one mapping? On Wed, Mar 6, 2013 at 9:41 AM, swuensch swuen...@gmail.com wrote: It is wbgetentitie*s*; requests like https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q219937|Q42&format=jsonfm give you two entities, Q219937 and Q42. Sk!d On Wed, Mar 6, 2013 at 3:35 PM, Yuri Astrakhan yuriastrak...@gmail.com wrote: During an IRC discussion, I was told that a page in namespace 0 like Q219937 http://www.wikidata.org/wiki/Q219937 does not necessarily have a one-to-one relationship with an entity like Bonnie and Clyde. A wbgetentities (http://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q219937&format=jsonfm) API call gives this: entities: { q219937: { pageid: 214789, ns: 0, title: Q219937, lastrevid: 7969610, modified: 2013-02-27T09:17:25Z, id: q219937, type: item, aliases: { .. How is it possible to have more than one entity in one wiki page titled Q219937, if the entity id is the same as the page title? In what cases would it be used? Is that a needed extra complexity? In the case of Bonnie and Clyde (one wikipage in language A vs two wikipages in B), wikidata can have three entities with links to static redirects, apparently solving the need of one-to-many. 
I am only considering item entities (ns:0), since query pages will obviously have more than one entity associated with them. Thanks! ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- *Severin Wünsch* ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
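For readers unfamiliar with the API shape quoted in this thread, here is a minimal sketch of reading a wbgetentities-style response and checking the one-page-one-entity invariant Denny describes. The JSON is abridged from the response quoted above; no network access is involved, and the check itself is just illustrative:

```python
import json

# A trimmed wbgetentities-style response (shape as quoted in the thread,
# values abridged). Each namespace-0 page maps to exactly one entity.
response = json.loads("""
{"entities": {"q219937": {"pageid": 214789, "ns": 0, "title": "Q219937",
              "id": "q219937", "type": "item"}}}
""")

for key, entity in response["entities"].items():
    # The entity id equals the page title (modulo case): one page, one entity.
    assert entity["id"].upper() == entity["title"].upper()
    print(entity["title"], "->", entity["type"])  # prints: Q219937 -> item
```

Query pages aside (as Yuri notes), this is why the page title and the entity id can be used interchangeably in practice.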
Re: [Wikidata-l] deployment on enwp delayed
We are still working on a postmortem. As of now, it seems there have been some serious memcached failures and some interplay with another software deployment. 2013/2/13 Jan Kučera kozuc...@gmail.com What were the issues in detail? 2013/2/12 Lydia Pintscher lydia.pintsc...@wikimedia.de On Tue, Feb 12, 2013 at 12:57 PM, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: We'll do another attempt later today. There were unfortunately too many other issues unrelated to Wikidata so we also had to call off this one. Sorry. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Community Communications for Wikidata Wikimedia Deutschland e.V. Obentrautstr. 72 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Re: [Wikidata-l] phase 1 live on the English Wikipedia
You have examples of that? Did not happen to my edits (so far). 2013/2/13 Denny Vrandečić denny.vrande...@wikimedia.de Block them until they behave? 2013/2/13 Katie Chan k...@ktchan.info On 13/02/2013 21:01, Lydia Pintscher wrote: Heya :) Third time's a charm, right? We're live on the English Wikipedia with phase 1 now \o/ Details are in this blog post: http://blog.wikimedia.de/2013/02/13/wikidata-live-on-the-english-wikipedia An FAQ is being worked on at http://meta.wikimedia.org/wiki/Wikidata/Deployment_Questions Thanks everyone who helped! :) Now if only those interwiki bots would stop adding links back... KTC -- Experience is a good school but the fees are high. - Heinrich Heine ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Complaint about the partial Phase II deployment
Sven, thank you for your honest opinion, and I know that you are not alone with it - but I also heard a lot of people express excitement and joy about the deployment, and based on the activity it seems that a lot of people like it. We consider ourselves happy to be part of an intelligent and critical, and at the same time sympathetic community. I fully agree that the first deployment of Phase II functionality was early. A lot of features are missing, as we have repeatedly communicated. Was the deployment too early? That I disagree with. It is widely accepted wisdom for software projects to release early, release often. I think that is very valid advice, and I decided to follow it with the project. This allows us to see if some of our basic assumptions work. This allows us to test a few things before we fully commit to them and spend big amounts of effort without any reality check. The project plan as a whole was planned like this. Basically language links are just a test-run for some of the technology we will need in order to implement Phase II. Many back-end features -- the propagation of data from a central repository to the Wikipedias, the way recent changes deals with this, the scalability of some of our assumptions -- are equivalent in Phase I and II. Phase I was always implemented with Phase II in mind. We are doing the same thing now. We implement some features -- namely statements per se, references for statements -- but with a limited set of data sets and with some major limitations. But these will get expanded over time. Compare it with another project like the Visual Editor. It is deployed to the English Wikipedia. Plenty of features are still missing. But only with their current deployment schedule can the VE team gather crucial data for their further development. The main difference is that VE is an opt-in feature -- statements in Wikidata are not, they are there, in your face. I regard a project like Wikidata not as a software development project. 
It is a growing, living socio-technical system, and in this case actually it is one embedded in an even bigger such system, the Wikimedia movement as a whole. We are developing technical features that we think will lead all of us towards our common goals, and then we watch how the communities adapt to them, which social rules they build on them, which technical developments of their own are built on top of ours. We (as the development team) are part of this ecosystem, and we (as all of us Wikimedians) are growing together. Technical possibilities shape the rules the Wikidata editor community agrees on, and the actual usage of the system and your feedback shapes and prioritizes the future technical development that we plan and undergo as the development team. I also see that some decisions of the community are based on the currently available features, but i do not think that this is problematic -- because I am very confident that future new features will continue to shape new rules and that the existing ones will be revisited and updated accordingly. The timing of the deployment of phase II to wikidata.org has nothing to do with the deployment of phase I to the English Wikipedia, which is currently scheduled for Monday. We simply deployed features when they are deemed ready. We do not plan features ahead with the intention to keep interest high, or in order to win editors from other communities, etc. Also, we regard phase II only as sufficiently finished when it is actually deployed to the Wikipedias. And this, obviously, still requires a much better support for references and a bigger number of data types. Also, so far there is no reason to believe that there will be any major problems revealed once we deploy to the English Wikipedia. I regard your feedback as very important, and I am thankful for it. I understand that everyone would like to have all the features immediately. But I disagree with you on the point that we should not have deployed last Monday. 
We were working very hard in order to be able to deploy on Monday, and not wait even longer, and we are very proud with how smoothly it went - fully conscious of the limitations of the current state. The situation will soon improve, and we would like to stay a project for now that deploys new features in a comparably quick succession. After this explanation I hope that I have the support of the Wikidata community to continue in this spirit. Cheers, Denny 2013/2/8 Sven Manguard svenmangu...@gmail.com Hello there. I have been an active and vocal supporter of Wikidata since almost the day it went live, and after giving Phase II a legitimate chance, I have to say that in my opinion the decision to deploy Phase II with only a small number of the expected features has been a massive mistake. Yes, I understand that the project was losing momentum and that several people commented that they felt that there was nothing to do on the project before Phase II hit, however the partial
Re: [Wikidata-l] Bug Report for Q17 Japan - Cannot remove the extra language
I assume you mean the inability to remove the second Yen. This is... interesting. We have right now no idea what is going on. Investigating. Thank you for reporting, Denny 2013/2/6 Napoleon Tan napoleon@gmail.com I think the wikidata Item entry for Japan is corrupted. I cannot remove the extra language no matter what. I think the file is somehow corrupted for this page. http://www.wikidata.org/wiki/Q17 ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] first parts of phase 2 live on wikidata.org now
It is a feature. The reason for that is that in the Wikipedias, when you access the data, you can't use the property names otherwise - they have to be unique. So in order to be able to write {{#property:capital}}, the capital property needs to be unique (and a {{#property:p25}} we considered to be unacceptable usability-wise). That is why we decided that property labels need to be unique (per language). If there are better ways to solve that problem, we are all ears. Cheers, Denny 2013/2/6 Dennis Tobar dennis.to...@gmail.com Hi: As we know, two properties may have the same name. In Spanish we use género for two topics: gender (P21) and genus (P74). The first is related to sex and the second to a taxonomic category. So, if we call both género, the site doesn't allow it (Edit not allowed: Otra propiedad (21) ya tiene la etiqueta género asociada con el código de idioma es - i.e., another property (21) already has the label género associated with the language code es). Is this a bug or a feature? Regards On Wed, Feb 6, 2013 at 5:47 AM, Mathieu Stumpf psychosl...@culture-libre.org wrote: On 2013-02-05 15:58, Lydia Pintscher wrote: On Tue, Feb 5, 2013 at 3:26 PM, Nicholas Humfrey nicholas.humf...@bbc.co.uk wrote: This is fantastic :) you are making amazingly fast progress! Thank you! I have been trying to assign the 'is a' property to David Cameron: http://www.wikidata.org/wiki/Q192 And make him a Politician: http://www.wikidata.org/wiki/Q82955 But it doesn't seem to let me select 'Politician' in the value field. How is the list of allowed values defined? I just set this. This is possible. There is no list of allowed values. All existing items are allowed if it is a property of type item. About relation names, 'is a' is vague, isn't it? I mean, Mr. Cameron may have political activities today, and do something else tomorrow, as he may have done something else before. 
So wouldn't it be interesting to give more accurate information, like he has been UK prime minister since 11 May 2010 (and adding information on the end date of phenomena when possible, which is not the case here)? And then you may put prime minister in a political role category. Now it all depends on the granularity wikidata is aiming for. Cheers Mathieu -- Association Culture-Libre http://www.culture-libre.org/ ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Dennis Tobar Calderón Ingeniero en Informática UTEM
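The per-language label-uniqueness rule Denny describes above can be illustrated with a toy registry. This is a sketch of the constraint only, not the actual Wikibase implementation, and the class and method names are invented:

```python
# Toy illustration of the rule "property labels must be unique per language":
# assigning a label already held by a different property must be rejected.
class LabelRegistry:
    def __init__(self):
        self._labels = {}  # (language, label) -> property id

    def set_label(self, pid, language, label):
        key = (language, label)
        # Re-setting a property's own label is fine; claiming another
        # property's label in the same language is not.
        if self._labels.get(key, pid) != pid:
            raise ValueError(
                f"property {self._labels[key]} already has label {label!r} "
                f"in language {language!r}")
        self._labels[key] = pid

reg = LabelRegistry()
reg.set_label("P21", "es", "género")      # gender: accepted
try:
    reg.set_label("P74", "es", "género")  # genus: rejected, label taken
except ValueError as e:
    print("rejected:", e)
```

This is precisely the "Edit not allowed" error Dennis ran into; the uniqueness makes `{{#property:capital}}`-style lookups by label unambiguous.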
Re: [Wikidata-l] Reification in Wikidata serialisation
We are reluctant, but open, to renaming it. But not to Fact. Statement has the nice ambiguous quality regarding its correctness which Fact lacks. On the other hand, the similarity to rdf:Statement is not merely syntactic, so I do not see too much of an issue here. 2013/2/1 Nicholas Humfrey nicholas.humf...@bbc.co.uk Hello, My colleague Yves Raimond and myself were just having a quick chat about the Wikidata RDF serialisation plans. http://meta.wikimedia.org/wiki/Wikidata/Development/RDF While the reification makes sense, we thought that it looked a bit too much like rdf:Statement. w:Berlin s:Population Berlin:Statement1 . Berlin:Statement1 rdf:type o:Statement . Perhaps you could rename o:Statement to o:Fact instead? nick. - http://www.bbc.co.uk This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this. - ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
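For readers who want to see the pattern under discussion spelled out: the reified statement from the RDF plan can be sketched as below. Only the first two triples appear in the thread; the `o:value` triple and the helper function are my own additions for illustration, and the prefixes are shortened as in Nicholas's example.

```python
# Sketch of the reified-statement pattern from the thread, emitted as
# N-Triples-style lines (prefixes abbreviated; illustrative only).
def reify(subject, prop, statement_id, value):
    """Return the triples linking a subject to a value via a Statement node."""
    return [
        f"{subject} {prop} {statement_id} .",          # as in the thread
        f"{statement_id} rdf:type o:Statement .",      # as in the thread
        f"{statement_id} o:value {value} .",           # assumed for illustration
    ]

for line in reify("w:Berlin", "s:Population", "Berlin:Statement1", '"3500000"'):
    print(line)
```

The indirection through the Statement node is what lets references, qualifiers, and ranks attach to the claim itself, which is why the similarity to rdf:Statement is, as Denny says, more than syntactic.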
Re: [Wikidata-l] wikidata stats (was: getting some stats for the Hungarian Wikipedia)
No, not by design. The design would be to have http://en.wikidata.org/wiki/Breck be an alias for http://www.wikidata.org/wiki/Q123803 but we didn't have yet the time to set this up properly. If anyone knows Apache well and has some time on their hands, please ping me on IRC. Right now, they are all identical, but that's not as planned. Only the one without the language code is supposed to be canonical. Re swuensch: it does not have to do with the viewerlanguage. Re first mail: is there not a plain wd as well (for www.wikidata?) or a www.wd? Should check myself :) Cheers, Denny 2013/2/1 Ed Summers e...@pobox.com Yes, I'm just noticing that there are, for example: http://de.wikidata.org/wiki/Q123803 http://en.wikidata.org/wiki/Q123803 http://www.wikidata.org/wiki/Q123803 Which are identical. Is this by design? //Ed On Fri, Feb 1, 2013 at 5:15 AM, swuensch swuen...@gmail.com wrote: maybe the statistic is splitted up by the viewerlanguage. On Fri, Feb 1, 2013 at 11:11 AM, Ed Summers e...@pobox.com wrote: Diederek van Liere over on the analytics list [1] let me know that webstatscollector was updated to start collecting wikidata stats as of Feb 1st UTC 0:00. Yay. I took a look, and I was a bit confused by the language prefixes, for example: de.wd Wikidata:Hauptseite 14 369565 de.wd Wikidata:Introduction/de 1 48357 de.wd Wikidata:Labels_and_descriptions_task_force/de 1 39804 de.wd Wikidata:Translation_administrators/ca 1 11902 dsb.wd Q54919 1 12281 dsb.wd Special:SetLabel/q54919/en 1 dsb.wd Special:WhatLinksHere/Q54919 1 9607 el.wd File:Wikidata_item_creation_progress.png 1 13545 el.wd Wikidata:Community_portal 1 11816 en.wd Q123801 1 10801 en.wd Q123803 1 10908 en.wd Q123843 1 10872 en.wd Q124027 1 11312 en.wd Q124345 1 11156 en.wd Q14217 1 11485 Does that make sense to anyone? I thought there was just www.wikidata.org? If it doesn't make sense let me know and I will follow up with Diederek. 
//Ed [1] http://lists.wikimedia.org/pipermail/analytics/2013-February/000388.html ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Severin Wünsch ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- Project director Wikidata Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin Tel. +49-30-219 158 26-0 | http://wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
[Wikidata-l] Fwd: Todos for RDF export
2013/1/25 Daniel Kinzler daniel.kinz...@wikimedia.de Hi! I thought about the RDF export a bit, and I think we should break this up into several steps for better tracking. Here is what I think needs to be done: Daniel, I am answering on Wikidata-l, and adding Tpt (since he started working on something similar), hoping to get more input on the open list. I especially hope that Markus and maybe Jeroen can provide insight from the experience with Semantic MediaWiki. Just to reiterate internally: in my opinion we should learn from the experience that SMW made here, but we should not immediately try to create common code for this case. First step should be to create something that works for Wikibase, and then analyze if we can refactor some code on both Wikibase and SMW and then have a common library that both build on. This will give us two running systems that can be tested against while refactoring. But starting the other way around -- designing a common library, developing it for both Wikibase and SMW, while keeping SMW's constraints in mind -- will be much more expensive in terms of resources. I guess we agree on the end result -- share as much code as possible. But please let us not *start* with that goal, but rather aim first at the goal Get an RDF export for Wikidata. (This is especially true because of the fact that Wikibase is basically reified all the way through, something SMW does not have to deal with). In Semantic MediaWiki, the relevant parts of the code are (if I get it right):
SMWSemanticData is roughly what we call Wikibase::Entity
includes/export/SMW_ExportController.php - SMWExportController - main object responsible for creating serializations. Used for configuration; it then calls the SMWExporter on the relevant data (which it collects itself) and applies the defined SMWSerializer on the returned SMWExpData. 
includes/export/SMW_Exporter.php - SMWExporter - takes an SMWSemanticData object and returns an SMWExpData object, which is optimized for export
includes/export/SMW_Exp_Data.php - SMWExpData - holds the data needed for export
includes/export/SMW_Exp_Element.php - several classes used to represent the data in SMWExpData. Note that there is some interesting interplay happening with DataItems and DataValues here.
includes/export/SMW_Serializer.php - SMWSerializer - abstract class for different serializers
includes/export/SMW_Serializer_RDFXML.php - SMWRDFXMLSerializer - responsible for creating the RDF/XML serialization
includes/export/SMW_Serializer_Turtle.php - SMWTurtleSerializer - responsible for creating the Turtle serialization
special/URIResolver/SMW_SpecialURIResolver.php - SMWURIResolver - special page that deals with content negotiation
special/Export/SMW_SpecialOWLExport.php - SMWSpecialOWLExport - special page that serializes a single item
maintenance/SMW_dumpRDF.php - calls the serialization code to create a dump of the whole wiki, or of certain entity types; basically configures an SMWExportController and lets it do its job
There are some smart ideas in the way the ExportController and Exporter are called by both the dump script and the single-item serializer, which allow it to scale to almost any size. Remember that unlike SMW, Wikibase contains mostly reified knowledge. Here is the spec of how to translate the internal Wikibase representation to RDF: http://meta.wikimedia.org/wiki/Wikidata/Development/RDF The other major influence is obviously the MediaWiki API, with its (almost) clean separation of results and serialization formats. While we can also draw inspiration here, the issue is that RDF is a graph-based model and the MediaWiki API is really built for a tree. Therefore I am afraid that we cannot reuse much here.
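The controller/exporter/serializer split in the file list above can be sketched as a generic pattern. This is a minimal illustration only: the class and method names are invented stand-ins, not SMW's actual API, and the "entity" is just a plain dict.

```python
# Minimal sketch of the ExportController -> Exporter -> Serializer split
# described above. All names are illustrative stand-ins, not SMW's real API.

class Exporter:
    """Turns an internal entity (here: a plain dict) into triples."""
    def export(self, entity):
        subject = entity["id"]
        return [(subject, prop, value)
                for prop, value in entity.items() if prop != "id"]

class TurtleSerializer:
    """One of several interchangeable serializers."""
    def serialize(self, triples):
        return "\n".join(f'{s} {p} "{o}" .' for s, p, o in triples)

class ExportController:
    """Collects entities, runs the exporter, applies the serializer."""
    def __init__(self, exporter, serializer):
        self.exporter = exporter
        self.serializer = serializer

    def run(self, entities):
        return "\n".join(self.serializer.serialize(self.exporter.export(e))
                         for e in entities)

controller = ExportController(Exporter(), TurtleSerializer())
output = controller.run([{"id": "wd:Q1", "rdfs:label": "universe"}])
```

The point of the split is that the same controller can drive a full dump or a single-item export simply by swapping the serializer or the entity source.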
Note that this does not mean that the API cannot be used to access the data about entities, but merely that the API answers with tree-based objects, most prominently the JSON objects described here: http://meta.wikimedia.org/wiki/Wikidata/Data_model/JSON So, after this lengthy prelude, let's get to the todos that Daniel suggests: * A low-level serializer for RDF triples, with namespace support. Would be nice if it had support for different forms of output (XML, N3, etc.). I suppose we can just use an existing one, but it needs to be found and tried. Re reuse: the thing is that, to the best of my knowledge, PHP RDF packages are quite heavyweight (because they also contain parsers, not just serializers, and often enough SPARQL processors, support for blank nodes, etc.), and it is rare that they support the kind of high-throughput streaming that we would require for the complete dump (i.e. there is obviously no point in first putting all triples into a graph model and then calling the model's serialize() method; this needs too much memory). There are also some optimizations that we can apply (reordering of triples, use of namespaces, some assumptions about the whole dump, etc.). I will ask the Semantic Web
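The streaming concern raised here (never materializing the whole graph before serializing) can be illustrated with a small sketch. The prefix table and the entity generator are made-up examples; the point is only that each triple is written as soon as it is produced:

```python
# Illustrative sketch only: emit each triple as it is produced instead of
# building a full in-memory graph first, so memory use stays constant in
# the number of triples. Prefixes and data are invented examples.
import io

PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

def write_turtle(out, triples):
    """Stream Turtle: prefix declarations once, then one line per triple."""
    for prefix, uri in PREFIXES.items():
        out.write(f"@prefix {prefix}: <{uri}> .\n")
    count = 0
    for s, p, o in triples:            # any iterable, e.g. a generator
        out.write(f"{s} {p} {o} .\n")  # written immediately, not buffered
        count += 1
    return count

def entity_triples():
    # Stand-in for iterating over a whole dump without loading it at once.
    yield ("wd:Q64", "rdfs:label", '"Berlin"@en')
    yield ("wd:Q64", "rdfs:label", '"Berlin"@de')

buffer = io.StringIO()
triple_count = write_turtle(buffer, entity_triples())
```

Because `triples` can be a generator over the whole database, the same function serves both the single-item special page and the full-dump script.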
Re: [Wikidata-l] Coordinate datatype -- update
Exactly. This is about the backend and the API. The user will rather use a widget, maybe similar to this one: http://localhost/~denny_WMDE/valueparser/time.html 2013/1/17 Luca Martinelli martinellil...@gmail.com 2013/1/17 Denny Vrandečić denny.vrande...@wikimedia.de: Based on the feedback so far I have frozen the datatype for time [1] and updated the datatype for coordinates. ***CUT*** Sorry if my question appears silly, but I'll take the risk. I assume this deals with how the system recognizes the data we put in, and not with how the user puts the data into the system, am I right? In other words, will I/we be forced to use THIS way of inserting data, or will we put it in the way we know/can and then the system will recalculate it this way? -- Luca Sannita Martinelli http://it.wikipedia.org/wiki/Utente:Sannita
Re: [Wikidata-l] Coordinate datatype -- update
I heard this URL might be better than the previous on most setups: http://simia.net/valueparser/time.html 2013/1/17 Denny Vrandečić denny.vrande...@wikimedia.de Exactly. This is about the backend and the API. The user will rather use a Widget maybe similar to this one: http://localhost/~denny_WMDE/valueparser/time.html
[Wikidata-l] Is the inclusion syntax powerful enough?
Hi all, We did yet another version of the inclusion syntax (admittedly, the last one is a few months old). We decided to simplify the syntax, and its capabilities, considerably, and to rely on Lua for anything more complex than what will be possible with that syntax. I am expecting that a lot of people will not like this idea. Therefore I will start two threads: first this one, where we discuss whether the inclusion syntax is actually sufficient or whether it lacks absolutely essential features that should not depend on Lua. http://meta.wikimedia.org/wiki/Wikidata/Notes/Inclusion_syntax_v0.3 And a second thread where we discuss the details and merits of the proposal as it is, and whether there are issues with it. Cheers, Denny P.S.: yes, I learned that we should not have too many topics per thread starter. Let's see if this gets any feedback at all :)
Re: [Wikidata-l] Update to time and space model
2013/1/8 Gregor Hagedorn g.m.haged...@gmail.com ON COORDINATES: a) What you describe is more specific than a geolocation (which may be expressed by means other than coordinates). I suggest giving the data type the more specific name geocoordinates. Yep, agreed. Or just coordinates. b) With respect to precision: I don't understand the reasoning for sticking to degrees. Since we are describing locations on an ellipsoid, the longitude-to-distance and latitude-to-distance conversions are different, and they differ for different points on Earth. See the example on en.wikipedia: a minute at the equator is 1843 versus 1855 m. The model defines it as using the arc distance on the given equator. In practice the potential location error will be given as a distance. You want to convert it to degrees in a highly complex conversion. Why? The back conversion will usually be unambiguous (since the back conversion will always describe an ellipse rather than a circle). In practice the value will be given as 44°15'. Then we know it is given by the minute - and not that it is given by a nautical mile. I am not making a highly complex conversion -- I am just looking at the number and saying oh yeah, this seems to be given by the minute, and not by the second or by the degree. The reason why I prefer degrees on a given equator to meters is that it makes more sense on varying globes, like the Earth, Moon, Sun, Jupiter, and Phobos. What we need is the possibility to understand that 44°15' should not be displayed as 44°15'00.001 the next time the value is displayed. Saying it is precise to the minute allows us to do so. Making the statement in meters would actually require us to make that complex calculation, based on the given geodetic system -- which is much more complicated than the current suggestion.
c) Furthermore, as before, I believe that precision and accuracy will usually both contribute to the error you are interested in, which is typically described in geolocations by a +/- addition. I suggest replacing precision with errorradius or uncertaintyradius or uncertaintyInMeters, which would be the great-circle distance. To somewhat simplify, the unit could be fixed to m. I think precision is actually what I mean here for geocoordinates: with how much precision is the coordinate given? How many 0s after the dot need to be written? Is the minute specified or not? Is the second specified? This can be used for transforming from one geodetic system to the other, or, simpler, from degrees/minutes/seconds to decimal degrees. But then again, I don't mind calling it uncertainty or uncertaintyRadius. Here is some work done in our area (biodiversity): http://code.google.com/p/darwincore/wiki/Location The term there is http://terms.gbif.org/wiki/dwc:coordinateUncertaintyInMeters Yep, pretty much what I meant, just that I am suggesting not to use meters but something that is easier to translate into degrees. d) The correct name for globe is geodetic datum or geodetic system (which is more than the globe). See http://en.wikipedia.org/wiki/Geodetic_system or http://terms.gbif.org/wiki/dwc:geodeticDatum. WGS 84 (as a Wikidata item) is a valid geodetic datum or system. Both terms are equally correct. Globe is not correct. OK. Gregor
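Denny's "just looking at the number" heuristic for deciding whether a coordinate was given to the degree, minute, or second can be sketched as follows. The tolerance and the candidate steps are illustrative choices, not the actual Wikibase rules:

```python
# Rough sketch of the precision-guessing heuristic discussed above:
# check whether a decimal coordinate is a whole number of degrees,
# minutes, or seconds. Threshold and fallback are illustrative choices.

def guess_precision(degrees: float, eps: float = 1e-6) -> float:
    """Return the guessed precision in degrees: 1, 1/60, or 1/3600."""
    for step in (1.0, 1 / 60, 1 / 3600):  # coarsest candidate first
        units = degrees / step
        if abs(units - round(units)) < eps:
            return step
    return 1 / 3600  # finer than a second: fall back to the finest step

# 44°15' == 44.25°, a whole number of minutes but not of degrees:
minute = guess_precision(44.25)
degree = guess_precision(44.0)
```

Storing the guessed step alongside the value is what lets 44°15' be redisplayed as 44°15' rather than 44°15'00.001.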
Re: [Wikidata-l] Update to time and space model
Thanks for the pointer, Katie. I had meant to look into Max's work for a while, but failed. Now I did, and asked him many questions :) So the biggest difference is that Max uses dim to represent what we mean here by precision. And dim is related to precision in a globe-dependent way (which is fine) -- or, put differently, dim is what Gregor would prefer here, since it is measured in meters. Otherwise it looks pretty much the same. I would still prefer an arc degree of the equator of the given globe over meters, as it allows measuring any globe without knowing too many details about the globe. But otherwise it seems like the same thing. (And they can be transformed from one to the other using a simple factor.) 2013/1/8 Katie Filbert katie.filb...@wikimedia.de On Tue, Jan 8, 2013 at 1:54 PM, Nikola Smolenski smole...@eunet.rs wrote: On 08/01/13 12:36, Denny Vrandečić wrote: Location: https://meta.wikimedia.org/wiki/Wikidata/Development/Representing_values#Geolocation I'm not sure if we should be going that far, but there may be cases where longitude and latitude are known with different degrees of accuracy, so multiple precisions might be needed. I think it's worth taking a look at what MaxSem has done with the GeoData extension, which is used for mobile apps, etc.: https://www.mediawiki.org/wiki/Extension:GeoData GeoData uses globe, as that's consistent with how coordinate templates are done now on Wikipedia. I think starting simple and consistent with GeoData and the coordinate templates is good. If no globe parameter is specified in the coordinate template, then Earth is assumed (and lat/long -- WGS84). For the Moon, selenographic coordinates are assumed, and there are other reference globes for other planets and moons. http://en.wikipedia.org/wiki/Selenographic_coordinate Perhaps things can get more complex later, and having WGS84 coordinates wouldn't interfere with that.
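The "simple factor" mentioned above (converting between a precision in arc degrees and a distance in meters along a given globe's equator) can be sketched like this. The radii are approximate example values, not authoritative constants:

```python
# Illustrative sketch of converting between a precision expressed as arc
# degrees on a globe's equator and one expressed in meters. The radii
# below are approximate example values.
import math

EQUATORIAL_RADIUS_M = {
    "earth": 6_378_137.0,  # approx. WGS 84 equatorial radius
    "moon": 1_738_100.0,
}

def degrees_to_meters(degrees: float, globe: str = "earth") -> float:
    """Arc length on the globe's equator subtended by `degrees`."""
    return math.radians(degrees) * EQUATORIAL_RADIUS_M[globe]

def meters_to_degrees(meters: float, globe: str = "earth") -> float:
    """Inverse conversion: equatorial arc length back to degrees."""
    return math.degrees(meters / EQUATORIAL_RADIUS_M[globe])

# One arc minute on Earth's equator is roughly a nautical mile (~1855 m):
one_minute_m = degrees_to_meters(1 / 60)
```

The same degree value maps to a very different distance on the Moon, which is the point of keeping the stored precision in degrees rather than meters.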
Cheers, Katie -- Katie Filbert Wikidata Developer Wikimedia Germany e.V. | NEW: Obentrautstr. 72 | 10963 Berlin Phone (030) 219 158 26-0 http://wikimedia.de