Re: [Wikidata-l] Tree Of Life
Thanks Lydia, agreed, the tool absolutely is useful and many thanks to you, Lucie and your team, for providing it However, it is difficult for me to say what needs to be resolved on the tools level and what on Wikidata content. What I observe looks to me (but I may be wrong) a mashup-error occurring when uncritically combining data from different sources, each source being internally consistent and correct. This may be something that needs to be addressed by the tool. The problem with different labels and singular/plural may or may not be a tool problem in choosing the correct label One problem I have when using the tool to try to understand what is going on: It should variously wikidata and wikipedia pages, making it somewhat difficult to follow what goes on. E.g. Biota is a wikidata page, bacteria is a wikipedia page... So again thanks for doing this great work! gregor On 18 December 2014 at 20:52, Lydia Pintscher lydia.pintsc...@wikimedia.de wrote: On Thu, Dec 18, 2014 at 8:38 PM, Gregor Hagedorn g.m.haged...@gmail.com wrote: https://tools.wmflabs.org/tree-of-life/ is problematic at first looks. Bacteria, prokaryotes, monera and eukaryoates as sister groups on the same level? Prokaryotes contain as only subtaxon Archaea (but no bacteria)? Also, the mixed use of scientific and common names (variously as singular or plural) is rather confusing. sorry for the critique... Hey Gregor, That's exactly why Lucie made the tool ;-) It is only a reflection of the data in Wikidata. So if it is wrong in the tree it is wrong in Wikidata and should be fixed. Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. 
Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985. ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- --- Dr. Gregor Hagedorn Head of Digital World and Information Science Museum für Naturkunde Berlin Leibniz-Institut für Evolutions- und Biodiversitätsforschung Invalidenstrasse 43, 10115 Berlin +49 (0)30 2093 8576 (work) +49-(0)30-831 5785 (private) gregor.haged...@mfn-berlin.de http://www.naturkundemuseum-berlin.de http://linkedin.com/in/gregorhagedorn This communication, together with any attachments, is intended only for the person(s) to whom it is addressed. Redistributing or publishing it without permission may be a violation of copyright or privacy rights. Halloween = Christmas? 31 Oct = 25 Dec! ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Tree Of Life
may be wrong) a mashup-error occurring when uncritically combining data from different sources, each source being internally consistent and correct. This may be something that needs to be addressed by the tool. Good point, yes. Any ideas how? Perhaps just help understanding this better, by displaying the sources for the relations shown The problem with different labels and singular/plural may or may not be a tool problem in choosing the correct label It just takes the label of the item. There is a taxon name property. Would that be a useful alternative? Input from the experts needed. That seems helpful. One way would be to always show taxon name first, followed by the - at the moment perhaps inconsistent - language label in Parentheses Right. It shows the Wikipedia article if there is one in the language you selected and otherwise Wikidata. Better to always show Wikidata? perhaps iframe with the mobile-view as it is PLUS 2 normal (quick-) links below, allowing to open the normal-view pages to conveniently analyse/ read etc. ... not sure. Situation at the moment is that when I tried to understand the source of the mashup error, I could not do it with the information or links displays thanks again gregor ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Tree Of Life
https://tools.wmflabs.org/tree-of-life/ is problematic at first looks. Bacteria, prokaryotes, monera and eukaryoates as sister groups on the same level? Prokaryotes contain as only subtaxon Archaea (but no bacteria)? Also, the mixed use of scientific and common names (variously as singular or plural) is rather confusing. sorry for the critique... gregor On 18 December 2014 at 17:35, Lucie Kaffee lucie.kaf...@wikimedia.de wrote: Hey, Since the Tree of Life by Denny is outdated, we thought it was a nice idea, to have a new one, to have an overview over the biological taxonomy on wikidata. Not only to have a nice looking tree but also to see, where errors are and to correct and update it. Right now, there are 660775 Items in the tree. Even though the change of names can be seen instantly, because this is based on the API, changes in the order need an update of the whole tree, because it's based on wikidata dumps. (This one on the most recent one from 15.12.2014) Here you go, this is the new tree of life, made with a lot of love: https://tools.wmflabs.org/tree-of-life/ If you have any corrections, additions or features you want to add, feel free to ping me, send me a mail, submit a patch or file an issue on github. The repo for the tree is on https://github.com/frimelle/tree-of-life Cheers, Lucie (frimelle) ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l -- - Dr. G. Hagedorn +49 (0)30 2093 8576 (work) +49-(0)30-831 5785 (private) gregor.haged...@mfn-berlin.de http://www.linkedin.com/in/gregorhagedorn This communication, together with any attachments, is made entirely on my own behalf and in no way should be deemed to express official positions of my employer. It is intended only for the person(s) to whom it is addressed. Redistributing or publishing it without permission may be a violation of copyright or privacy rights. 
Re: [Wikidata-l] entity vs Special:EntityData
But the problem seems to be that http://www.wikidata.org/wiki/Q1000 actually seems to be an information resource, i.e. HTML content is returned directly under this URI (rather than via an HTTP 303 redirect). Also, checking with http://validator.linkeddata.org/vapour I received an error about an invalid response (I am not sure whether this is a problem with Vapour or Wikidata ...) Gregor
Re: [Wikidata-l] Page history and properties
This is great, but the solution I saw (i.e. {{#property:population|current-value=30900}}) makes the whole Wikidata absolutely useless.

I asked Luca back about this, and perhaps one point is that the term "current" is too easily misunderstood. The point is not that Wikidata should have such a property, but that the value at the time of saving a past version of a Wikipedia page is preserved. Perhaps {{#property:population|value-when-saving-page=30900}} would be less easily misunderstood. Nothing in Wikidata would be made useless by this: it would work exactly like now when calling the current page, and it would only work differently when calling the cite-this-version-of-a-page links. And it would allow a Wikipedia community to structure their work such that Wikipedia editors can still curate and see changes to the values.

However, as said above, this is just an example of a solution. It is safe, has very small processing overhead and small storage overhead, and scales well under load. A more elegant solution would clearly be to do two things:

a) When creating a Wikipedia diff on the Wikipedia page version history, either show directly, or link to, a Wikidata property diff (reduced to the relevant parts as outlined in an earlier mail) in addition to the wikitext diff of the page. Note that it is not necessary to merge all Wikidata versions into the Wikipedia version. When comparing two arbitrary Wikipedia page versions, it is irrelevant whether one or multiple Wikidata changes are included; all corresponding changes should be shown on request. The only necessary item is a single Wikidata indicator (operating like a special version line) on top for cases where Wikidata properties are changed after the last Wikipedia edit.

b) Expand the property function such that for all calls of specific (citable) page versions, it retrieves the property rendering at that point in time from Wikidata.

Gregor
Re: [Wikidata-l] Page history and properties
On 4 April 2013 22:23, Michael Hale hale.michael...@live.com wrote: trench, then I just want to update it on Wikidata, and then every article that references it will be updated. I don't want to have to update it on Wikidata and then go do a null edit on every article that uses that information. You are correct, the current version would have to be an exception, and display under the current time rules just as in the implementation. My proposal only makes sense when versions from the history are being displayed. Gregor ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Page history and properties
My concern is that the Wikidata editors, those not with random editing behavior, but those who are curators/caretakers of specific pages, experience a disempowerment, because they lose control. I view the decision to inform about Wikidata changes only in the short-lived recentchanges, but not in the page history, as problematic. Page editors will now be informed that the page has changes, but this change is not recorded in the page history, and it cannot be seen in the version diffs. This is breaking a lot of assumptions of trust. Wikipedia can be collaborative because of this trust in the versioning system and because of the reasonable accessibility of the version diffs (transparency).

Some editors will probably leave the Wikipedia project due to the introduction of Wikidata, no matter how much Wikidata reaches out to them. I feel that the number is much higher in the present disempowering implementation, which is why I try to argue here for making content changes that come from Wikidata and affect Wikipedia pages transparent on Wikipedia, not only on Wikidata.

This discussion is about proposing potential elements and ideas; there may be much better ideas. I am not convinced by the arguments against the proposed means: I fear the thinking is a programmer's thinking, not a content editor's thinking. Denny, I feel that your proposal of some HTML-version archiving somewhere, which is not integrated into the Wikipedia editing workflow, does not take sufficient care of the needs of the editors, especially the need to be able to use the version comparison, not just find rendered versions somewhere in isolation.

But neither of us can see into the future. I think Wikidata is a great achievement as it is, and we all agree that it can be made better by better integration into existing Wikipedia workflows. Let us focus on the importance of this and try to find the best means that are achievable with existing resources.

Gregor
Re: [Wikidata-l] Page history and properties
On 5 April 2013 20:05, Michael Hale hale.michael...@live.com wrote:

The thing to remember is that the history of a page is the history of the wiki markup for the page, not the history of the rendered HTML. It would be misleading if edits were shown in the markup history for an article each time a template or Wikidata item changed because reverting the markup to that version wouldn't actually revert the change. I think what curators with specific specialties want is the ability to automatically expand their watchlist to include all templates and data items that could affect their watched pages. Then a way to view the merged watchlists from multiple projects would be helpful. There is room for improvement in global account integration. For example, I just noticed that I need to set my timezone on Wikipedia and Wikidata independently.

I partly agree. The ideal situation is that a) changes of Wikidata (and perhaps templates, and perhaps images, with decreasing necessity in practice) show in the page history, and b) in the diff, such changes are shown separately from the changes of the wikitext itself, but with the same action. This can be achieved by showing the affected changes after a separation line below the wikitext diff. However, since this was rejected previously as undoable, the expansion of {{#property: to include the current value would be a work-around.

We perhaps disagree about the priorities. I believe Wikipedia editors are not primarily keen on the technical definition of the diff as the changes of the wikitext of the database. I believe they want and need transparency about when and who changed a specific topic they care about.

Gregor
Re: [Wikidata-l] Page history and properties
On 5 April 2013 21:41, Michael Hale hale.michael...@live.com wrote:

Well, I could make a view that shows the diff of a Wikipedia article stacked on top of the diff for the corresponding Wikidata item on my computer in a few minutes. But diffs can be very long sometimes, so there would be a lot of discussion about whether that view is more appropriate than just making it easier to find the link to the Wikidata item on the diff and edit pages for articles that include Wikidata properties. But the work-around you suggest is not a work-around. If the U.S. article currently says: The population of the U.S. is 309,000,000 people. And I change it to say The population of the U.S. is {{#property:population|current-value=30900}} people. Which scenarios does that improve?

It makes changes visible in the present wikitext diff and allows rendering the past version easily to a correct value. It is not meant as a perfect solution to all, but as an option for an editing community to enable them to choose between keeping the information visible and traceable in the Wikidata diff. I agree about the disadvantages of clutter, but the point is that it does allow a community to choose that they don't like clutter and don't need the history and diffs, and simply create the property functions inside the templates (with no tracking). This would indeed show the changes in the present diff, and it would allow, several years into the future, to reasonably understand (in wikitext) and render (as HTML) previous states of Wikipedia articles several years into the past.

However, better solutions are certainly possible. Based on what you write I can imagine an editing workflow diff similar to the stacking of diffs, but actually reduced to a link pointing to Wikidata. The features I view as important are:

1. In the history, Wikidata changes for a topic are made visible during the display of Wikipedia wikitext changes. Ideally multiple Wikidata edits could be merged into a single line if no wikitext changes occur in between. There could be options to hide Wikidata changes. The how is not so important, but I think the present watchlist implementation is insufficient, because it generates an attention message, but makes it hard to follow up (which usually occurs in the page history + diff).

2. In the actual diff, instead of stacking the wikitext + Wikidata property diffs, I could imagine a solution that at the bottom says: "Associated Wikidata properties changed in the chosen period." Here the chosen period is the period chosen for the diff by the editor, and the whole is linked to a Wikidata diff. Present Wikipedia communication practice heavily relies on pointing to specific history diffs (through links), but currently Wikidata changes are completely invisible there. By automatically linking them in, present practice could continue smoothly and without disruption.

3. Ideally, the Wikidata diff link should have an option to hide the Wikidata internal properties like changes in item labels, and should show language-specific changes only for a specific language, to keep the attention of content editors focussed on the relevant changes (most likely Wikidata will still show more properties than those used on the Wikipedia page, but this could be acceptable).

The question of rendering the HTML for past versions is separate. You seem to say that it is already easy to write the #property: function such that it takes into account the edit timestamp of a Wikipedia page version and evaluates the property as it was at that point in time (with the current version, i.e. when called without a pageid, always evaluating to now(), not the last editing timestamp).

ASIDE: I don't worry too much if a property is being referenced by name or ID in a past version and has since been deleted; the resulting error message is transparent rather than misleading (which the display of wrong information is).

Gregor
Re: [Wikidata-l] Page history and properties
On 5 April 2013 23:19, Michael Hale hale.michael...@live.com wrote: So you agree that it is more important to reduce clutter than to add functionality that very few people use? No, I strongly disagree with this. I think the functionality of being able to curate the page Wikipedia editors care for should have highest priority. I believe it is very important to make Wikidata palatable to the people Wikipedia depends upon. Reducing clutter is nice, but avoiding the loss of transparency and trust is essential.
Re: [Wikidata-l] Page history and properties
On 5 April 2013 23:53, Michael Hale hale.michael...@live.com wrote: But you do agree that it is easier to curate articles by updating one value in a database than updating the value separately everywhere it appears? Absolutely. But in my experience Wikipedia editors care about the product of a readable, intelligible, correct encyclopedic article that others enjoy reading. People who are able to care about individual properties in Wikidata are rare. Wikidata needs the coupling between Wikipedia editors and Wikidata curation. The editors should be supported, not alienated by giving them the feeling that it becomes unmanageable for them to follow the changes (because of workflow separation, or because of too many insignificant changes, like label changes in any number of languages that the average editor is unable to read). I view support of the Wikipedia editor workflow, for which the implemented change notification in recentchanges/watchlist is an important first step, with support of the change transparency discussed here as a second step, as an important piece in the whole puzzle. (Not as the only important thing, don't get me wrong :-) ) Gregor
Re: [Wikidata-l] Page history and properties
when templates (or, in the case of wikidata, properties) get deleted or renamed. Nobody has come up with a good solution yet. I think we did discuss a simple, working solution: Saving the value together with the Wikipedia page. The major argument against that was: it is a waste of storage to create a new Wikipedia page (perhaps daily) when property values included in a page are changed in Wikidata. I personally value trust and documentation of change much higher than disk storage, but even then, there are ways to balance this. So perhaps a modified proposal that matches the current development stage: If an editor saves a page with {{#property:population}} the parser looks up the current value and changes this to: {{#property:population|current value=2348732}} and stores this wikitext version in the Wikipedia. The same would apply to updating, saving {{#property:population|current value=2348732}} may result in {{#property:population|current value=2348700}} being saved. This would mean no additional waste of storage for articles that are regularly changed. For those that are not, one could imagine a bot-based monthly update check to make past knowledge transparent. I realize that this would require a pattern, where the Wikidata-derived values would remain editable on the topic/article pages, i.e. the property function would have to be inserted in the template call, rather than in the template definition. Those wikidata properties automatically called inside templates with a dynamic item decided by the current template call would not be preserved. However, both editing patterns would be available and it would be up to the community of each Wikipedia to choose the preferred one. (As I said previously: although similar to the issue of commons images and templates, the issue at stake for Wikidata is different. 
Because of the problems in preserving a transparent editing history, updates to commons images are generally restricted to truly minor improvements (contrast, cropping, better resolution, etc.). I am not aware of cases, where commons images regularly are replaced with updated content that is different in substance and thus automatically changes all Wikipedia pages, representing different knowledge. I don't want to exclude this, but even for changing company logos the usual solution is to create a new name, preserving the old logo. Similarly, templates may fail to work in old versions (big problem!), but I am not aware that a template would render out-of-time information when viewing a past revision. Thus, the problem of Wikidata with respect to endangering the trust basis of Wikipedia, the version system, is related, but different). Gregor ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
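The save-time rewriting proposed in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: the `current value` parameter is the hypothetical syntax from the proposal (not an existing parser function feature), and `CURRENT_VALUES` is a made-up stand-in for a real Wikidata lookup.

```python
import re

# Hypothetical stand-in for a Wikidata lookup; the value is illustrative only.
CURRENT_VALUES = {"population": "2348700"}

# Matches {{#property:name}} and {{#property:name|current value=...}}.
# Calls with any other parameter are deliberately left untouched.
PROPERTY_CALL = re.compile(
    r"\{\{#property:([^|}]+)(?:\|current value=[^}]*)?\}\}"
)


def pin_current_values(wikitext, lookup):
    """Rewrite property calls so they carry the value known at save time."""

    def repl(match):
        name = match.group(1).strip()
        value = lookup.get(name)
        if value is None:
            # Unknown property: keep the call exactly as the editor wrote it.
            return match.group(0)
        return "{{#property:%s|current value=%s}}" % (name, value)

    return PROPERTY_CALL.sub(repl, wikitext)


if __name__ == "__main__":
    print(pin_current_values("Population: {{#property:population}}", CURRENT_VALUES))
```

Because the value ends up in the stored wikitext, an ordinary wikitext diff between two revisions would show the change, which is the transparency the proposal is after.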
Re: [Wikidata-l] Page history and properties
don't see what value we'd gain from storing that extra metadata. Every scenario I can think of where you care about past states of the database is already handled by the compare selected revisions feature. If that is so simple, can the {{#property:xxx}} call in a Wikipedia simply resolve to the revision that was valid at the point in time equivalent to a given revision? It seems like you say you already have the code to do that when creating the Wikidata item description. I disagree that this is an issue for MediaWiki core, since it is a question of how the Wikidata-specific property function works. Gregor PS: I admit that Denny has found an example where an image seems to be changing in content on Commons, but I still believe this is a rare case. Is there any wiki-statistician who can supply exact numbers for these cases?
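The lookup asked about here (resolving a property to the value that was valid when a given page revision was saved) can be sketched as a search over a timestamped value history. This is a minimal sketch, not how the Wikidata client actually works; the history data and the function name are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def value_at(history, when):
    """Return the value that was valid at `when`.

    `history` is a list of (timestamp, value) pairs sorted by timestamp,
    one pair per change of the property. Returns None if the property
    had no value yet at that time.
    """
    timestamps = [ts for ts, _ in history]
    index = bisect_right(timestamps, when)
    if index == 0:
        return None
    return history[index - 1][1]


# Illustrative history only; dates and numbers are made up.
population_history = [
    (datetime(2012, 11, 1), "2348732"),
    (datetime(2013, 3, 15), "2348700"),
]

if __name__ == "__main__":
    # Rendering an old page revision saved on 1 January 2013:
    print(value_at(population_history, datetime(2013, 1, 1)))
```

When rendering the current version of a page, the same function would simply be called with the present time, matching the exception for the current version discussed earlier in the thread.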
Re: [Wikidata-l] some good news about the future of Wikidata
2013/2/20 Lydia Pintscher lydia.pintsc...@wikimedia.de: http://blog.wikimedia.de/2013/02/20/the-future-of-wikidata/ Very good, congratulations! With great respect for your work, Gregor ___ Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Re: [Wikidata-l] Fwd: Todos for RDF export
Some of our insights into the SMW RDF export (which we found to be difficult to configure and use): 1. Probably most relevant: total lack of support for xml:lang, which would have been essential to our purposes. Wikidata should be planned with support for language in mind. 2. We also found that we had serious problems with managing structure, e.g. record and subobject. Due to the need to obtain this information recursively by repeated calls, and because there is no control on the URI created for these calls, some easy solutions like applying clean-up xslt will not work. This may not be relevant for wikidata. 3. At first the lack of variable datatype (datatype is fixed per property) is acceptable. However, we found this a major problem with respect to the forced distinction between datatype:wiki-page and datatype:global URI properties. Essentially, SMW forces one to introduce for a semantic property (e.g. dc:creator) two distinct dummy properties: property:creator_page and property:creator_uri. Since in RDF export the artificial distinction between pages and URIs disappears, it would be desirable to merge them, but only one of them can be set to an imported vocabulary. I think this may be relevant to wikidata, where a similar distinction between properties pointing to a local wikidata item and a global resource exists. Gregor (PS: If any of the problems above in reality does not exist in SMW and we simply overlooked the solution, I am very happy for corrections, of course!) -- - Dr. G. Hagedorn +49-(0)30-8304 2220 (work) +49-(0)30-831 5785 (private) http://www.linkedin.com/in/gregorhagedorn https://profiles.google.com/g.m.hagedorn/about This communication, together with any attachments, is made entirely on my own behalf and in no way should be deemed to express official positions of my employer. It is intended only for the person(s) to whom it is addressed. Redistributing or publishing it without permission may be a violation of copyright or privacy rights. 
Re: [Wikidata-l] Coordinate datatype -- update
In order not to lose the Dim-data that is already available from the Wikipedias, and to use this for scaling. It should really only describe the rough dimension. I would expect that a building would still have something like area or similar in its own property. Dimension is used for scaling and uncertainty. Dimension of a building, locality, etc. is well understood: it is the size of the location, without respect to shape. Location uncertainty is well understood. Confounding the two seems to make it impossible to interpret the information afterwards in many downstream processing cases. I would plead to support both dimension and uncertainty, or neither. --gregor
Re: [Wikidata-l] Is the inclusion syntax powerful enough?
I like it. For multilingual wikis like Commons, a presumably fairly simple extension might be valuable: to support, besides #property, also #propertylabel and #itemlabel, only with the id and of parameters. I think this does not really add much complexity. Gregor
Re: [Wikidata-l] Update to time and space model
ON COORDINATES:

a) What you describe is more specific than a geolocation (which may be expressed by other means than coordinates). I suggest giving the data type the more specific name: geocoordinates.

b) With respect to precision: I don't understand the reasoning for tying this to degrees. Since we are describing locations on an ellipsoid, the longitude-to-distance and latitude-to-distance conversions are different, and they are different for different points on earth. See the example on en.wikipedia: a minute at the equator is 1843 versus 1855 m. In practice the potential location error will be given in a distance measure. You want to convert it to degrees in a highly complex conversion. Why? The back conversion will usually be non-ambiguous (since the back conversion will always describe an ellipse rather than a circle).

c) Furthermore, as before, I believe that precision and accuracy will usually both contribute to the error you are interested in, which is typically described in geolocations as a +/- addition. I suggest replacing precision with errorradius or uncertaintyradius or uncertaintyInMeters, which would be the great circle distance. To simplify somewhat, the unit could be fixed to m. Here is some work done in our area (biodiversity): http://code.google.com/p/darwincore/wiki/Location The term there is http://terms.gbif.org/wiki/dwc:coordinateUncertaintyInMeters

d) The correct name for globe is Geodetic datum or geodetic system (which is more than the globe). See http://en.wikipedia.org/wiki/Geodetic_system or http://terms.gbif.org/wiki/dwc:geodeticDatum. WGS 84 (as a wikidata item) is a valid geodetic datum or system. Both terms are equally correct. Globe is not correct.

Gregor
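To illustrate point b) above, here is a minimal sketch of how the length of one arc minute depends on direction and latitude, assuming the WGS 84 ellipsoid and the standard radius-of-curvature formulas. The function names are illustrative and not part of any Wikidata proposal.

```python
import math

# WGS 84 ellipsoid constants (semi-major axis in metres, first eccentricity squared).
WGS84_A = 6378137.0
WGS84_E2 = 0.00669437999014


def arc_minute_lengths(latitude_deg):
    """Return (metres per minute of latitude, metres per minute of longitude)."""
    phi = math.radians(latitude_deg)
    w = 1.0 - WGS84_E2 * math.sin(phi) ** 2
    meridional_radius = WGS84_A * (1.0 - WGS84_E2) / w ** 1.5
    prime_vertical_radius = WGS84_A / math.sqrt(w)
    one_minute = math.radians(1.0 / 60.0)
    lat_m = meridional_radius * one_minute
    lon_m = prime_vertical_radius * math.cos(phi) * one_minute
    return lat_m, lon_m


def radius_to_degrees(radius_m, latitude_deg):
    """Convert an uncertainty radius in metres to (degrees of latitude, degrees of longitude)."""
    lat_m, lon_m = arc_minute_lengths(latitude_deg)
    return radius_m / lat_m / 60.0, radius_m / lon_m / 60.0


if __name__ == "__main__":
    for latitude in (0.0, 45.0, 60.0):
        print(latitude, arc_minute_lengths(latitude), radius_to_degrees(300.0, latitude))
```

At the equator this gives roughly 1843 m and 1855 m, the figures quoted above; at 60° latitude a minute of longitude is only about half as long, so a single uncertainty radius in metres maps to two different spans in degrees.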
Re: [Wikidata-l] Update to time and space model
geocoordinates Yep, agreed. Or just coordinates. yes, probably better without a geo if it shall work for moon or mars as well. However, http://en.wikipedia.org/wiki/Coordinate_system is far broader term. But I cannot find a correct superclass term for Geographic/Selenographic/Martiographic(?) coordinates. http://en.wikipedia.org/wiki/Spherical_coordinate_system may be close, but its beyond my competence to decide. Coordinates should be pragmatic enough. Or LocationCoordinates. In practice the value will be given as 44°15'. Then we know it is by the minute - and not that it is given by a nautical mile. I am not making a highly complex conversion -- I am just looking at the number and saying oh yeah, this seems to be given by the minute, and not by the second or by the degree. The reason why I prefer degrees on a given equator to meters is that it makes more sense on varying globes, like the Earth, Moon, Sun, Jupiter, and Phobos. What we need is the possibility to understand that 44°15' should not be displayed as 44°15'00.001 the next time the value is displayed. And by saying it is correct by the minute allows us to do so. Making the statement in meters would actually require us to make that complex calculation which would be based on the given geodetic system -- which is much more complicated than the current suggestion. you try to solve the problem of reproducing the precision of the number as entered. However, the proposed mechanism is a mechanism of uncertainty, which is far more general, able to express the uncertainty radius that is due to e.g. specific GPS technologies. When reading the proposal, I did not even understand your narrower intention in your proposal. I believe it does not work to simple use a an equator based distance-in-m to degree conversion. See http://commons.wikimedia.org/wiki/Commons:Geocoding#Precision for examples how this changes in moderate latitudes, not to speak of being near the pole. 
My Conclusions: a) the model may be able to express the number of digits in degree-minute-second, decimal degrees, and degree-decimal-minutes. I believe, however, that it is yet underdefined. The value in precision necessary to specify whether a decimal-degree-stored-value is to be reproduced as 44°15', 44°15'15'', 44°15'15'.15', 44°15'15'.15', 44°15.1515, or , 44.151515 ° (which are different example values of course, not just different precision) is unclear to me. Latitude and longitude to 4-5 significant digits mean different precision in meters, but it is customary to give the same precision. b) the model may unduly suggests it can be used for arbitrary reasons of precision. However, it can not ALSO capture imprecision or uncertainty expressed as 00°15′00″S 78°35′00″W +/- 300 m, since this requires a conversion which is different for longitude and latitude and longitudes at different latitudes. That is an geocoordinate with an explicit +/- xx m uncertainty cannot be entered in wikidata. This is an acceptable limitation, but it should be understood and clearly stated. In a later mail Denny writes I still would prefer Arcdegree of the equator of the given globe over Meter, as it allows to measure any globe without having too much details about the globe. but otherwise it seems like the same things. (And they can be transformed from one to the other using a simple factor). I think this is not correct. There is no general and simple convertibility between error radius as distance and number of significant digits in degrees. c) if the goal is to store the number of significant digits/figures, I suggest to store this more directly, although I admit that in the presence of different representations (decimal degrees, DMS, etc.) this is not trivial. d) the correct name for globe is Geodetic datum or geodetic system (which is more than the globe). See http://en.wikipedia.org/wiki/Geodetic_system or http://terms.gbif.org/wiki/dwc:geodeticDatum. 
WGS 84 (as a Wikidata item) is a valid geodetic datum or system. Both terms are equally correct. If it is to be applicable outside Earth, geodetic datum/system may actually be too narrow; I did not think of that. I don't know the correct superterm then. Maybe just call it system, and explain that for Earth it defines the geodetic or similar datum or system, and for other celestial bodies their analogues.

I am rather certain that Wikidata does not need to add a further parameter for Earth, Moon, etc. as Jeroen suggests. I suggest adding to the documentation: The Geodetic or Spatial reference system must be chosen in such a way that it automatically implies the celestial body to which the coordinates apply (Earth, Moon, Mars, Venus, Sun, etc.). I almost believe that this will always be the case, since these systems must define the shape of the ellipsoid, which is different for different celestial bodies.

Gregor
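The objection that an uncertainty radius in meters does not map to degrees by a single factor can be illustrated with a rough sketch. It assumes a spherical Earth with a mean radius of 6371 km, which is a simplification and not a real geodetic datum; the function names are made up for this illustration.

```python
import math

# Assumption: spherical Earth with mean radius; a real datum uses an ellipsoid.
EARTH_RADIUS_M = 6_371_000.0


def metres_per_degree(lat_deg):
    """Return (metres per degree of latitude, metres per degree of longitude)."""
    per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
    # A degree of longitude shrinks with the cosine of the latitude.
    per_deg_lon = per_deg_lat * math.cos(math.radians(lat_deg))
    return per_deg_lat, per_deg_lon


def uncertainty_in_degrees(radius_m, lat_deg):
    """Convert an uncertainty radius in metres to (latitude, longitude) degrees."""
    per_lat, per_lon = metres_per_degree(lat_deg)
    return radius_m / per_lat, radius_m / per_lon


for lat in (0.0, 45.0, 60.0, 80.0):
    dlat, dlon = uncertainty_in_degrees(300.0, lat)
    print(f"lat {lat:4.1f}: +/- {dlat:.5f} deg latitude, +/- {dlon:.5f} deg longitude")
```

Under these assumptions the same 300 m corresponds to roughly twice as many degrees of longitude at 60° latitude as at the equator, so no single factor covers both axes.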
Re: [Wikidata-l] Data values
Hm the second one is only relevant for output.

I think this is a fundamental misunderstanding: the original one is not for output, but is the primary value for interpretation, for understanding whether a value in Wikidata is correct or fake, or a software conversion error, or what. If I want to learn something in Wikipedia, I have to have access to this information. Having access to it, I can understand whether two seemingly different values from the same source are justified or an error (as in the example from the previous mail, where 100 +/- 50 and 100 +/- 0.1 can both be valid for the same quantity and the same source observations).

I view the normalized, converted version, produced roughly and unreliably with lots of heuristics, as the secondary one. It has its uses, but I would in fact put it second and show it only with a large warning banner saying that this version contains lots of unwarranted assumptions which may or may not hold.

But I don't care which is primary or secondary; I only want to encourage you not to forget the data in Wikidata over implementing the essential search, retrieval, conversion, etc. functionality. :-)

In an ideal world all data would be in a fully convertible state and no one would simply use significant digits to express margins of error, reliability, tolerance etc. But I have not encountered this world yet.

Why not using the Term outputformat as a pattern just like Excel, OpenOffice, and LibreOffice do? This could include the number of digits behind the comma, the optional accuracy/whatever and the unit. This will be fine for the API, and the MW-Syntax.

I don't care how the information is encoded; if you develop your own language to encode information in a string and provide a syntax for that, that is fine. However, even within Microsoft products the formatting strings are only similar, not fully compatible, and I have doubts that this is a good way to achieve global interoperability.
Gregor
Re: [Wikidata-l] Data values
I don't like significant digits because it depends on the writing system (base 10). I'd much rather express this as absolute values.

Yes, I would like that too. What I argue is that the problem is that in 99.9 % of cases (not a researched number, of course) you simply don't know more than that there is a given number of digits in base 10. Whether that is meaningful or just sloppy or even a wilful simplification (probably the vast majority of quantities in current Wikipedia belong to the latter category) is unknown.

That means that the figure is not usable for query answering at all. If we don't know the level of certainty, we cannot use the number.

That will usually be the case. Unless you know which kind of margin the numbers reflect, you cannot use it for answering anyway. What do you do with the two examples 100 +/- 50 and 100 +/- 0.1, which are the results of the same dataset and precisely reflect the same quantity? If you know that the first is a 95% measure of dispersion, and the second a 95% CI for the mean, you can ask people whether they are looking for the mean (best estimate) or for a single observation.

Make the interval-points an option. If explicitly entered: excellent information. If not: don't try to create (false) knowledge from void.

Yes, it will be an option. Making the default unknown would be bad though, I think.

The default has to reflect reality. If you make it complicated to enter the actual default situation, and automatically add a margin of error/dispersion/tolerance/whatever, then people will simply allow it to happen, start ignoring it, not understand it, and in the end Wikidata will be known as a bunch of unreliably encoded information.

However, we should probably store whether the level of certainty was given explicitly or estimated automatically based on the number of significant digits - then we can still ignore automatic values when desired.
Which will force all re-users to understand this and to throw away these values prior to any analysis... Why so complicated?

Gregor
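The 100 +/- 50 versus 100 +/- 0.1 example from this message can be reproduced numerically. The sketch below assumes normally distributed observations and the usual 1.96 multiplier for 95 %; the sample size of 250 000 is chosen only so that the numbers match the example, and is not taken from the thread.

```python
import math

Z_95 = 1.96  # normal-approximation multiplier for 95 % (assumption)


def dispersion_half_width(sd):
    """Half-width of the range expected to hold 95 % of single observations."""
    return Z_95 * sd


def mean_ci_half_width(sd, n):
    """Half-width of the 95 % confidence interval for the mean of n observations."""
    return Z_95 * sd / math.sqrt(n)


sd = 50 / Z_95   # standard deviation that yields +/- 50 as dispersion
n = 250_000      # hypothetical sample size

print(f"single observation: 100 +/- {dispersion_half_width(sd):.1f}")
print(f"mean (best estimate): 100 +/- {mean_ci_half_width(sd, n):.1f}")
```

Both margins come from the same data, which is why a bare "+/-" without a stated meaning cannot be interpreted by a query system.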
Re: [Wikidata-l] Data values
On 21 December 2012 19:36, jmccl...@hypergrove.com wrote:

The xsd:minInclusive, xsd:maxInclusive, xsd:minExclusive and xsd:maxExclusive facets are absolute expressions not relative +/- expressions, in order to accommodate fast queries. These four facets permit specification of ranges with an unspecified median and ranges with a specified mode, inclusive or exclusive of endpoints, a six-fer. For these reasons I believe the XSD approach is superior for specifying value set when compared to storing the dispersion factors themselves, eg the 3 of +/- 3.

Yes, provided they are actually tied to the semantics of minimum and maximum, which the xsd examples are. As long as the semantics of the proposed value bracketing in Wikidata are unknown, their use is questionable if not impossible. If I know something is plus/minus 2 s.d., or plus/minus 2 s.e., or the 10 to 90 % percentile range ... then I can again use them to the benefit of the query system. But not without.

Gregor
Re: [Wikidata-l] Data values
I believe there are a lot of dangerous assumptions on http://simia.net/valueparser/

First: there is no indication in a number that it is _not_ endlessly precise. Apostles = 12 has no uncertainty; representing it as 12 ± 1 is wrong, but 12 ± 0.5 is wrong as well.

The same applies to a number like 12.2. The data source and author MAY desire to express significant digits, but we simply don't know. Wikidata should keep this at the "don't know" level and not force-convert a number of unknown measurement precision to a number with explicitly stated (but potentially totally wrong) precision or accuracy limits.

For example, in science it is quite common to give light-microscopic measurements to one decimal place below the micrometer, even though the precision is 0.2 µm. The latter is simply known and therefore not constantly repeated, unless specific circumstances justify this.

As discussed above: plus/minus 1 s.d. does not give you a confidence interval for the mean, it gives you a measure of dispersion.

-

My proposal: make the default "plus-minus values unknown, only significant digits known". The interpretation of significant digits is not machine-available unless qualifiers say so. It can however be used to arrive at an estimate of significant digits after conversion.

Make the interval-points an option. If explicitly entered: excellent information. If not: don't try to create (false) knowledge from void.

Gregor
Re: [Wikidata-l] Data values
In addition to a storage option of the desired unit prefix (this may be considered an original-prefix, since naturally re-users may wish to reformat this).

I see no point in storing the unit used for input.

I think you plan to store the unit (which would be meter), so you don't want to store prefixes, correct? Please argue why you don't see a point. Do you want to give the size of the universe, the distance to New York, and the size of the proton all in meters? If not, with which algorithm will you restore the SI prefix, or rather, recognize which SI prefix is usable? We do not use Mm in common language, so we give the circumference of the earth as roughly 40 000 km and not as 40 Mm. We don't write 4*10^7 m either.

it is probably necessary to store the number of significant decimals.

That's how Denny proposed to calculate the default accuracy. If the accuracy is given by a complex model (e.g. a gamma distribution), then it might be handy to have a simple value that tells us the significant digits. Hm... perhaps it's best to always express accuracy as +/-n, and allow for more detailed information (standard deviation, whatever) as *additional* information about the accuracy (could be modelled as a qualifier internally).

I fear that these are two separate levels of precision in giving a measure of measurement _precision_ (I believe accuracy is the wrong term here; precision and accuracy are related but distinct concepts).

So 4.10 means that the last digit is significant, i.e. the best estimate is at least between 4.095 and 4.105 (but it may be better).

4.10 +/- 0.005 means it is precisely between 4.095 and 4.105, as opposed to 4.10 +/- 0.004, 4.10 +/- 0.003, 4.10 +/- 0.002 etc.

Furthermore, a quantity may be given as 4.10-4.20-4.35. The precision of measurement and the measure of variance and dispersion are separate concepts.

I believe in the user interface this needs not be any visible setting, simply the number of digits can be preserved.
Without these it is impossible to store and reproduce information like 10.20 nm, it would be returned as 1.02 10^-8 m.

No, it would return using whatever system of measurement the user has selected in their preferences.

Then you have lost the information. There is no user selection in this in science.

Complex heuristic may guess when to use the scientific SI prefixes instead. The trailing zero cannot be reproduced however when completely relying on IEEE floating-point.

We'll need heuristics to pick the correct secondary unit (e.g. nm or km). The

(I believe there is no such thing as a secondary unit, did you make that term up? Only m is a unit of measurement, the n or k are prefixes, see http://en.wikipedia.org/wiki/SI_prefix )

general rule could be to pick a unit so that the actual value is between 1 and 10, with some additional rules for dealing with cultural specialities (decimeter is rarely used, hectoliter however is pretty common. The decagram is commonly used in Austria only, etc).

You would also need to know which prefix is applicable to which unit in which context. In a scientific context different prefixes are used than in a lay context. In a lay context astronomical temperatures may be given in degrees Celsius, in a scientific one in kelvin. This is not just a user preference.

I agree that the system should allow explicit conversion in infoboxes. I disagree that you should create an artificial intelligence system for Wikidata that knows more about unit usage than the authors. To store the wisdom of authors, storing both unit and original unit prefix is necessary.

You write:

The Precision can be derived from the accuracy and vice versa, using appropriate heuristics.

I _terribly strongly_ doubt that. Can you give any proof of that? For precision I can use statistics; for accuracy I need an indirect, separate and precise method to estimate accuracy.
If you have a laser-distance measurement device, you can estimate the precision yourself by repeated measurements at various times, temperatures, etc. But unless you have an objective distance standard, you have no means to determine whether the accuracy of the device is always off by 10 cm because someone screwed up the software program inside the device.

But they are not the same. IMHO, the accuracy should always be stored with the value, the precision never.

I fear that is a view of how data should be known in a perfect world, not a reflection of the kind of data that people need to store in Wikidata. Very often only the precision will be known or available to its authors, or worse, the source may not say which it is.

Gregor
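The laser-distance example can be made concrete with a small sketch. The readings and the reference distance below are invented for illustration; the point is only that the spread of repeated readings (precision) says nothing about a constant offset (accuracy), which shows up only against an independent standard.

```python
import statistics

# Hypothetical repeated readings of one distance, in metres.
readings_m = [10.101, 10.099, 10.100, 10.102, 10.098]

# Precision: estimated from the readings alone.
precision_sd_m = statistics.stdev(readings_m)

# Accuracy: needs an independent reference, assumed here to be exactly 10 m.
reference_m = 10.000
bias_m = statistics.mean(readings_m) - reference_m

print(f"precision (s.d. of readings): {precision_sd_m * 1000:.1f} mm")
print(f"bias against the reference:   {bias_m * 100:.1f} cm")
```

With these invented numbers the device repeats itself to within a couple of millimetres while being off by about 10 cm, and nothing in the readings themselves reveals the offset.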
Re: [Wikidata-l] Data values
On 19 December 2012 15:11, Daniel Kinzler daniel.kinz...@wikimedia.de wrote:

If they measure the same dimension, they should be saved using the same unit (probably the SI base unit for that dimension). Saving values using different units would make it impossible to run efficient queries against these values, thereby defying one of the major reasons for Wikidata's existence. I don't see a way around this.

Daniel confirms (in a separate mail) that Wikidata indeed intends to convert any derived SI units to a common formula of base units.

Example: a quantity like 1013 hectopascal, the common unit for meteorological barometric pressure (this used to be millibar), would be stored and re-displayed as 1.013 10^5 kg⋅m−1⋅s−2

I see several problems with this approach:

1. Many base-unit expressions are little known: kg⋅m2⋅s−3⋅A−2 for Ohm... This breaks communication with the humans curating data on Wikidata. It will make it very difficult to check data entered into Wikidata for correctness, because the data displayed after saving will have little relation to the data entered. This makes Wikidata inherently unsuitable for an effort like Wikipedia, with many authors and a reliance on fact checking.

2. Even for standard base units, there is often a 1:n relation, e.g. both gray and sievert have the same base unit. The base unit for lumen is candela (because the steradian is not a unit, but part of the derived unit applicability definition).

Gregor
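The 1:n problem in point 2 can be shown with a toy table of base-unit exponents. The table and function names are made up for this sketch and are not a proposed Wikidata structure; the exponents themselves are the standard SI definitions.

```python
from collections import defaultdict

# SI base-unit exponents for a few derived units (toy table for illustration).
DERIVED_UNITS = {
    "pascal":  {"kg": 1, "m": -1, "s": -2},
    "ohm":     {"kg": 1, "m": 2, "s": -3, "A": -2},
    "gray":    {"m": 2, "s": -2},
    "sievert": {"m": 2, "s": -2},
    "lumen":   {"cd": 1},   # cd·sr, with the steradian treated as dimensionless
    "candela": {"cd": 1},
}


def base_unit_string(exponents):
    """Render an exponent table as a base-unit product, e.g. kg·m^-1·s^-2."""
    parts = []
    for symbol, power in exponents.items():
        parts.append(symbol if power == 1 else f"{symbol}^{power}")
    return "·".join(parts)


def units_by_base_form():
    """Group unit names by their base-unit string to expose collisions."""
    groups = defaultdict(list)
    for name, exponents in DERIVED_UNITS.items():
        groups[base_unit_string(exponents)].append(name)
    return dict(groups)


for base_form, names in units_by_base_form().items():
    print(f"{base_form:16} <- {', '.join(names)}")
```

Once only the base form is stored, gray and sievert (and lumen and candela) can no longer be told apart without additional information.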
Re: [Wikidata-l] Data values
These all pose the same problems, correct. At the moment, I'm very unsure about how to accommodate these at all. Maybe we can have them as custom units, which are fixed for a given property, and can not be converted.

I think the proposal to use Wikidata items for the units (that is, both base and derived SI units as well as Imperial/US customary units) is the most sensible. Let people use the units they need. Then write software that picks up the units that people use (after verifying common and correct use) by means of their Wikidata item ID. With successive versions of Wikidata, pick up more and more of these and make them available for conversion.

This way Wikidata will become what is needed. I fear the discussion at present is about anticipating the needs of the next years and not allowing any data into Wikidata that have not been foreseen.

There may be a way for Wikidata to have enough artificial intelligence to predict which unit prefixes are usable in common topics versus scientific topics, and which units shall be used: where megaton is used (TNT equivalent of atomic bombs) and where 10^x ton is preferred (shipping); and that the base unit for weight is the kilogram, but that for gold in a certain value range the ounce may be preferred, and for gemstones and pearls the carat (http://en.wikipedia.org/wiki/Carat_(unit) ).

But I believe forcing Wikidata to solve that problem first and ignoring the wisdom of the users is the wrong path.

Modelling Wikidata on the feet versus meter and Fahrenheit versus Celsius problem, where US citizens have a different personal preference, is misleading. The issue is much more complex.

Gregor
Re: [Wikidata-l] Data values
it is probably necessary to store the number of significant decimals.

Yes, that *is* the accuracy value i mean.

Daniel, please use correct terms. Accuracy is a defined concept, and although by convention it may be roughly expressed by using the number of significant figures, that is not the same concept. Without additional information you cannot infer backwards whether usage of significant figures expresses accuracy or precision. See http://en.wikipedia.org/wiki/Accuracy_and_precision

Ok, there's some terminology confusion here. I'm using accuracy to refer to the accuracy of measurement (e.g. standard deviation), and precision to refer to the precision of presentation (e.g. significant digits). We need these two things at least, and words for them. I don't care much which words we use.

I do. And I think it is important for Wikidata to precisely express what it wants to achieve. Accuracy has nothing to do with s.d., which is a measure of dispersion. You can have an accuracy of +/- 10 measured with a precision of +/- 0.1 (and a standard deviation of 2 for the population of objects that you have measured).

-

So 4.10 means that the last digit is significant, i.e. the best estimate is at least between 4.095 and 4.105 (but it may be better).

4.10 +/- 0.005 means it is precisely between 4.095 and 4.105, as opposed to 4.10 +/- 0.004, 4.10 +/- 0.003, 4.10 +/- 0.002 etc.

Yes, all this should be handled by the component responsible for parsing user input for quantity values.

But it cannot be, because you have lost the information. I don't know whether +/- 0.005 indicates significant figures/digits or whether it is an exact precision_or_accuracy interval.

I think this may become clearer if you consider a value entered in inches: 1.20 inches. You convert:

1.20 +/- 0.05 in = 3.048 10^-2 m +/- 1.27 10^-3 m

If this is the only information stored, I have no information left on whether I should display 3.048 or 3.0480, and whether the information +/- 1.27 10^-3 m is meaningful (no) or an artifact of conversion (yes).

It can be stored as an auxiliary data point, that is, as a qualifier (measured in feet). It should not IMHO be part of the data value as such, because that would make it extremely hard to use the values in a database.

You are correct insofar as I propose that you need to store two units: the normalized one (SI units only, and no prefix - and even though the SI base unit is kg I would store gram) and the original one plus the original unit prefix.

If you do that, you can store the value in a single normalized unit, provided you back-convert it prior to display in Wikidata. I don't think the original unit is a meaningless qualifier; it is vital information for context.

Gregor
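The inch example can be turned into a small round-trip sketch. It is only one possible reading of the "store the normalized value plus the original unit and digits" idea: the record layout, the function names, and the trick of back-converting via the float's shortest decimal representation are assumptions of this sketch, not the Wikidata design.

```python
from dataclasses import dataclass
from decimal import Decimal

# Exact conversion factors to metres (the inch is defined as 0.0254 m).
METRES_PER_UNIT = {"in": Decimal("0.0254"), "m": Decimal("1")}


@dataclass(frozen=True)
class StoredLength:
    si_value_m: float        # normalized value, usable for queries
    original_unit: str       # unit the author entered
    original_decimals: int   # decimal places the author entered


def store(text, unit):
    """Parse the entered text and keep what is needed to redisplay it."""
    entered = Decimal(text)
    decimals = max(0, -entered.as_tuple().exponent)
    return StoredLength(float(entered * METRES_PER_UNIT[unit]), unit, decimals)


def display_original(q):
    """Back-convert to the original unit and restore the entered digits."""
    value = Decimal(repr(q.si_value_m)) / METRES_PER_UNIT[q.original_unit]
    return f"{value:.{q.original_decimals}f} {q.original_unit}"


q = store("1.20", "in")
print(q.si_value_m)          # the float alone no longer shows the trailing zero
print(display_original(q))   # the stored metadata allows "1.20 in" again
```

The normalized float stays available for queries, while the two extra fields are what lets a curator see the value in the form in which it was entered.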
Re: [Wikidata-l] Data values
On 19 December 2012 17:03, Daniel Kinzler daniel.kinz...@wikimedia.de wrote:

I'd have thought that we'd have one such table per dimension (such as length or weight). It may make sense to override that on a per-property basis, so 2300m elevation isn't shown as 2.3km. Or that can be done in the template that renders the value.

Here and in the entire discussion I fear that the need to support curation of Wikidata data for correctness is not sufficiently in focus.

If someone enters the height of a mountain in feet and I see the converted value in meters in my preferences-converted Wikidata view, I will correct the seemingly senseless and unjustified precision of three digits after the meter.

Only if we understand in which unit the data were originally valid will we be able to successfully communicate and collaborate.

Yes, Wikidata shall store a normalized version of the value, but it also needs to store an original one. Whether it needs to store the value twice I am not sure; I believe not. If it stores the original prefix, original unit and original significant digits, it can generally recreate the original form. I know that there are some pitfalls with IEEE numbers in this, and it may be safer to store the original number as well initially (and perhaps drop it later when enough data are available to test the effects).

Of course, Wikipedias can use the API to display the value in any other form, just as they like, but that does not solve the problem of data curation on Wikidata (which includes the data curation by Wikipedia authors).

Gregor
Re: [Wikidata-l] Data values
Martynas,

I think you misinterpret the thread. There is no discussion about not building on the datatypes defined in http://www.w3.org/TR/xmlschema-2/

What we are doing is discussing compositions of elements, all typed to XML datatypes, that shall be able to express scientific and engineering requirements as to statistics and significant digits (except perhaps for duration, none of the data types in http://www.w3.org/TR/xmlschema-2/ supports that), as well as means to express uncertainty and confidence intervals.

Many existing XML schemata define such compositions, all squarely built on http://www.w3.org/TR/xmlschema-2/ - Wikidata is certainly not unique in this effort. If you can point the team to further well-reviewed solutions, this would be very useful.

Gregor
Re: [Wikidata-l] Data values
On 19 December 2012 20:01, jmccl...@hypergrove.com wrote:

Hi Gregor - the root of the misconception I likely have about significant digits and the like, is that such is one example of a rendering parameter not a semantic property.

It is about semantics, not formatting.

In science and engineering, the number of significant digits is not used to right-align numbers, but to semantically indicate the order of magnitude of the accuracy and/or precision of a measurement or quantity. Thus, the weight of a machine can be given as 1.2 t (exact to +/- 50 kg), 1200 kg (+/- 1 kg), or 1200.000 kg.

This is not part of IEEE floating point numbers, which always have the same type-dependent precision or number of significant digits, regardless of whether this is semantically justified or not. An IEEE 754 standard double always has about 16 decimal significant digits, i.e. the value 1.2 tons will always be given as 1.200 tons. This is good for calculations, but lacks the information for final rounding.

Gregor
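That a binary floating-point value cannot carry the number of digits an author wrote is easy to check. The sketch below uses Python's `float` (an IEEE 754 double) and the standard `decimal` module for contrast; the variable names are arbitrary.

```python
from decimal import Decimal

entered = ["1.2", "1.20", "1.200"]

# As IEEE 754 doubles, all three spellings collapse into one value.
as_floats = {float(text) for text in entered}
print(as_floats)

# A decimal type keeps the exponent, and with it the digits as written.
for text in entered:
    d = Decimal(text)
    print(text, "->", d, "decimal places:", -d.as_tuple().exponent)
```

The set contains a single float, while the three `Decimal` values compare equal numerically yet still print with the digits they were entered with.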
Re: [Wikidata-l] Data values
It would be possible and very flexible, and certainly more powerful than the current system. But we would lose the convenience of having one date, which we need for query answering (or we could default to the lower or upper bound, or the middle, but all of these are a bit arbitrary).

I believe it would be more profitable to build a query system which always queries for the range. This would work for interval-only values (see my comment on the wiki page) as well as for values with an interval.

I don't see this as a big overhead. It is more of a problem for ordering, but internally, Wikidata could store a midpoint value for intervals where no explicit central value is given, and use these for ordering purposes.

I think it would be great if the system were consistent for quantities, dates, geographical longitude/latitude, etc.

Gregor
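The suggestion to always query by range, and to use a midpoint only for ordering, might look like the following sketch. The class, its field names, and the sample values are hypothetical and only meant to show the two operations side by side.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntervalValue:
    lower: float
    upper: float
    central: Optional[float] = None  # None: interval-only, no meaningful centre

    def sort_key(self):
        """Explicit central value if given, else the midpoint (ordering only)."""
        if self.central is not None:
            return self.central
        return (self.lower + self.upper) / 2

    def overlaps(self, query_lower, query_upper):
        """Range query: does this value intersect the queried range?"""
        return self.lower <= query_upper and self.upper >= query_lower


values = [
    IntervalValue(18.0, 25.0),        # e.g. "18-25 cm", no central value
    IntervalValue(4.10, 4.35, 4.20),  # e.g. "4.10-4.20-4.35"
    IntervalValue(30.0, 40.0),
]

hits = [v for v in values if v.overlaps(20.0, 32.0)]
ordered = sorted(values, key=IntervalValue.sort_key)
print(hits)
print([v.sort_key() for v in ordered])
```

Keeping `central` optional means the midpoint used for sorting never has to be displayed as if it were a measured value.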
Re: [Wikidata-l] Data values
Now, I don't think we need or want ranges as a data type at all (better have separate properties for the beginning and end).

I am afraid this will then put a heavy burden on users who enter, proofread, and output values. Data input becomes dispersed, because the value "18-25 cm length" has to be split and entered separately. You then have to write a custom output for each property, and do all the query logic (lower, upper) for each property in each Wikipedia client.

I believe this is something that is healthy to do centrally. I believe the concept of intervals exists because of that.

Gregor
Re: [Wikidata-l] Data values
I don't see this as a big overhead. It is more a problem for ordering, but internally, wikidata could store a midpoint value for intervals where no explicit central value is given, and use these for ordering purposes.

Well, I would call that mid point simply the value, and the range would be the accuracy. There's an important conceptual distinction here to having ranges as actual values.

Can this conceptually distinguish between a meaningful midpoint value, and one that is useful for ordering but has no meaning and should not be displayed as a result value? See the examples on https://meta.wikimedia.org/wiki/Talk:Wikidata/Development/Representing_values#Missing_central_value

Gregor

PS: With accuracy you introduce a new concept here which was not in the representing values paper (see http://en.wikipedia.org/wiki/Accuracy_and_precision). This is different from confidence interval (uncertainty), where it is not yet decided whether the value indicates accuracy or dispersion. A confidence interval is a measure of accuracy only if the sample measurements are normally distributed and if no systematic bias exists.

I believe it is important that Wikidata is flexible enough to capture both, especially because in many cases dispersion is used as a rough estimate for otherwise unknown accuracy, and since in many cases there is no true single value and the dispersion is systematic (see e.g. the car model length example).
Re: [Wikidata-l] Data values
(ASIDE: Regarding presentation: it is not always algorithmically easy to decide whether to present 0.01 pm as 1 * 10^-14 m or as 10 fm = 10 * 10^-15 m. In a scientific context, only the SI steps should be used; in another context the closest decimal may be appropriate.)

But floating point numbers are handled by the implementation of [[IEEE floating-point standard]]. Displaying the numbers is another question. There I have to agree that it always makes sense to also store a typical used unit for that type of data.

I agree. What I propose is that the user interface supports entering and proofreading 10.6 nm as 10.6 plus n (= nano) plus meter. How the value is stored in the data property, whether as the floating-point value 10.6 or as 1.06e-8, is a second issue -- the latter is probably preferable. I only intend to show that it is not always trivial to reverse-engineer scientific values from a floating-point value to the intended value.

In addition to a storage option for the desired unit prefix (this may be considered an original-prefix, since naturally re-users may wish to reformat this), it is probably necessary to store the number of significant decimals. I believe that in the user interface this need not be a visible setting; the number of digits can simply be preserved.

Without these it is impossible to store and reproduce information like 10.20 nm; it would be returned as 1.02 10^-8 m. A complex heuristic may guess when to use the scientific SI prefixes instead. However, the trailing zero cannot be reproduced when relying completely on IEEE floating-point.

Gregor
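What a purely heuristic prefix picker can and cannot recover is shown in the sketch below. The prefix table is deliberately short and the rule ("largest prefix not exceeding the value") is just one plausible heuristic chosen for this illustration, not anything proposed in the thread.

```python
# A few SI prefixes as (factor, symbol); deliberately incomplete.
SI_PREFIXES = [
    (1e-15, "f"), (1e-12, "p"), (1e-9, "n"), (1e-6, "µ"),
    (1e-3, "m"), (1.0, ""), (1e3, "k"), (1e6, "M"),
]


def guess_display(value_m):
    """Pick the largest prefix whose factor does not exceed the value."""
    factor, prefix = SI_PREFIXES[0]
    for candidate_factor, candidate_prefix in SI_PREFIXES:
        if abs(value_m) >= candidate_factor:
            factor, prefix = candidate_factor, candidate_prefix
    return f"{value_m / factor:g} {prefix}m"


print(guess_display(1.02e-8))  # entered as "10.20 nm"
print(guess_display(4e7))      # customarily written "40 000 km"
print(guess_display(1e-14))    # the 10 fm case from the aside
```

The heuristic finds the nano prefix but returns "10.2 nm" without the trailing zero, and it renders the Earth's circumference as "40 Mm", the form the earlier message notes is not used in common language.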
Re: [Wikidata-l] Wikidata license (was Introduction and some questions on Wikidata)
Marco wrote:

So I assume that single facts (or database items) are not copyrightable just like single words. Only the database (or even a view?) as a selection and arrangement of various items is copyrightable.

Yes, a database may be copyrightable, if the creativity in selecting and arranging information is copyrightable. This possibility exists in many countries and is completely different from database rights. The extent to which case law provides copyright protection, and the level of creativity required, vary from state to state. In the EU, the database rights directive also addresses database copyrights and tries to harmonize them among EU member states.

However, the database right is a completely separate right. It exists only in the EU, and it has nothing to do with the argument above.

Gregor
Re: [Wikidata-l] Suggestions
If I'm not mistaken this is what we plan to do. The current system we're using for the language links can certainly handle it, as it's not WP specific.

That would be great! At the top of my wish list are the Wiktionaries, since most of the entities needed to use Wikidata for descriptive knowledge are provided only on summary pages (dozens of terms on one page) on the Wikipedias, whereas the Wiktionaries define them as individual pages.

And I believe it will strengthen Wikidata if it can be open to open data initiatives outside of the Wikimedia Foundation. Clearly there needs to be control over which initiatives are accepted as valid authoritative sources of identifiers, but I wonder whether the interwiki list is not already a good mechanism for this? If the interwiki list could be supplemented with a generic definition of how to make an ajax identifier-lookup call to present the user with a picklist, this could be a huge long-term benefit (i.e. it could be used by Wikidata, but also in any Wikipedia when using a more powerful visual editor).

Gregor
Re: [Wikidata-l] Wikidata-l Digest, Vol 12, Issue 10
Hi Michael,

(I cannot speak on behalf of WMF, this is not official)

The plans for Wikidata are to carefully attribute sources of data, which comes much closer to what you need, i.e. 3rd party data are separable.

For the start the CC0 license has been chosen, which gives you unlimited re-use rights, but other licenses may be supported later if needed. The discussion was to decide this only when the use cases start to come in.

The real data use is still to come; so far Wikidata can really only do the interlanguage linking of Wikipedia pages. The rest is being worked on right now.

Best

Gregor
Re: [Wikidata-l] Wikidata license (was Introduction and some questions on Wikidata)
Just to clarify, my concern is about externally made databases, regardless of whether these are imported directly into Wikidata, or have been incorporated into Wikipedia first and imported into Wikidata from there. For example, the population data in Wikipedia's list of ceremonial English counties (http://en.wikipedia.org/wiki/List_of_ceremonial_counties_of_England), which also features in the infoboxes of the articles on each county, would I think be covered by database right under U.K. law. Like other ONS material, it has been made available under the OGL, which does impose some obligations on re-users (somewhat similar to CC-BY). This is an interesting case. However, while the database right gives you certain rights, it does not give you a copyright (i.e. the content may be legally problematic, but it cannot be covered by CC BY-SA). Thus, the use on Wikipedia is either exclusively licensed with an obligation to prevent re-use by third parties (which is not the case, WMF does not do this), or it is illegal, or acceptance of open re-use is an implicit waiver of database rights. I believe you cannot allow it on Wikipedia but then NOT allow further reuse. However, to clarify: 1. It is much preferable to add such data to Wikidata and include their source in a structured way. Whether OGL or other licenses need to be explicitly supported by Wikidata in the future will have to be a separate discussion, on Wikidata.org. 2. My goal in participating in this discussion is to avoid the impression that re-use of Wikipedia content is not possible at all without looking at each individual data element and record. 3. Wikidata plans to support a hierarchy of multiple data for the same statement (multiple values from different sources for a single property in a single item). This makes it possible (although not required) to mix poorly sourced, Wikipedia-harvested information with clean, well-sourced data. 4. Not harvesting from Wikipedia implies verifying that almost all information from Wikipedia is in Wikidata, but cleanly sourced, before it is possible to migrate a class of infoboxes to Wikidata. I believe this is an impossible task, making some import of Wikipedia-harvested data necessary. Where better, sourced information exists, it would take precedence. Gregor
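Points 3 and 4 above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names SourcedValue and preferred_value are hypothetical, the data are placeholders, and "sourced beats unsourced" is the simplest possible reading of "would take precedence", not a description of how Wikidata implements it.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourcedValue:
    """One value for a property; `source` is None for unsourced harvested data."""
    value: str
    source: Optional[str] = None


def preferred_value(values: list[SourcedValue]) -> Optional[SourcedValue]:
    """Prefer the first sourced value; fall back to the first value of any kind."""
    for candidate in values:
        if candidate.source is not None:
            return candidate
    return values[0] if values else None


# Placeholder data, not real figures or real sources.
population = [
    SourcedValue("value harvested from an infobox"),
    SourcedValue("value from an external dataset", source="hypothetical external source"),
]
print(preferred_value(population).value)
```

Both values stay stored side by side; only the choice of which one to display changes once a sourced value arrives.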
Re: [Wikidata-l] Introduction and some questions on Wikidata
First of all the priority lies on data already present on Wikipedia. Wikidata should not be a data storage for everything structured in the world, so we should first start to transfer data already present on Wikipedia to Wikidata. External data sources will be interesting as well, for sure, but the purpose of Wikidata is still centralizing what we already have. A side remark, because I believe this needs some further discussion. I agree that Wikidata should be focusing on the kind of data Wikipedia has or which are suitable for Wikipedia. However, since the Wikipedia data are not systematically sourced (they may be unsourced or the source is only available in edit comments, talk page etc.), it will be very valuable to import relevant, sourced data that have the right scope. Gregor
Re: [Wikidata-l] wikidata.org is live (with some caveats)
Changing the language does not really work, the title of the item pages remain in English. http://www.wikidata.org/wiki/Special:RecentChanges?setlang=de Did it have a German label or just a language link to dewp? Probably it did not have a manually entered label at the time. After my post it now has. From what you write and what I tried today I assume that you don't use the Wikipedia page title in a given language as default for the top line label of the Wikidata page? I realize that sometimes the Wikipedia title may not be the best final label but it seems an excellent default. a) a display default e.g. in recent changes. Presently many pages in recentchanges just show a number, although they are connected to many wp articles already. b) as a gray editable default when editing a page. Presently, when in editing mode, the label on top seems to be visible only in a single language. If a label in English has already been entered, there is no access to it, neither read nor write. So when I go to a data page, I usually see NOTHING at the top when my language is set to German. The only way to guess the label of a data item is in fact to use the Wikipedia article titles. Gregor
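The default proposed in a) and b) could look roughly like the following Python sketch. The function name display_label and the two dictionaries are hypothetical; the Q34 / Sweden / Schweden values are the ones mentioned in this thread. It is a sketch of the suggestion, not of what the software does.

```python
from __future__ import annotations


def display_label(labels: dict[str, str], sitelinks: dict[str, str],
                  lang: str, item_id: str) -> tuple[str, bool]:
    """Return (text, is_default); is_default marks a gray, still-editable default."""
    if lang in labels:
        return labels[lang], False   # manually entered label wins
    if lang in sitelinks:
        return sitelinks[lang], True  # fall back to the Wikipedia page title
    return item_id, True              # last resort: the bare item number


labels = {"en": "Sweden"}                        # only an English label was entered
sitelinks = {"en": "Sweden", "de": "Schweden"}   # linked Wikipedia article titles
print(display_label(labels, sitelinks, "de", "Q34"))
```

With the interface language set to German, the item would show "Schweden" as a default instead of nothing or just the number.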
Re: [Wikidata-l] wikidata.org is live (with some caveats)
Great work, my congratulations! --- Some first impressions: Changing the language does not really work, the title of the item pages remain in English. http://www.wikidata.org/wiki/Special:RecentChanges?setlang=de - (Unterschied | Versionen) . . Sweden (Q34); 17:48 . . (+32) . . Aplasia (Diskussion | Beiträge) (Fügte websitespezifischen [itwiki] Link hinzu: Paesi Bassi) although http://www.wikidata.org/wiki/Q34 does list de: Schweden. I am not sure whether this is a bug or by design? - In German, translation of Item with Datenelement = data element seems odd, a data element is usually something much smaller and atomic. See http://de.wikipedia.org/wiki/Datenelement Proposal: Artikel or Datenobjekt - Change http://www.wikidata.org/ to http://wikidata.org/ ? I think the logo image (upper left corner) looks a bit lost, 10% larger perhaps? It is not even left aligned with the menu text below, which has ample margin. Finally: The Q### will become the public face of Wikidata, wherever it is re-used. I think this brand should be less cryptic and use: http://wikidata.org/wiki/WD34 or http://wikidata.org/wiki/W34 (providing a mnemonic link to Wikidata) instead of current http://wikidata.org/wiki/Q34 Gregor
Re: [Wikidata-l] changing wikidata-item properties with multilingual labels
You are right, I mixed them up (that comes from not checking). The use cases for monolingual text are a bit rare, and I am thinking of things like official motto (which is usually not translated), I think if it is only usually not, but sometimes indeed translated, using multilingual for the property would be a better choice. If only one language is available, the language fallback would always end with this. etymological annotations, or the official name of a company (also, usually not translated), usually. Companies sometimes do run under local names (or variations): de: Sanyo Denki K.K. ja: San’yō Denki Kabushiki-gaisha, en: SANYO Electric Co. Ltd. carefully about how to delineate them from each other in the entry forms, or otherwise it might end up a bit messy. I think when adding the option for non-linguistic content (= ISO zxx) for language-neutral entities (e.g. for scientific species names, post codes), this type is the least needed. (If anything, it may be more valuable to add a default flag to indicate a primary name that should be used prior to the first in a language fallback. This would be valuable in mixed cases, where a string is translated in a few cases, but not in the majority of languages (the usually not case). Else in a rare border case, where a German company provides translations to Japanese and Chinese, but not to English, a language fallback chain that does contain German may accidentally end up with Chinese. This solves a border case within the multilingual type which I believe cannot be solved with monolingual text.) Gregor (Often wrong but never in doubt :-) )
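The border case in the parenthesis above can be made concrete with a small Python sketch. Everything here is an assumption for illustration: the function resolve_text, the fallback chain, and the placeholder strings are hypothetical, and "primary name used prior to the first in a language fallback" is read as "user language first, then the flagged primary language, then the fallback chain".

```python
from __future__ import annotations

from typing import Optional


def resolve_text(texts: dict[str, str], user_lang: str,
                 fallback_chain: list[str],
                 primary_lang: Optional[str] = None) -> Optional[tuple[str, str]]:
    """Pick (language, text): user language, then flagged primary, then the chain."""
    if user_lang in texts:
        return user_lang, texts[user_lang]
    if primary_lang is not None and primary_lang in texts:
        return primary_lang, texts[primary_lang]
    for lang in fallback_chain:
        if lang in texts:
            return lang, texts[lang]
    return None


# A German company with Japanese and Chinese translations but no English one.
names = {"de": "(German name)", "ja": "(Japanese name)", "zh": "(Chinese name)"}
chain = ["zh", "de"]  # hypothetical fallback chain of an English-speaking user

print(resolve_text(names, "en", chain))                     # plain chain lands on Chinese
print(resolve_text(names, "en", chain, primary_lang="de"))  # the flag selects German
```

The chain does contain German, yet without the flag the lookup stops at Chinese, which is the accident described above.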
Re: [Wikidata-l] watching Wikidata changes that affect my wiki
I think the topic is relevant for the Wikidata editing UI. At the hackathon in Berlin we had discussions about a chain of fallback languages. Have reworked and added some potential user-interface behaviour to http://meta.wikimedia.org/wiki/Wikidata/Notes/Language_fallback Gregor
[Wikidata-l] changing wikidata-item properties with multilingual labels
A city has a Wikipedia page and a corresponding Wikidata-item-page. One of the item properties is Property:City_mayor. If the mayor changes, and both have their own pages/items (http://de.wikipedia.org/wiki/Eberhard_Diepgen to http://de.wikipedia.org/wiki/Klaus_Wowereit for http://de.wikipedia.org/wiki/Berlin), changing the mayor would mean disconnecting/replacing the item in the item-to-item property. The change would be clean and logical with respect to translated labels. However, where the city mayor is not a well-known person (smaller cities), the City_mayor property is most likely a string literal. Replacing the string (name) for the mayor in this case would require emptying ALL translations/transliterations in all other languages. Unfortunately, the system cannot really know whether an update of a translated label is the result of a correction (person did not change) or occurs as a result of changing the value. The design of the UI should make this situation as transparent to editors as possible. It may help to provide two edit-buttons for language-sensitive string literals: [edit translations] [edit new value] (or [replace value] ?) In the second case, all existing translations would be blanked. Probably more or better ideas can be found... :-) Gregor
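The two proposed buttons map onto two different operations on a language-sensitive string. A minimal Python sketch, assuming a hypothetical MultilingualString class and made-up mayor names:

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MultilingualString:
    """A string literal with per-language translations/transliterations."""
    texts: dict[str, str] = field(default_factory=dict)

    def edit_translation(self, lang: str, text: str) -> None:
        """[edit translations]: a correction; other languages stay untouched."""
        self.texts[lang] = text

    def replace_value(self, lang: str, text: str) -> None:
        """[replace value]: a new value; all existing translations are blanked."""
        self.texts = {lang: text}


# Hypothetical mayor of a small town, stored as a string, not as an item.
mayor = MultilingualString({"en": "J. Smith", "ru": "Дж. Смит"})

mayor.edit_translation("ru", "Джон Смит")  # same person, better transliteration
print(mayor.texts)

mayor.replace_value("en", "A. Jones")      # new mayor: the Russian text must go
print(mayor.texts)
```

The editor states the intent by choosing the button, so the system does not have to guess whether the person changed.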
Re: [Wikidata-l] changing wikidata-item properties with multilingual labels
But humans (and other entities) should not be represented by strings in the system, but by items. I wonder whether this would not be too inflexible. It would burden the use of Wikidata with the responsibility to determine entity-identity in all cases where only a name-string is known. In the example of the mayor: Assume that the new mayor of a city is named John Smith. Wikidata already has 500 items for persons named John Smith. The Wikipedia-Wikidata editor must now determine whether it is good practice to simply create wikidata-item 501, not knowing whether it is one of these or not. I fear that the practice is even more problematic in the reverse case. If in a large percentage of cases there is little doubt about identity, this could lead to the practice of always connecting to a wikidata-item for a person, should there be a person of this name. Henceforth, Wikidata would claim that the mayor of Erewhon previously was councilor in Owd-Negrin, even if there is only a chance identity of a name. Wikipedia disambiguation pages know how many homonymic highly notable persons exist - Wikidata will deal with the non- or less-notable ones as well. A well-known example is that it is not a good idea for scientific reference management to treat authors as person entities, since the reverse engineering of author identity from the n:m relation between person and name-string is normally not feasible. I would prefer that the decision whether entity-identity is known, or whether only a name-string or other label is known, be left to the Wikidata editor community, and not prescribed by the software. Gregor
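What "left to the editor community" might mean for a data type is sketched below in Python: a property value that is either a reference to an item or a bare name-string. The type names ItemRef and NameString and the item number are hypothetical and only illustrate the distinction argued for above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ItemRef:
    """The editor asserts identity with a specific item."""
    item_id: str


@dataclass(frozen=True)
class NameString:
    """Only a name is known; no claim about which person it is."""
    text: str


PropertyValue = Union[ItemRef, NameString]


def render(value: PropertyValue) -> str:
    """Describe a value without pretending a name-string is an identified person."""
    if isinstance(value, ItemRef):
        return f"item {value.item_id}"
    return f"name string '{value.text}' (identity not asserted)"


print(render(NameString("John Smith")))  # mayor of Erewhon, identity unknown
print(render(ItemRef("Q501")))           # hypothetical item number
```

With both variants allowed, nobody is forced to pick one of the 500 John Smiths or to create number 501 just to record a name.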
Re: [Wikidata-l] DBpedia usage in the bbc
I don't mean to spin this out into a tangent about Drupal. Me neither, my discussion point here is: There are advantages for opaque (like http:something.org/node123456) and nonopaque (http:something.org/Bonn,_Northrhine-Westfalia,_Germany) URI/IRI identifiers. In the light of the use-case of interlinking discussed here: which is right for Wikidata? Does Wikidata need both in parallel (I believe this is the current plan)? Gregor
Re: [Wikidata-l] reworked storyboard for linking Wikipedia articles
I added some comments on http://meta.wikimedia.org/wiki/Talk:Wikidata/Development/Storyboard_for_linking_Wikipedia_articles_v0.2 Gregor
Re: [Wikidata-l] Wikidata Transclusions
While I agree that it is desirable to support simple, preformatted infoboxes that can, with minimal effort, be re-used in a large number of language versions of Wikipedia, I strongly disagree with the demand to make this the only choice. I think the present Wikidata approach to allow local Wikipedias to customize their infoboxes by accessing Wikidata properties property-by-property is the right path. The large Wikipedias with many editors have invested considerable creative energy into making quite a large number of infoboxes elaborate information containers. That includes formatting, images and hand-crafted links in both the field name and the field value side. Some values are expressed through svg graphics, other values expressed through background color coding, etc. Limiting the usability of Wikidata to plain vanilla infoboxes could cause considerable resistance in these communities. And although small Wikipedias will profit a lot from Wikidata, without the engagement of editors from the large Wikipedias in curating Wikidata content, the increased synergies will not happen. Another issue is that (I believe that) Wikidata does not have a notion of ordering properties. Correct? This is no issue for the present Wikidata approach because infoboxes remain curated in each local Wikipedia. However, in a centralized one-size-fits-all approach, replacing existing infoboxes where information is presented in a logical order with an alphabetical property order would create huge resistance (and would be a complex issue that Wikidata would have to deal with, allowing property ordering and filtering). I believe that Wikidata correctly aims to provide a smooth transition path, where it is possible to obtain only part of an infobox from Wikidata and inject Wikidata content into existing infobox layouts. That said: I would encourage a third party contributor to try to create a default Wikidata infobox generator in a way (extension installable in multiple Wikipedias) that enables a Wikipedia to autocreate a good looking, plain vanilla infobox with minimal effort. Gregor
Re: [Wikidata-l] Wikidata Transclusions
On 14 June 2012 12:33, Gerard Meijssen gerard.meijs...@gmail.com wrote: Finally, when Wikidata provides data and info boxes, it does not mean that any project is compelled to use it. As Wikidata matures, it will become increasingly clear that it is not the best practice. That may be, but Wikidata needs a path to get there. I think the ability to integrate Wikidata into the existing infobox consensus of a Wikipedia community is essential for adoption. Over time, centrally provided infoboxes with ever increasing customization functionality (order, selection, arrangement, linking properties to Wikipedia pages explaining them, etc.) are desirable, and at some point the evolution of Wikidata may conclude that this becomes the preferred method. Gregor
Re: [Wikidata-l] Wikidata Transclusions
Gregor, I'm a bit confused -- are you talking about the transclusion design approach in this statement? Yes, in the sense that it demands to be the only access to Wikidata content in a Wikipedia. because, if so, I'd think there'd be a number of infobox styles that can be selected by an author on the wikidata platform when 'building' the infobox page. The author can transclude any number/any specific infobox(es) on their wikipedia page, eg {{wikidata:en:Infobox:Some topic/some custom imfobox}} As I say, I look forward to seeing an infobox builder being developed, but this is a serious challenge. See, e.g. http://en.wikipedia.org/wiki/Tiger and take a look at the hierarchical arrangement of properties, formatting of them, linking of them (Headings link to concept explanations on the same-language Wikipedia, with the link being different than the display text, Early Pleistocene – Recent may be a time range, but the value is Early Pleistocene and the link is Pleistocene; similarly each taxonomic author - here only one present, Linnaeus - should link on en.wikipedia to en.wikipedia and de.wikipedia to de.wikipedia), expressing some information with graphics, see Endangered (IUCN 3.1)[1], properties or property values containing footnotes, the fact that a subspecies is extinct being abbreviated with a symbol (†P. t. virgata) etc. Note that the latter case is actually a nesting: it is a list of subspecies, with each subspecies having multiple properties like Scientific Name, Wikipedia Page name, extinction status - I am not sure Wikidata plans to model such data in Phase 2 already. My bottom line: Keep the Wikidata project manageable and doable with the available resources. Offer a method for Wikipedians to pick up Wikidata content within the existing template infrastructure. But, desirable: ask for a white paper on which additional work would be required to create centralized, plain vanilla infobox rendering as well. Would you be willing to create such a white paper? How much of the above-shown Tiger example can be created centrally with a limited set of facilities? How feature-rich must the customization become? Or are you proposing to simply use the existing template programming with the only difference that Wikidata is the only MediaWiki where the properties can be accessed within templates? Much of my argument assumes that you are looking for a non-template based infobox renderer, I may be wrong there. Gregor
Re: [Wikidata-l] [wikidata-intern] Re: Request for comments: syntax for including data on client wikis (aka how to make infoboxes)
This seems to be everyone's preference, even though it feels kind of icky to me. Oh, well :) I'll rework the draft on that basis soon. I look forward to it. Maybe it runs against some wall, but then we have a better basis for comparison.
Re: [Wikidata-l] [wikidata-intern] Re: Request for comments: syntax for including data on client wikis (aka how to make infoboxes)
On 23 May 2012 13:19, Daniel Kinzler daniel.kinz...@wikimedia.de wrote: On 23.05.2012 13:14, Nikola Smolenski wrote: If we assume that in practice #data-template is usually going to be wrapped into a template, what's the point of having it at all? Do you see any technical reasons for it? How else do you pass a complex object to a template and make its properties show up as template parameters? I think I might have addressed that in my comment on the wiki. See there, but essentially I believe it is technically equally valid, and from a usability and community adoption standpoint far preferable, to simply support a syntax to address properties of the complex object, and have the resolver of this syntax automatically pull the entire complex Wikidata object (of which the property is a part) into a cache, so that subsequent calls to properties are returned from the cached object. I look forward to having this analyzed by Daniel. Obviously there are some extra things that need to be added, but also other things simply go away painlessly... Can you write an advantage/disadvantage comparison on the wiki, Daniel, to be commented upon? Gregor
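The resolver idea described above (address single properties, fetch the whole object once, answer later calls from the cache) can be sketched as follows in Python. The class EntityResolver, the loader function and the entity data are hypothetical stand-ins; nothing here reflects actual MediaWiki or Wikidata parser code.

```python
from __future__ import annotations

from typing import Any, Callable


class EntityResolver:
    """Resolve single properties while loading each whole entity only once."""

    def __init__(self, load_entity: Callable[[str], dict[str, Any]]) -> None:
        self._load_entity = load_entity
        self._cache: dict[str, dict[str, Any]] = {}
        self.load_count = 0  # counts real loads, to show the effect of caching

    def get(self, entity_id: str, prop: str) -> Any:
        """Return one property; the first call per entity pulls the full object."""
        if entity_id not in self._cache:
            self._cache[entity_id] = self._load_entity(entity_id)
            self.load_count += 1
        return self._cache[entity_id].get(prop)


def fake_loader(entity_id: str) -> dict[str, Any]:
    """Stand-in for the expensive fetch of a complete entity (placeholder data)."""
    return {"id": entity_id, "label": "placeholder label", "mayor": "placeholder mayor"}


resolver = EntityResolver(fake_loader)
print(resolver.get("example-item", "label"))
print(resolver.get("example-item", "mayor"))
print(resolver.load_count)  # 1: the second property came from the cache
```

From the template author's side there are only property lookups; the passing of a complex object happens behind the syntax.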
Re: [Wikidata-l] Request for comments: syntax for including data on client wikis (aka how to make infoboxes)
I added some comments on the wiki
Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink
Wikidata can (and probably will) store information about each moon of Uranus, e.g., its mass. It does probably not make sense to store the mass of Moons of Uranus if there is such an article. It does not help to know that the article Moons of Uranus also talks (among other things) about some moon that has a particular mass: you need to know what *exactly* you are talking about to exploit this data. An article on Moons of Uranus could still (eventually) embed Wikidata data to improve its display, but this data must refer to individual moons, not to the article as a whole. The problem I see is that you have no definition of which real object the data are tied to. We agree that the problem is not the interwiki links per se. It is what results from them. How do we tie data to a Wikidata page when we don't know what it is about?
[Wikidata-l] SNAK - assertion?
Would the word assertion be a possible replacement for the neonym Snak?
Re: [Wikidata-l] Data_model: Metamodel: Wikipedialink
On 1 April 2012 13:04, Markus Krötzsch markus.kroetz...@cs.ox.ac.uk wrote: This is a valid point. It is intended to address this as follows: * Wikidata items (our content pages) will be in *exact* correspondence to (zero or more) Wikipedia articles in different languages. * Differences in scope will lead to different Wikidata items. * Relationships such as broader or narrower can be expressed as relations between these items, if desired. This is a technically valid solution. Socially, I fear it would lead to endless uncertainty about which mechanism to use. Few abstract entities will have exactly the same delimitation/width, but where should one switch from one method of linking (one Wikidata page with several more or less closely matching Wikipedia pages) to the other (several Wikidata pages, one for each Wikipedia page in each language)? Also, importing data will be a nightmare, because the concepts used in imported data will have to be compared with all Wikipedias. One Wikipedia language version covers the post-WWII extent of Russia together with the current one, and another Wikipedia language version has them separated. It may not have mattered before, and only one Wikidata page links to both language versions. However, at some point historical data are imported and suddenly Wikidata needs to be reorganized to have two pages. ... Just thinking aloud - this may be unavoidable perhaps... However, my gut feeling is that if you plan to avoid relations between Wikidata and Wikipedia, it might be a more comprehensible model to then always use only one method, i.e. have a 0 to 1 or 1 to 1 relation between Wikidata page and Wikipedia page only, and express everything else in Wikidata to Wikidata page relations. These relations are then easily traceable and updateable, just as the broadness or narrowness of a page in a given Wikipedia develops over time. In general, Wikidata will not be able to replace all interwiki links: it will remain possible to define additional links in each Wikipedia to cover cases where the relationship between articles is not exact. This worries me. It means that there will forever be conflicting systems of editing interwiki links. If everything can be achieved with Wikipedia, but only a subset with Wikidata, it spells social adoption danger.
[Wikidata-l] Data_model: Metamodel: Statement
Some initial ideas on the statement. I realize that this is not a priority in the first phase, but perhaps on the wiki a place could be created to collect some thoughts like those below? http://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model#The_Metamodel explains: Statement = (StatementID, Property, Value, Qualifier*, Reference*) The StatementID is a unique identifier for the given statement and used only internally and for export. A Property is defined on a property page. This definition includes a type. The structure of the Value is given by the type of the property. Simple types could demand an EntityID or a number. More complex types could demand dates, date ranges, numbers with units, a geocoordinate, a geo shape, etc. 1. Will Property include information on observation/recording/measurement methodology? 2. What happens if the main entity has variants or parts for which values are recorded? An example is car models, where typically the revisions sold under the same name are subsumed in one Wikipedia article. Example: http://en.wikipedia.org/wiki/Renault_Kangoo with different length / weight in the subclass info boxes for first/second generation. Cars also easily serve as an example for variable parts, see in http://en.wikipedia.org/wiki/%C5%A0koda_Roomster the list of engine specifications. The engines are not Wikipedia entities in their own right. 3. Values may be well defined RDF-resources, but not available in Wikipedia. In my work, many statements I would like to express in a future Wikidata are not allowed as Wikipedia articles at all. You can express Wowereit is_mayor_of Berlin, Germany but not Plantago lanceolata has_leaf_shape lanceolate because http://en.wikipedia.org/wiki/Lanceolate is a redirect to http://en.wikipedia.org/wiki/Leaf_shape I personally would love to have illustrated definitions of things people want to learn about being allowed on Wikipedia, but the argument is generally that Wikipedia is not Wiktionary. I believe Wikidata should right from the start be defined to allow references to Wiktionary as well as Wikipedia. And while we are at it, references to Commons as well (semantic image annotation...) This would change Wikipedialink = (Title, LanguageId, Badge?) to Link = (Project, LanguageId, Title, Badge?) --- Gregor
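The proposed change from Wikipedialink to Link, together with the Statement tuple quoted from the metamodel notes, could be written down as follows. This is a Python sketch under assumptions: the field names follow the tuples above, but the class layout, the use of plain strings for identifiers, and the example identifiers are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Link:
    """Link = (Project, LanguageId, Title, Badge?), the proposed generalization."""
    project: str        # e.g. "wikipedia", "wiktionary", "commons"
    language_id: str
    title: str
    badge: Optional[str] = None


@dataclass(frozen=True)
class Statement:
    """Statement = (StatementID, Property, Value, Qualifier*, Reference*)."""
    statement_id: str
    prop: str
    value: Any
    qualifiers: tuple[Any, ...] = ()
    references: tuple[Any, ...] = ()


# The leaf-shape example: the value points to Wiktionary, since the English
# Wikipedia title Lanceolate is only a redirect to Leaf_shape.
leaf_shape = Statement(
    statement_id="s-1",  # hypothetical identifier
    prop="has_leaf_shape",
    value=Link(project="wiktionary", language_id="en", title="lanceolate"),
)
print(leaf_shape.value)
```

The old Wikipedialink is then just the special case where project is fixed to Wikipedia.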