@Markus, James: In my opinion it is better to make the query ask for the most recent population number. People just need to start using time-qualifiers for things like census-report numbers.
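As a rough illustration of asking for the most recent number instead of relying on rank, here is a minimal Python sketch. The statement records, the `point_in_time` key and the `most_recent_population` helper are made up for the example; they are not a Wikidata API, and the figures are invented.

```python
from datetime import date

# Hypothetical, hand-written population statements. Real data would come
# from Wikidata statements that carry a time qualifier.
population_statements = [
    {"value": 650_000, "point_in_time": date(2001, 1, 1)},
    {"value": 700_000, "point_in_time": date(2011, 1, 1)},
    {"value": 690_000, "point_in_time": None},  # no time qualifier
]


def most_recent_population(statements):
    """Return the value of the statement with the latest time qualifier.

    Statements without a qualifier are ignored; returns None if no
    statement is dated.
    """
    dated = [s for s in statements if s["point_in_time"] is not None]
    if not dated:
        return None
    return max(dated, key=lambda s: s["point_in_time"])["value"]


print(most_recent_population(population_statements))
```

The undated statement is skipped, which is why consistent time qualifiers matter for this approach.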
The other issue is one of standardized vocabulary, and in my opinion that is always a sourcing problem. A query could say "get the instance-of statement that has a supporting source from the Spanish Geographic Society". Then the infobox would only include vocabulary standardized by that organization. But I acknowledge that large parts of the world are not covered by standardized vocabulary organizations. If that doesn't solve it, we could at least think about language-specific rank overrides.

-Tobias

2015-11-27 16:41 GMT+01:00 Markus Krötzsch <mar...@semantic-mediawiki.org>:

> Hi James,
>
> I would immediately agree to the following measures to alleviate your
> problem:
>
> (1) If some instance-of statements are historic (i.e., no longer valid),
> then one should make the current ones "preferred" and leave the historic
> ones "normal", just like for, e.g., population numbers. This would get rid
> of the rather inappropriate "Free imperial city" label for Frankfurt.
>
> (2) If some classes are redundant, they could be removed (e.g., if we
> already have "Big city" we do not need "city"). However, community might
> decide to prefer the direct use of a main class (such as "Human"), even if
> redundant.
>
> The other issues you mention are more tricky. Especially issues of
> translation/cultural specificity. The most specific classes are not always
> the ones that all languages would want to see, e.g., if the concept of the
> class is not known in that language.
>
> Possible options for solving your problem:
>
> * Make a whitelist of classes you want to show at all in the template, and
> default to "city" if none of them occurs.
> * Make a blacklist of classes you want to hide.
> * Instead of blacklist or whitelist, show only classes that have a
> Wikipedia page in your language; default to "city" if there are none.
> * Try to generalise overly specific classes (change "big city" to "city"
> etc.).
> I don't know if there is a good programmatic approach for this, or
> if you would have to make a substitution list or something, which would not
> be very maintainable.
> * Do not use instance-of information like this in the infobox. It might
> sound radical, but I am not sure if "instance of" is really working very
> well for labelling things in the way you expect. Instance-of can refer to
> many orthogonal properties of an object, in essentially random order, while
> a label should probably focus on certain aspects only.
>
> For obvious reasons, ranks of statements cannot be used to record
> language-specific preferences.
>
> Cheers,
>
> Markus
>
>
> On 27.11.2015 15:58, James Heald wrote:
>
>> Some items have quite a lot of "instance of" statements, connecting them
>> to quite a few different classes.
>>
>> For example, Frankfurt is currently an instance of seven different
>> classes,
>> https://www.wikidata.org/wiki/Q1794
>>
>> and Glasgow is currently an instance of five different classes:
>> https://www.wikidata.org/wiki/Q4093
>>
>> This can produce quite a pile-up of descriptions in the
>> description/subtitle section of an infobox -- for example, as on the
>> Spanish page for Frankfurt at
>> https://es.wikipedia.org/wiki/Fr%C3%A1ncfort_del_Meno
>> in the section between the infobox title and the picture.
>>
>>
>> Question:
>>
>> Is it an appropriate use of ranking, to choose a few of the values to
>> display, and set those values to be "preferred rank" ?
>>
>> It would be useful to have wider input, as to whether it is a good thing
>> as to whether this is done widely.
>>
>> Discussions are open at
>>
>> https://www.wikidata.org/wiki/Wikidata:Project_chat#Preferred_and_normal_rank
>>
>> and
>> https://www.wikidata.org/wiki/Wikidata:Bistro#Rang_pr.C3.A9f.C3.A9r.C3.A9
>>
>> -- but these have so far been inconclusive, and have got slightly taken
>> over by questions such as
>>
>> * how well terms really do map from one language to another --
>> near-equivalences that may be near enough for sitelinks may be jarring
>> or insufficient when presented boldly up-front in an infobox.
>>
>> (For example, the French translation "ville" is rather unspecific, and
>> perhaps inadequate in what it conveys, compared to "city" in English or
>> "ciudad" in Spanish; "town" in English (which might have over 100,000
>> inhabitants) doesn't necessarily match "bourg" in French or "Kleinstadt"
>> in German).
>>
>> * whether different-language wikis may seek different degrees of
>> generalisation or specificity in such sub-title areas, depending on how
>> "close" the subject is to that wiki.
>>
>> (For readers in some languages, some fine distinctions may be highly
>> relevant and familiar, whereas for other language groups that level of
>> detail may be undesirably obscure).
>>
>>
>> There is also the question of the effect of promoting some values to
>> "preferred rank" for the visibility of other values in SPARQL -- in
>> particular when so queries are written assuming they can get away with
>> using just the simple "truthy" wdt:... form of properties.
>>
>> However, making eg the value "city" preferred for Glasgow means that it
>> will no longer be returned in searches for its other values, if these
>> have been written using "wdt:..."
>> -- so it will now be missed in a
>> simple-level query for "council areas", the current top-level
>> administrative subdivisions of Scotland, or for historically-based
>> "registration counties" -- and this problem will become more pronounced
>> if the practice becomes more widespread of making some values
>> "preferred" (and so other values invisible, at least for queries using
>> wdt:...).
>>
>> From a SPARQL point of view, what would actually be very helpful would
>> be to add a (new) fourth rank -- "misleading without qualifier", below
>> "normal" but above "deprecated" -- for statements that *are* true (with
>> the qualifiers), but could be misleading without them
>> * for example, for a town that was the county town of a shire once, but
>> hasn't been for two centuries
>> * or for an administrative area that is partly located in one
>> higher-level division, and partly in another -- this is very valuable
>> information to be able to note, but it's important to be able to exclude
>> it from being all included in a recursive search for the places in one
>> (but not the other) of that higher-level division.
>>
>> The statements shouldn't be marked "deprecated", because they are true
>> (unlike a widely-given but incorrect date of birth, for example). At
>> the moment one can sort of work round the issue, if one can find another
>> statement to make "preferred", so that the qualified statement becomes
>> invisible to a simple search without qualifiers. However, if
>> "preferred" status is going to be used just to select things to show in
>> infoboxes, it becomes very desirable that "wdt:..." searches should
>> retrieve things at normal rank as well -- creating a need for a new rank
>> for statements which are true, but misleading if read without qualifiers.
>>
>>
>> What *is* needed though, is a view on whether trying to tailor what is
>> shown in infoboxes is an appropriate reason to alter statement rankings.
>>
>> It would be good to get a view on this.
>>
>> The Spanish guys who started doing this have temporarily put further
>> rank-changes on hold, for the issue to be discussed; but so far what
>> they have done has only just scratched the surface of what could be done
>> -- there are still a lot more cases of multiple values they would like
>> to tidy.
>>
>> So: is this the kind of thing that "preferred rank" is envisaged for ?
>>
>> Or, should some statements not be marked as less preferred than others,
>> if this is the only reason ?
>>
>>
>> -- James.
>>
>>
>> _______________________________________________
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
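The concern about "wdt:..." queries can be made concrete with a small Python sketch of best-rank selection: preferred statements are returned if any exist, otherwise normal-rank ones, and deprecated ones never. The `truthy_values` helper and the statement records below are hypothetical stand-ins for illustration, not a Wikidata API, and the Glasgow values are only the ones named in the thread.

```python
def truthy_values(statements):
    """Return the values a "best rank" lookup would expose.

    Preferred statements win if any exist; otherwise normal-rank
    statements are returned. Deprecated statements are never returned.
    """
    preferred = [s["value"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s["value"] for s in statements if s["rank"] == "normal"]


# Illustrative instance-of statements for Glasgow, all at normal rank.
glasgow = [
    {"value": "city", "rank": "normal"},
    {"value": "council area", "rank": "normal"},
    {"value": "registration county", "rank": "normal"},
]
before = truthy_values(glasgow)

# Promote "city" to preferred, as an infobox-driven edit might do.
promoted = [
    {**s, "rank": "preferred"} if s["value"] == "city" else s
    for s in glasgow
]
after = truthy_values(promoted)

print(before)
print(after)
```

After the promotion only "city" is visible to a simple lookup, so a search for council areas written this way would no longer find Glasgow.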