@Markus, James:
In my opinion it is better to make the query ask for the most recent
population number. People just need to start using time-qualifiers for
things like census-report numbers.
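
As a rough illustration of what "ask for the most recent population number" could look like once statements carry time qualifiers, here is a minimal Python sketch. The dict layout, the `point_in_time` key (standing in for a "point in time" qualifier) and the sample figures are assumptions made up for this example; they are not real Wikidata data or an actual API.

```python
from datetime import date


def most_recent(statements):
    """Return the statement with the latest point-in-time qualifier, or None.

    Statements without a time qualifier are skipped, since they cannot be ordered.
    """
    dated = [s for s in statements if s.get("point_in_time") is not None]
    if not dated:
        return None
    return max(dated, key=lambda s: s["point_in_time"])


# Made-up census figures, for illustration only.
population_statements = [
    {"value": 650000, "point_in_time": date(2001, 1, 1)},
    {"value": 700000, "point_in_time": date(2011, 1, 1)},
    {"value": 680000, "point_in_time": None},  # no time qualifier: ignored
]

latest = most_recent(population_statements)
print(latest["value"])
```

With this approach nobody has to touch ranks when a new census report arrives; the newest qualified statement simply wins.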

The other issue is one of standardized vocabulary, and that is always a
sourcing problem in my opinion. A query could say "get the instance-of
statement that has a supporting source from the Spanish Geographic
Society". Then the infobox would only include the standardized vocabulary
of that organization. But I acknowledge that large parts of the world are
not covered by standardized-vocabulary organizations.
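
To make the sourcing idea a bit more concrete, here is a small Python sketch that keeps only instance-of values backed by a given source. The statement layout, the `sources` key and the sample classes and references are hypothetical and only serve the example.

```python
def values_backed_by(statements, source):
    """Return the values of statements that cite `source` among their references."""
    return [s["value"] for s in statements if source in s.get("sources", [])]


# Illustrative data only: classes and sources are invented for the example.
instance_of_statements = [
    {"value": "big city", "sources": ["Spanish Geographic Society"]},
    {"value": "free imperial city", "sources": ["some history book"]},
    {"value": "financial centre", "sources": []},
]

labels = values_backed_by(instance_of_statements, "Spanish Geographic Society")
print(labels)
```

An infobox built this way would need a fallback for items where the chosen organization has sourced nothing, since the function then returns an empty list.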

If that doesn't solve it, we could at least think about language-specific
rank overrides.
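
Purely as a sketch of what such overrides might mean, the Python below applies a per-language override table before picking the best-ranked values (preferred if any exist, otherwise normal, never deprecated). The override table, the function name and the data layout are assumptions for illustration; nothing like this exists in Wikidata today as far as I know.

```python
RANK_ORDER = {"deprecated": 0, "normal": 1, "preferred": 2}


def best_rank_values(statements, overrides=None):
    """Return the values with the highest non-deprecated rank.

    `overrides` maps a value to the rank it should have for one language;
    values not listed keep their stored rank.
    """
    overrides = overrides or {}
    effective = [(overrides.get(s["value"], s["rank"]), s["value"]) for s in statements]
    usable = [(rank, value) for rank, value in effective if rank != "deprecated"]
    if not usable:
        return []
    top = max(RANK_ORDER[rank] for rank, _ in usable)
    return [value for rank, value in usable if RANK_ORDER[rank] == top]


# Two of Glasgow's classes, both stored at normal rank in this example.
glasgow = [
    {"value": "city", "rank": "normal"},
    {"value": "council area", "rank": "normal"},
]

# Hypothetical override: the Spanish infobox prefers "city".
overrides_by_language = {"es": {"city": "preferred"}}

print(best_rank_values(glasgow))
print(best_rank_values(glasgow, overrides_by_language["es"]))
```

The stored ranks stay untouched, so queries that do not pass an override still see both classes.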

-Tobias


2015-11-27 16:41 GMT+01:00 Markus Krötzsch <mar...@semantic-mediawiki.org>:

> Hi James,
>
> I would immediately agree to the following measures to alleviate your
> problem:
>
> (1) If some instance-of statements are historic (i.e., no longer valid),
> then one should make the current ones "preferred" and leave the historic
> ones "normal", just like for, e.g., population numbers. This would get rid
> of the rather inappropriate "Free imperial city" label for Frankfurt.
>
> (2) If some classes are redundant, they could be removed (e.g., if we
> already have "Big city" we do not need "city"). However, the community might
> decide to prefer the direct use of a main class (such as "Human"), even if
> redundant.
>
> The other issues you mention are more tricky. Especially issues of
> translation/cultural specificity. The most specific classes are not always
> the ones that all languages would want to see, e.g., if the concept of the
> class is not known in that language.
>
> Possible options for solving your problem:
>
> * Make a whitelist of classes you want to show at all in the template, and
> default to "city" if none of them occurs.
> * Make a blacklist of classes you want to hide.
> * Instead of blacklist or whitelist, show only classes that have a
> Wikipedia page in your language; default to "city" if there are none.
> * Try to generalise overly specific classes (change "big city" to "city"
> etc.). I don't know if there is a good programmatic approach for this, or
> if you would have to make a substitution list or something, which would not
> be very maintainable.
> * Do not use instance-of information like this in the infobox. It might
> sound radical, but I am not sure if "instance of" is really working very
> well for labelling things in the way you expect. Instance-of can refer to
> many orthogonal properties of an object, in essentially random order, while
> a label should probably focus on certain aspects only.
>
> For obvious reasons, ranks of statements cannot be used to record
> language-specific preferences.
>
> Cheers,
>
> Markus
>
>
> On 27.11.2015 15:58, James Heald wrote:
>
>> Some items have quite a lot of "instance of" statements, connecting them
>> to quite a few different classes.
>>
>> For example, Frankfurt is currently an instance of seven different
>> classes,
>>      https://www.wikidata.org/wiki/Q1794
>>
>> and Glasgow is currently an instance of five different classes:
>>      https://www.wikidata.org/wiki/Q4093
>>
>> This can produce quite a pile-up of descriptions in the
>> description/subtitle section of an infobox -- for example, as on the
>> Spanish page for Frankfurt at
>>      https://es.wikipedia.org/wiki/Fr%C3%A1ncfort_del_Meno
>> in the section between the infobox title and the picture.
>>
>>
>> Question:
>>
>> Is it an appropriate use of ranking, to choose a few of the values to
>> display, and set those values to be "preferred rank" ?
>>
>> It would be useful to have wider input as to whether it is a good thing
>> for this to be done widely.
>>
>> Discussions are open at
>>
>> https://www.wikidata.org/wiki/Wikidata:Project_chat#Preferred_and_normal_rank
>>
>> and
>> https://www.wikidata.org/wiki/Wikidata:Bistro#Rang_pr.C3.A9f.C3.A9r.C3.A9
>>
>> -- but these have so far been inconclusive, and have got slightly taken
>> over by questions such as
>>
>> * how well terms really do map from one language to another --
>> near-equivalences that may be near enough for sitelinks may be jarring
>> or insufficient when presented boldly up-front in an infobox.
>>
>> (For example, the French translation "ville" is rather unspecific, and
>> perhaps inadequate in what it conveys, compared to "city" in English or
>> "ciudad" in Spanish; "town" in English (which might have over 100,000
>> inhabitants) doesn't necessarily match "bourg" in French or "Kleinstadt"
>> in German).
>>
>> * whether different-language wikis may seek different degrees of
>> generalisation or specificity in such sub-title areas, depending on how
>> "close" the subject is to that wiki.
>>
>> (For readers in some languages, some fine distinctions may be highly
>> relevant and familiar, whereas for other language groups that level of
>> detail may be undesirably obscure).
>>
>>
>> There is also the question of the effect of promoting some values to
>> "preferred rank" for the visibility of other values in SPARQL -- in
>> particular when so many queries are written assuming they can get away with
>> using just the simple "truthy" wdt:... form of properties.
>>
>> However, making eg the value "city" preferred for Glasgow means that it
>> will no longer be returned in searches for its other values, if these
>> have been written using "wdt:..." -- so it will now be missed in a
>> simple-level query for "council areas", the current top-level
>> administrative subdivisions of Scotland, or for historically-based
>> "registration counties" -- and this problem will become more pronounced
>> if the practice becomes more widespread of making some values
>> "preferred" (and so other values invisible, at least for queries using
>> wdt:...).
>>
>> From a SPARQL point of view, what would actually be very helpful would
>> be to add a (new) fourth rank -- "misleading without qualifier", below
>> "normal" but above "deprecated" -- for statements that *are* true (with
>> the qualifiers), but could be misleading without them
>> * for example, for a town that was the county town of a shire once, but
>> hasn't been for two centuries
>> * or for an administrative area that is partly located in one
>> higher-level division, and partly in another -- this is very valuable
>> information to be able to note, but it's important to be able to exclude
>> it from being all included in a recursive search for the places in one
>> (but not the other) of that higher-level division.
>>
>> The statements shouldn't be marked "deprecated", because they are true
>> (unlike a widely-given but incorrect date of birth, for example).  At
>> the moment one can sort of work round the issue, if one can find another
>> statement to make "preferred", so that the qualified statement becomes
>> invisible to a simple search without qualifiers.  However, if
>> "preferred" status is going to be used just to select things to show in
>> infoboxes, it becomes very desirable that "wdt:..." searches should
>> retrieve things at normal rank as well -- creating a need for a new rank
>> for statements which are true, but misleading if read without qualifiers.
>>
>>
>> What *is* needed though, is a view on whether trying to tailor what is
>> shown in infoboxes is an appropriate reason to alter statement rankings.
>>
>> It would be good to get a view on this.
>>
>> The Spanish guys who started doing this have temporarily put further
>> rank-changes on hold, for the issue to be discussed; but so far what
>> they have done has only just scratched the surface of what could be done
>> -- there are still a lot more cases of multiple values they would like
>> to tidy.
>>
>> So: is this the kind of thing that "preferred rank" is envisaged for ?
>>
>> Or, should some statements not be marked as less preferred than others,
>> if this is the only reason ?
>>
>>
>>     --  James.
>>
>>
>> _______________________________________________
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
