Markus Krötzsch, 04/08/2013 12:32:
* Wikidata uses "be-x-old" as a code, but MediaWiki messages for this
language seem to use "be-tarask" as a language code. So there must be a
mapping somewhere. Where?

Where I linked it.

* MediaWiki's http://www.mediawiki.org/wiki/Manual:$wgDummyLanguageCodes
provides some mappings. For example, it maps "zh-yue" to "yue". Yet
Wikidata uses both of these codes. What does this mean?
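(For illustration only, and not code from wda: a mapping of this kind
could be a small per-code override table consulted just before the
language tag of an RDF literal is written out. The names below are made
up; the two entries are the cases mentioned in this thread.)

# Illustration only: override table from Wikidata-internal language codes
# to the BCP 47 tags one would want on exported RDF literals.
WIKIDATA_TO_BCP47 = {
    'be-x-old': 'be-tarask',  # MediaWiki messages use be-tarask
    'zh-yue': 'yue',          # $wgDummyLanguageCodes maps zh-yue to yue
}

def rdf_language_tag(wikidata_code):
    """Return the language tag to use on an RDF string literal."""
    return WIKIDATA_TO_BCP47.get(wikidata_code, wikidata_code)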

Answers to Nemo's points inline:

On 04/08/13 06:15, Federico Leva (Nemo) wrote:
Markus Krötzsch, 03/08/2013 15:48:
(3) Limited language support. The script uses Wikidata's internal
language codes for string literals in RDF. In some cases, this might not
be correct. It would be great if somebody could create a mapping from
Wikidata language codes to BCP47 language codes (let me know if you
think you can do this, and I'll tell you where to put it)

These are only a handful, aren't they?

There are about 369 language codes right now. You can see the complete
list in langCodes at the bottom of the file

https://github.com/mkroetzsch/wda/blob/master/includes/epTurtleFileWriter.py


Most might be correct already, but it is hard to say.

Only a handful are incorrect, unless Wikidata has specific problems (no idea how you reach 369).
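(A rough sketch of how one could count the codes that need attention,
not an authoritative validator: flag everything in langCodes that has a
private-use "-x-" section or that MediaWiki itself remaps via
$wgDummyLanguageCodes. The DUMMY_CODES set below is a hand-copied,
illustrative subset, not the full list.)

# Sketch only: cheap heuristics to flag entries of langCodes that
# probably need an explicit mapping before being used as BCP 47 tags.
DUMMY_CODES = {'be-x-old', 'zh-yue'}  # illustrative subset only

def needs_review(code):
    # private-use "-x-" sections are well-formed but non-standard;
    # codes remapped by MediaWiki should be exported under the new name
    return '-x-' in code or code in DUMMY_CODES

# e.g. sum(1 for c in langCodes if needs_review(c))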

Also, is it okay
to create new (sub)language codes for our own purposes? Something like
simple English will hardly have an official code, but it would be bad to
export it as "en".
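(One conceivable way to handle such cases, sketched here as an
assumption rather than a decision: BCP 47 reserves the "x" singleton for
private-use subtags, so Simple English could be exported under a
private-use tag such as "en-x-simple" instead of plain "en".)

# Sketch only: fall back to a private-use BCP 47 tag for Wikidata codes
# that have no registered equivalent. "en-x-simple" is a hypothetical
# choice for illustration, not an agreed-upon tag.
PRIVATE_USE_TAGS = {
    'simple': 'en-x-simple',
}

def private_use_tag(code):
    return PRIVATE_USE_TAGS.get(code, code)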


(4) Limited site language support. To specify the language of linked
wiki sites, the script extracts a language code from the URL of the
site. Again, this might not be correct in all cases, and it would be
great if somebody had a proper mapping from Wikipedias/Wikivoyages to
language codes.

Apart from the above, doesn't wgLanguageCode in
https://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php
have what you need?

Interesting. However, the list there does not contain all 300 sites that
we currently find in Wikidata dumps (and it lists some that we do not
find there, such as dkwiki, which seems to be outdated). The full list of
sites we support is also found in the file I mentioned above, just after
the language list (variable siteLanguageCodes).

Of course not all wikis are there; that configuration is needed only
when the subdomain is "wrong". It's still not clear to me which codes
you consider wrong.
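(To illustrate that point as a sketch, not as the actual wda code: take
the subdomain as the default language code and consult an override table
only for the exceptional wikis, which is essentially what the
wgLanguageCode entries in InitialiseSettings.php record. The override
entry below mirrors the be-x-old / be-tarask case discussed above.)

import re

# Sketch only: derive a site's language from its URL, with an override
# table for wikis whose subdomain is not the desired language code.
SITE_LANGUAGE_OVERRIDES = {
    'be-x-old.wikipedia.org': 'be-tarask',
}

def site_language(url):
    host = re.sub(r'^https?://', '', url).split('/')[0]
    if host in SITE_LANGUAGE_OVERRIDES:
        return SITE_LANGUAGE_OVERRIDES[host]
    return host.split('.')[0]  # default: the subdomain is the language code

# e.g. site_language('http://en.wikivoyage.org/') == 'en'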

Nemo

_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l
