Nikki created this task.
Nikki added projects: Wikidata, Language codes.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  MediaWiki has a mapping for language codes in includes/language/LanguageCode.php 
<https://gerrit.wikimedia.org/g/mediawiki/core/+/d23c1743906a8be52afeeecde3dd69c070cd7376/includes/language/LanguageCode.php#81>.
  Wikibase has its own mapping in repo/config/Wikibase.default.php 
<https://gerrit.wikimedia.org/g/mediawiki/extensions/Wikibase/+/9aabebb98c911c3f3760251043ccce4a7424094f/repo/config/Wikibase.default.php#185>.
  
  Some are the same:
  
  | Code          | MediaWiki and Wikibase |
  | ------------- | ---------------------- |
  | `de-formal`   | `de-x-formal`          |
  | `es-formal`   | `es-x-formal`          |
  | `hu-formal`   | `hu-x-formal`          |
  | `map-bms`     | `jv-x-bms`             |
  | `nl-informal` | `nl-x-informal`        |
  | `simple`      | `en-simple`            |
  
  Some are different:
  
  | Code       | MediaWiki           | Wikibase    |
  | ---------- | ------------------- | ----------- |
  | `cbk-zam`  | `cbk`               | `cbk-x-zam` |
  | `crh`      | `crh` (not changed) | `crh-Latn`  |
  | `nrm`      | `nrf`               | `fr-x-nrm`  |
  | `roa-tara` | `nap-x-tara`        | `it-x-tara` |
  
  The Wikibase mapping is only used for sitelinks in RDF (as far as I can 
tell). Elsewhere in RDF, language codes are not converted at all (the ticket for 
that is T243428 <https://phabricator.wikimedia.org/T243428>). When displaying 
entities, the HTML `lang` attributes use the MediaWiki mapping. This results in 
the same language code being standardised in different ways.
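
  To make the mismatch concrete, here is a minimal Python sketch that lists the codes the two mappings standardise differently. The dictionaries are transcribed from the tables in this task, not read from the PHP sources, and `conflicting_codes` is a hypothetical helper name, so treat this as an illustration rather than how either codebase works.

```python
# Entries that both mappings agree on (from the first table in this task).
SHARED = {
    "de-formal": "de-x-formal",
    "es-formal": "es-x-formal",
    "hu-formal": "hu-x-formal",
    "map-bms": "jv-x-bms",
    "nl-informal": "nl-x-informal",
    "simple": "en-simple",
}

# Entries that differ (from the second table). MediaWiki has no entry for
# "crh", so it is simply absent from that dictionary.
MEDIAWIKI = {**SHARED, "cbk-zam": "cbk", "nrm": "nrf", "roa-tara": "nap-x-tara"}
WIKIBASE = {
    **SHARED,
    "cbk-zam": "cbk-x-zam",
    "crh": "crh-Latn",
    "nrm": "fr-x-nrm",
    "roa-tara": "it-x-tara",
}


def conflicting_codes(first, second):
    """Return {code: (first_result, second_result)} where the mappings disagree.

    A code that is missing from a mapping is treated as unchanged.
    """
    conflicts = {}
    for code in sorted(set(first) | set(second)):
        a = first.get(code, code)
        b = second.get(code, code)
        if a != b:
            conflicts[code] = (a, b)
    return conflicts


if __name__ == "__main__":
    for code, (mw, wb) in conflicting_codes(MEDIAWIKI, WIKIBASE).items():
        print(f"{code}: MediaWiki={mw} Wikibase={wb}")
```

  The sketch only compares the listed table entries; it does not model any other processing either mapping may go through.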
  
  For example, on https://www.wikidata.org/wiki/Q5296, the 
roa-tara.wikipedia.org sitelink has `lang="nap-x-tara"` and 
`hreflang="nap-x-tara"` in the HTML, and on https://roa-tara.wikipedia.org/ the 
`<html>` element has `lang="nap-x-tara"`, whereas the RDF 
<https://www.wikidata.org/wiki/Special:EntityData/Q5296.ttl> has 
`schema:inLanguage "it-x-tara"` and `schema:name "Pagene Prengepále"@it-x-tara`.
  
  These describe the same text/page, and HTML and RDF both use the same 
standard for language codes (BCP 47), so the language code should be the same in 
both places.
  
  The function which uses Wikibase's mapping (in 
repo/includes/Rdf/RdfVocabulary.php 
<https://gerrit.wikimedia.org/g/mediawiki/extensions/Wikibase/+/9aabebb98c911c3f3760251043ccce4a7424094f/repo/includes/Rdf/RdfVocabulary.php#469>)
 already uses `LanguageCode::bcp47` (which uses MediaWiki's mapping), so 
perhaps Wikibase doesn't need its own mapping at all. If it needs to be 
possible to customise the mapping, it would probably make more sense for the 
MediaWiki list to be customisable.
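
  As a rough illustration of that last idea, the following Python sketch shows a single default list with optional overrides. It is not the PHP implementation: `DEFAULT_MAPPING`, `to_bcp47` and the `overrides` parameter are all hypothetical names, and the default entries are just a subset of the MediaWiki column in the tables above.

```python
# Hypothetical stand-in for MediaWiki's list (subset of the tables above).
DEFAULT_MAPPING = {
    "cbk-zam": "cbk",
    "nrm": "nrf",
    "roa-tara": "nap-x-tara",
    "simple": "en-simple",
}


def to_bcp47(code, overrides=None):
    """Map an internal language code to a BCP 47 tag.

    Entries in `overrides` take precedence over the defaults; codes found in
    neither mapping are returned unchanged.
    """
    mapping = {**DEFAULT_MAPPING, **(overrides or {})}
    return mapping.get(code, code)


if __name__ == "__main__":
    # Default behaviour, shared by HTML and RDF output.
    print(to_bcp47("roa-tara"))
    # A site that needs a different tag customises the one shared list.
    print(to_bcp47("roa-tara", overrides={"roa-tara": "it-x-tara"}))
```

  With one list and one override point, the HTML `lang` attributes and the RDF output would be derived from the same source and could not drift apart.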

TASK DETAIL
  https://phabricator.wikimedia.org/T360244

To: Nikki
Cc: Aklapper, Nikki, Danny_Benjafield_WMDE, mrephabricator, Astuthiodit_1, 
MaryMunyoki, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, 
Nandana, Lahi, Gq86, GoranSMilovanovic, Mahir256, QZanden, srishakatux, 
KimKelting, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331