Michael added a comment.
I notice I'm still a bit confused as to where CLDR is getting its languages from. Partly from core, partly from a manually maintained list (localNamesXX.php), but there are also comments like `# Added to Core, not part of CLDR, T287345`. What is the CLDR that is mentioned in the CLDR extension itself?

In T341409#9148879 <https://phabricator.wikimedia.org/T341409#9148879>, @Lucas_Werkmeister_WMDE wrote:

> There is a slight ambiguity in the task description that I didn’t realize before. If we take it literally, and only pass LanguageNameUtils::ALL as the second getLanguageNames() argument while leaving the first argument the same (LanguageNameUtils::AUTONYMS, the default), then we won’t actually see any difference

That would be due to

  if ( $inLanguage !== self::AUTONYMS ) {
      # TODO: also include for self::AUTONYMS, when this code is more efficient
      // @phan-suppress-next-line PhanTypeMismatchArgumentNullable False positive
      $this->hookRunner->onLanguageGetTranslatedLanguageNames( $names, $inLanguage );
  }

in LanguageNameUtils.php. That means that when autonyms are requested, the hook is never run, so the extra languages from CLDR are not loaded.

There seems to be a mistake in the description. The languages in CldrNamesEn.php are the MediaWiki ones (that is what rebuild.php uses); the //additional// languages that we care about would seem to be the ones coming from LocalNamesEn.php <https://github.com/wikimedia/mediawiki-extensions-cldr/blob/master/LocalNames/LocalNamesEn.php> and parallel files, right?

In T341409#9162199 <https://phabricator.wikimedia.org/T341409#9162199>, @Nikki wrote:

> In T341409#9148879 <https://phabricator.wikimedia.org/T341409#9148879>, @Lucas_Werkmeister_WMDE wrote:
>
>> The additional cldr language codes are only added when asking for language names in a specific language, and the returned language codes vary slightly depending on which language you ask for:
>> [...]
>> (`de` and `bar` have additionally `en-uk`, with `bar` presumably inheriting it from `de` via language fallback; `pt`’s extra language code is `az-arab`.) I assume we always want to request the same language here, rather than make this depend on the user / request language; should it be the wiki content language (`en` on Wikidata), a hard-coded one (e.g. `en` or `qqq`), or something else?
>
> Hm, that doesn't sound good. Is that actually a bug in the CLDR extension? I would expect the set of language codes to be the same regardless of the language being used and that not being the case sounds like it would cause problems eventually. Perhaps it should have tests to make sure none of the files have extra codes that don't exist for English, or perhaps it should ignore any codes that aren't defined for all languages? Making the extension translatable would help here too, I imagine.

`en-uk` (together with `en-gb`) was added in Add some German translation (I0ce22dfc) <https://gerrit.wikimedia.org/r/c/mediawiki/extensions/cldr/+/474682>. CLDR seems to be used de facto as a repository of names for language codes that happen to be used by people, and not at all as an authoritative source for language codes. Are we OK with using it anyway?

Also, I note that a lot of the language names that have been added there seem to include a comment like `# used by Wikidata, T123456`. So we may still want a process to add more, given that our current process is how we got to this list.
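The consistency check that Nikki suggests in the quote above (no file should have codes that don't exist for English) could be sketched roughly as follows. This is only an illustration in Python, not code from the CLDR extension: the function name `extra_codes`, the dictionary layout, and the placeholder names are all assumptions, and the sample data contains only the two discrepancies mentioned in this thread (`en-uk` for `de`, `az-arab` for `pt`).

```python
def extra_codes(names_by_language, reference="en"):
    """Return, per language, the codes that the reference language lacks.

    names_by_language maps a language code to its {code: name} mapping.
    This layout is a hypothetical stand-in for the LocalNames files.
    """
    reference_codes = set(names_by_language[reference])
    result = {}
    for language, names in names_by_language.items():
        extras = sorted(set(names) - reference_codes)
        if extras:
            result[language] = extras
    return result


# Hypothetical, heavily trimmed sample data; the names are placeholders.
names_by_language = {
    "en": {"de": "<name>", "pt": "<name>"},
    "de": {"de": "<name>", "pt": "<name>", "en-uk": "<name>"},
    "pt": {"de": "<name>", "pt": "<name>", "az-arab": "<name>"},
}

# Languages whose code set goes beyond the English one.
inconsistent = extra_codes(names_by_language)
```

A test along these lines would fail as long as `inconsistent` is non-empty; the alternative Nikki mentions, ignoring codes that are not defined for all languages, would instead filter those extras out before returning names.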
**Further Async Storywriting notes:**

Needs AC: aside from the one for actually doing the thing, also one or more for updating docs/policy/process, which exist at least in the following places:

- T312845: [Process] Add new language codes to Wikidata <https://phabricator.wikimedia.org/T312845>
- https://phabricator.wikimedia.org/project/profile/4981/
- https://www.mediawiki.org/wiki/Manual:Adding_and_removing_languages#Wikibase

Also, should have an AC to go through the existing language-related tasks, figure out which are still needed, maybe update them, and close the ones no longer needed after this one here is done.

TASK DETAIL
  https://phabricator.wikimedia.org/T341409

To: Michael
Cc: Michael, ItamarWMDE, Bugreporter, thiemowmde, Lucas_Werkmeister_WMDE, jhsoby, Amire80, Lydia_Pintscher, Manuel, mrephabricator, Nikki, Danny_Benjafield_WMDE, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, Mahir256, QZanden, srishakatux, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331