C933103 added a comment.
In T236593#5610378 <https://phabricator.wikimedia.org/T236593#5610378>, @daniel wrote:

> I recall that we had long discussions about this when initially deciding on the data model. In technical terms, the question was whether we would allow only a single literal value for a spelling variant, or a list or set of words. Allowing a list or set would enable the kind of flexibility @jhsoby is asking for. But the downside is that it introduces ambiguity when listing forms (you would always have to list all of them, in undefined order), and when generating text (which one should you use?).
>
> If I recall correctly, we decided that we want to give the consumer of the data maximum control over which variant they prefer, by forcing the producer to provide different variant codes for all different spellings. We had discussions about how to encode this in the variant (language) codes, and how to represent it in the UI, but decided to leave that for later.
>
> So, the solution that we envisioned when originally discussing this about four years ago was: you make up a code for each of the spellings, in a way that allows the consumer to choose which variant they prefer. Whether that is done by encoding a region or a rhyme or a tradition or school or whatever will depend on the language. If it's a stylistic choice, name the style.
>
> The same approach can be used for historical spellings. Codes could look something like de-x-hist-nd-15jh or something (this code is totally made up and probably linguistically nonsense).

The underlying assumption behind this decision is that different spellings must each be associated with a particular variant, or that some spellings are preferred over others, or that a given spelling is more commonly used for some spoken variant, sociolect, etc. than the other spellings are.
None of these assumptions is correct when it comes to non-Chinese languages that use Chinese characters, or even to some Chinese languages that need to be written in Chinese characters. The example of Vietnamese chữ Nôm has already been presented above. Other examples include Japanese ateji, where kanji are used for native Japanese words (except in cases where there is a fully established transliteration), and its historical Korean equivalent, as well as languages like Cantonese, where non-Mandarin words need to be expressed in Chinese characters.
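
To make the disagreement concrete, here is a minimal sketch of the "one literal value per variant code" model described in the quoted comment. It is not the actual Wikibase implementation. The function names `add_spelling` and `pick_spelling` are hypothetical, and the sample words (English "colour"/"color", Japanese 寿司/鮨) are illustrative choices that do not come from the task discussion.

```python
from typing import Dict, Optional, Sequence

# Hypothetical model: each variant code maps to exactly one spelling.
Representations = Dict[str, str]


def add_spelling(reps: Representations, code: str, spelling: str) -> None:
    """Store a spelling; reject a second, different spelling under the same code."""
    if code in reps and reps[code] != spelling:
        raise ValueError(
            f"code {code!r} already holds {reps[code]!r}; "
            "a different spelling needs its own variant code"
        )
    reps[code] = spelling


def pick_spelling(reps: Representations, preferred: Sequence[str]) -> Optional[str]:
    """Return the spelling for the first preferred code that is present, else None."""
    for code in preferred:
        if code in reps:
            return reps[code]
    return None


# Case the model handles well: each spelling belongs to a nameable variant.
colour: Representations = {}
add_spelling(colour, "en-GB", "colour")
add_spelling(colour, "en-US", "color")
us_choice = pick_spelling(colour, ["en-US", "en-GB"])

# Case raised in this comment: two kanji spellings of the same Japanese word,
# with no region, school or style that distinguishes them.
sushi: Representations = {}
add_spelling(sushi, "ja", "寿司")
try:
    add_spelling(sushi, "ja", "鮨")
    second_spelling_accepted = True
except ValueError:
    second_spelling_accepted = False
```

In this sketch the consumer gets an unambiguous answer from `pick_spelling`, which is the benefit the quoted comment describes. The cost shows up in the second case: the producer has to invent a distinguishing code for a spelling that is not actually tied to any variant.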