LucasWerkmeister added a comment.
In T236593#8093121 <https://phabricator.wikimedia.org/T236593#8093121>, @C933103 wrote:

> As an English example, some religious people might refuse to write the name "God" out directly as it is as this would constitute idolatry. For this we can tag it as en-x-Qnnnn for which Qnnnn refer to religious group of people, but there are more than one alternative way to write "God". They can either write "G-d", "G*d", "G_d", "G-o-d", and so on. It would make no contextual differences in whether a hyphen or a underscore is being used, and the change in which exact symbol being used in place of original alphabet wouldn't affect pronunciation or religious connection. Hence all of these alternatives should be tagged en-x-Qnnnn, and with the patch it would be possible to have "en-x-Qnnnn-1" being "G-d" while "en-x-Qnnnn-2" being "G*d". I can't see how more specific labels can be useful in differentiating "G-d" and "G*d"

I don’t follow this example. If you think all of these potential forms are significant, and all of them should be tracked in Wikidata, then why do you want to combine them all under a single item ID where nobody can tell them apart? To me it makes more sense (assuming this data is notable at all) to have separate items like “bowdlerized using hyphens”, “bowdlerized using asterisks”, etc., which can be subclasses of a more general “avoiding idolatry” item, have other statements indicating which character is being used, and so on. (“Bowdlerized” definitely isn’t the right word here, but I don’t know what the right word is, sorry.)

In T236593#8097326 <https://phabricator.wikimedia.org/T236593#8097326>, @AGutman-WMF wrote:

> @LucasWerkmeister I agree with you that if two variants have two different pronunciation, they should probably be split into two different lexemes (in general, I think we should avoid having multiple forms with the same grammatical features within one lexeme).
>
> There is some leeway, however, in this rule, since different dialects may have slightly different pronunciations which we still want to group into a single lexeme/form. For instance American English "color" and British English "colour" are in fact pronounced slightly differently, but it would be over-kill to split them, since the difference in pronunciation is systematic between the dialects.

That’s fair, and I actually almost wrote “if //the same// speaker would pronounce them…” in my comment :) I’m not sure how exactly to phrase the rule, but mainly I’m glad to have found some rule at all (which I’m not sure I really understood, at least consciously, back in 2019 when I was apparently sitting next to @jhsoby).

TASK DETAIL
  https://phabricator.wikimedia.org/T236593