[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-09-12 Thread Manuel
Manuel updated the task description.

TASK DETAIL
  https://phabricator.wikimedia.org/T306918

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Manuel
Cc: mrephabricator, Lucas_Werkmeister_WMDE, Lydia_Pintscher, Nikki, Mahir256, 
Manuel, Aklapper, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, 
ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, 
LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
___
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-09-06 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.


  I just realized that we probably don’t want to prohibit duplication of mul 
labels under all circumstances. Consider an item that has the mul label A and 
the pt label B. Now suppose that pt-br users should see the label as A. In that 
case, it should be allowed to set the pt-br label to A, even though that’s the 
same as the mul label – because it’s not redundant: it overrides the pt label.
  
  I think I would implement this as: when editing a non-mul label, look up the 
label the item would fall back to in that language if it had no label in the 
language itself; if that term fallback comes from mul, then only allow the edit 
if the new label is different from the mul label; but if the term fallback 
comes from any other language, then don’t compare anything with the mul label. 
(Maybe we’ll need to optimize this code, e.g. by first checking if a mul label 
exists at all, before computing term fallbacks for all the languages affected 
by the edit.)
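
  The check described above could be sketched roughly as follows. This is only 
an illustration in Python, not Wikibase code: the FALLBACK_CHAINS table, the 
function names, and the plain dict of labels are all hypothetical, and real 
fallback chains are more involved.

```python
MUL = "mul"

# Hypothetical fallback chains, for illustration only. Each list names the
# languages consulted, in order, when a language has no label of its own.
FALLBACK_CHAINS = {
    "pt-br": ["pt", MUL],
    "pt": [MUL],
    "en": [MUL],
}


def fallback_language(labels, language):
    """Return the language the label would fall back to, ignoring any
    label that already exists in `language` itself, or None."""
    for candidate in FALLBACK_CHAINS.get(language, [MUL]):
        if candidate in labels:
            return candidate
    return None


def is_label_edit_allowed(labels, language, new_label):
    """Reject a non-mul label only when it merely repeats the mul fallback."""
    # Cheap early exit: no mul label means there is nothing to duplicate.
    if language == MUL or MUL not in labels:
        return True
    # The fallback comes from another language, so the new label is an
    # override and is never compared with the mul label.
    if fallback_language(labels, language) != MUL:
        return True
    return new_label != labels[MUL]


labels = {"mul": "A", "pt": "B"}
allowed_pt_br = is_label_edit_allowed(labels, "pt-br", "A")  # overrides pt
allowed_en = is_label_edit_allowed(labels, "en", "A")  # repeats mul
```

  With the item from the example (mul label A, pt label B), setting pt-br to A 
is allowed because it overrides pt, while setting en to A is rejected because 
en would fall back to mul anyway.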


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-09-06 Thread Manuel
Manuel updated the task description.


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-08-18 Thread mrephabricator
mrephabricator added a comment.


  Another example to consider: the dinosaur Changdusaurus (en) was first 
described in Chinese sources as 昌都龍. In Vietnamese, this dinosaur is called 
Changtusaurus, having been transliterated from Chinese using Vietnamese Latin 
script. (That other languages have duplicated the English name is likely 
incidental: there is no reason to prefer one over the other and, like many 
dinosaur names, this represents a genus but not one with a Latin taxon name.) 
If a different dinosaur name derived the same way in Vietnamese and English 
happened to match, that would not mean the two languages share a name, since 
the shared letters don't represent the same sounds. If that "duplicate" were 
removed, we could say it does not matter because a query would return the same 
fallback anyway, but the same would be true for dinosaurs that never had a 
Vietnamese name entered to begin with. The information about exactly which 
labels are homographic between which languages would be lost, and it could not 
be recovered. This would make working with data within a given language harder, 
as there would be no way to distinguish mul (fallback added for English and 
Swedish) from mul (differently pronounced English and Hawaiian words that 
happen to be written the same way), further skewing data quality outside of a 
handful of popular languages. At least ensuring that "mul" is understood as 
meaning "multiple languages" and not "Latin script" could prevent some of this 
from happening.
  
  I think it would be fitting to give preference to labels that would not fit 
anywhere else but would be legible in other languages. For example, if the 
Balti name of a town in Gilgit-Baltistan is added to mul in the absence of a 
bft Balti code, it would likely be legible to Urdu readers, Kashmiri readers, 
and so on. Then if readers of those uncoded languages are using Urdu or English 
as a locale, they would still be able to get these names as a fallback.
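
  The earlier point about lost information can be made concrete with a small 
sketch. It is only an illustration: the strip_mul_duplicates helper and the 
name "Examplosaurus" are hypothetical, and labels are modelled as a plain 
Python dict rather than anything Wikibase actually uses.

```python
def strip_mul_duplicates(labels):
    """Drop every non-mul label whose text equals the mul label."""
    mul = labels.get("mul")
    return {
        lang: text
        for lang, text in labels.items()
        if lang == "mul" or text != mul
    }


# Item where an editor confirmed that the Vietnamese name is written the same.
entered = {"mul": "Examplosaurus", "en": "Examplosaurus", "vi": "Examplosaurus"}
# Item where nobody ever entered a Vietnamese name.
never_entered = {"mul": "Examplosaurus", "en": "Examplosaurus"}

after_entered = strip_mul_duplicates(entered)
after_never_entered = strip_mul_duplicates(never_entered)

# Both items now hold the same data, so "vi was checked and matches" can no
# longer be told apart from "vi was never entered".
indistinguishable = after_entered == after_never_entered
```

  Once the duplicates are stripped, both items carry only the mul label, which 
is the unrecoverable loss described above.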


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-08-16 Thread mrephabricator
mrephabricator added a comment.


  It's entirely possible that duplicate labels are not a real problem. There 
has been heated debate about this same thing on OpenStreetMap for years at this 
point, but the consensus has always been to keep the "duplicates", as they 
really contain information that data consumers can't do without. Many of the 
detractors allege that Wikidata would be able to store this information should 
it be removed there, but if that becomes no longer true, it seems like that 
could damage Wikidata's credibility as a useful tool for interlingual labels, 
as so far it has been discussed as a way to store more of that kind of 
information rather than less of it.


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-08-16 Thread mrephabricator
mrephabricator added a comment.


  This should not be done. ک in Urdu is ڪ in Sindhi, but Sindhi also has ک and 
uses it for a different sound. Sindhi is exceptional in this regard, so it 
would not be surprising for the "mul" label to be read as using ک for the sound 
it more commonly represents. This would mean that a label in Sindhi could be 
identical to an Urdu one while representing a word that is meant to be 
pronounced distinctly from the Urdu one. This likely extends to most scripts.
  
  To many users of Latin scripts, "w" and "v" represent the same sound. For a 
Latin-script example, look at this item: 
https://www.wikidata.org/wiki/Q113450202
  I have labeled this in English as "Waddi Punjabi Lughat", as this is how many 
South Asian English speakers and users of Latin script would be inclined to 
spell it. However, "Vaddi Punjabi Lughat" is the label I have used for 
Canadian, American, and British English, because to speakers of these English 
dialects the sound they associate with "V" is a closer match to the correct 
pronunciation. If I were to duplicate the label across dialects, this would 
indicate that the "W" spelling is understood as typical in all of them, meaning 
that it would be reasonable for an American to pronounce "Waddi" like "water" 
even if this is not the "original" pronunciation. That makes duplicating the 
label an indicator of useful information which would not be clear otherwise.
  
  I think it is quite likely that people will use homoglyph letters as 
substitutes to get around this, or even unintentionally. For example, ڻ and ٹ 
are different letters which are associated with different sounds. However, they 
look identical in middle and initial positions. So if we have ڻڻڻ and ٹٹٹ, you 
would have a hard time telling what the first two letters are. There are lots 
of things we can fudge like this in various scripts and have it go unnoticed. 
Hawaii is spelled Hawaiʻi in Hawaiian, the native language, which uses the 
Latin script. If we write this as Hawai'i, using an apostrophe rather than the 
ʻokina character used for Polynesian languages in Latin script, we have now 
"duplicated" the string without using the same characters. Many would do this 
entirely unintentionally, not knowing that the ʻokina is a different character, 
and if someone then wanted to correct the character in the termbox it is in, it 
would give an error.
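
  The Hawaiʻi case can be checked with a few lines of Python. This is a minimal 
sketch using only the standard unicodedata module; the variable names are made 
up for illustration, and it says nothing about how Wikibase itself compares 
labels.

```python
import unicodedata

OKINA = "\u02bb"       # the ʻokina, a modifier letter
APOSTROPHE = "\u0027"  # the plain ASCII apostrophe

hawaii_okina = "Hawai" + OKINA + "i"
hawaii_apostrophe = "Hawai" + APOSTROPHE + "i"


def describe(text):
    """List each character with its code point and Unicode name."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# A plain string comparison treats the two spellings as different labels.
exact_match = hawaii_okina == hawaii_apostrophe

# Unicode compatibility normalization does not unify them either.
nfkc_match = (
    unicodedata.normalize("NFKC", hawaii_okina)
    == unicodedata.normalize("NFKC", hawaii_apostrophe)
)
```

  Both comparisons come out false, so a duplicate check based on string 
equality would accept the apostrophe spelling next to the ʻokina one, and would 
only start objecting once someone corrects the character.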


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-07-28 Thread Manuel
Manuel updated the task description.


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-07-14 Thread Manuel
Manuel updated the task description.


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-06-29 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.


  In T306918#7884900, @Lydia_Pintscher wrote:
  
  > Thing to consider: Should/could this be done with an abuse filter?
  
  I doubt this could be done with an AbuseFilter – I don’t think the filter 
gets access to the existing item data (the mul label) that wasn’t touched in 
the edit.


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-06-29 Thread Lucas_Werkmeister_WMDE
Lucas_Werkmeister_WMDE added a comment.


  I think we can implement this in much the same way as the blocking of 
identical labels and descriptions (T212869).


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-04-27 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.


  Thing to consider: Should/could this be done with an abuse filter?


[Wikidata-bugs] [Maniphest] T306918: Prohibit duplication of mul labels in other languages

2022-04-26 Thread Maintenance_bot
Maintenance_bot added a project: Wikidata.
