mrephabricator created this task.
mrephabricator added projects: MediaWiki-Internationalization, 
MediaWiki-Parser, Wikidata, Wikidata Lexicographical data, I18n, 
MediaWiki-libs-utfnormal.
Restricted Application added a project: wdwb-tech.

TASK DESCRIPTION
  The Gurmukhi Unicode block has precomposed forms of the nukta / bindi 
letters below, but Unicode NFC normalization treats them as composition 
exclusions, so they are decomposed into the "parent" character followed by the 
combining nukta / bindi character.
  
  ਸ਼ ਖ਼ ਲ਼ ਗ਼ ਫ਼ ਜ਼
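
  To illustrate the decomposition described above, here is a minimal sketch 
using Python's standard unicodedata module. The mapping table and helper names 
are only for illustration, and the letters are written as escape sequences so 
that the example does not depend on how this message itself was normalized.

```python
import unicodedata

# U+0A3C is the combining Gurmukhi nukta (pairin bindi).
NUKTA = "\u0A3C"

# Precomposed letter -> "parent" letter it decomposes to (illustrative table).
PRECOMPOSED_TO_PARENT = {
    "\u0A36": "\u0A38",  # SHA  -> SA
    "\u0A59": "\u0A16",  # KHHA -> KHA
    "\u0A33": "\u0A32",  # LLA  -> LA
    "\u0A5A": "\u0A17",  # GHHA -> GA
    "\u0A5E": "\u0A2B",  # FA   -> PHA
    "\u0A5B": "\u0A1C",  # ZA   -> JA
}


def nfc(text: str) -> str:
    """Return the NFC-normalized form of text."""
    return unicodedata.normalize("NFC", text)


def codepoints(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for precomposed in PRECOMPOSED_TO_PARENT:
    # One code point goes in, two come out, even though this is NFC.
    print(codepoints(precomposed), "->", codepoints(nfc(precomposed)))
```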
  
  This seems to be for backwards compatibility reasons, but 
the current situation on the web is that the precomposed characters are 
preferred by most websites and databases which use Punjabi Gurmukhi. This is 
understandable, as these letters represent one single consonant each, and it is 
quite annoying for users to have to press backspace twice for them while not 
having to for others. Keyboard layouts tend to use the precomposed characters.
  
  The use of precomposed characters in URLs makes many Punjabi websites and 
external identifiers unlinkable from Wikimedia sites. For example, at 
https://www.wikidata.org/wiki/Lexeme:L697770 the Sri Granth ID link is broken 
even though the target page does exist. Entering the escape sequences manually 
in the property does not work either. This is a problem for the lexeme data 
itself as well, for 
reconciling against other databases, for transliteration to Shahmukhi 
(Perso-Arabic script), and for use with newer fonts which tend to operate under 
the assumption that people are using the preferred precomposed characters.
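
  A rough sketch of why such links can break, assuming the external site only 
recognizes the precomposed spelling in its URLs: the two spellings 
percent-encode to different byte sequences. The variable names are 
illustrative and no real Sri Granth URL is used here.

```python
import unicodedata
from urllib.parse import quote

# Precomposed GURMUKHI LETTER SHA, as an external site might use it in a URL.
precomposed = "\u0A36"

# What is left after NFC normalization: SA followed by the combining nukta.
normalized = unicodedata.normalize("NFC", precomposed)

# quote() percent-encodes the UTF-8 bytes of each string.
encoded_precomposed = quote(precomposed)
encoded_normalized = quote(normalized)

print(encoded_precomposed)
print(encoded_normalized)
print("same URL component:", encoded_precomposed == encoded_normalized)
```

  A server that compares URL bytes without normalizing them first would treat 
these as two different identifiers.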
  
  I am not sure where the most effective and least controversial place to 
change this is. Would Unicode ever change this? Could this be changed in the 
NFC normalization library itself, or should it be changed on a case by case 
basis for inputs in Wikimedia projects where an override would be particularly 
warranted? Maybe someone here knows.
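
  As one possible shape for a case-by-case override, here is a hedged sketch 
that applies NFC and then recomposes only the six Gurmukhi pairs. The function 
name is hypothetical, this is not how the MediaWiki normalizer currently 
works, and the result is by definition no longer strict NFC.

```python
import unicodedata

# U+0A3C is the combining Gurmukhi nukta.
NUKTA = "\u0A3C"

# "Parent" letter -> precomposed letter to restore when followed by a nukta.
PARENT_TO_PRECOMPOSED = {
    "\u0A38": "\u0A36",  # SA  + nukta -> SHA
    "\u0A16": "\u0A59",  # KHA + nukta -> KHHA
    "\u0A32": "\u0A33",  # LA  + nukta -> LLA
    "\u0A17": "\u0A5A",  # GA  + nukta -> GHHA
    "\u0A2B": "\u0A5E",  # PHA + nukta -> FA
    "\u0A1C": "\u0A5B",  # JA  + nukta -> ZA
}


def nfc_keep_gurmukhi_precomposed(text: str) -> str:
    """Hypothetical override: NFC, then restore the six precomposed letters."""
    normalized = unicodedata.normalize("NFC", text)
    for parent, precomposed in PARENT_TO_PRECOMPOSED.items():
        normalized = normalized.replace(parent + NUKTA, precomposed)
    return normalized


# Both spellings of SHA end up as the single precomposed code point.
print(nfc_keep_gurmukhi_precomposed("\u0A38\u0A3C") == "\u0A36")
print(nfc_keep_gurmukhi_precomposed("\u0A36") == "\u0A36")
```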

TASK DETAIL
  https://phabricator.wikimedia.org/T317037

To: mrephabricator
Cc: mrephabricator, Astuthiodit_1, Prufkick, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, Af420, 
GoranSMilovanovic, Mahir256, QZanden, LawExplorer, JJMC89, _jensen, rosalieper, 
Scott_WUaS, Srdjan, MuhammadShuaib, LNDDYL, Psychoslave, Nirmos, Cwek, 
Wikidata-bugs, aude, Dinoguy1000, Gryllida, Shizhao, Arrbee, KartikMistry, 
Arlolra, Jackmcbarn, Mbch331, Jay8g