Lucas_Werkmeister_WMDE added a comment.

  > Let me see if there’s a more restrictive Unicode category we can use.
  
  Not really – ZWJ/ZWNJ are in “Other, format” (`Cf`) 
<https://www.fileformat.info/info/unicode/category/Cf/list.htm> together with 
the directional control characters (U+202E RIGHT-TO-LEFT OVERRIDE and friends), 
which I don’t think we want to allow in decoded form.
  
  MediaWiki core’s MediaWikiTitleCodec::splitTitleString() 
<https://gerrit.wikimedia.org/g/mediawiki/core/+/3cc288eac4/includes/title/MediaWikiTitleCodec.php#369>
 hard-codes the bidi characters as forbidden: U+200E-F and U+202A-E. I guess we 
could do the same, and re-encode those seven while allowing the rest of the 
`Cf` category? (But still blocking the other “other” categories: `Cc` Other, 
control; `Cs` Other, surrogate; `Co` Other, private use; and `Cn` Other, not 
assigned.)
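  
  A rough sketch of that rule, in Python purely for illustration (the real change would live in PHP). The helper name `must_stay_encoded` and both constants are made up for this example, and the category lookup depends on the Unicode version bundled with the interpreter:

```python
import unicodedata

# The seven bidi characters mentioned above as hard-coded in
# splitTitleString(): U+200E-F and U+202A-E.
FORBIDDEN_BIDI = frozenset(
    [chr(cp) for cp in range(0x200E, 0x2010)]
    + [chr(cp) for cp in range(0x202A, 0x202F)]
)

# The "other" categories that stay blocked; Cf is deliberately absent.
BLOCKED_CATEGORIES = frozenset({"Cc", "Cs", "Co", "Cn"})


def must_stay_encoded(ch: str) -> bool:
    """Hypothetical helper: True if `ch` should be kept in encoded form."""
    return ch in FORBIDDEN_BIDI or unicodedata.category(ch) in BLOCKED_CATEGORIES
```

  With this rule ZWJ (U+200D) and ZWNJ (U+200C) would be allowed in decoded form and U+202E would be re-encoded. The bidi isolates (U+2066 and friends) are also `Cf` and not among the seven, so they would be allowed as well.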
  
  (MediaWiki //allows// the bidi //isolate// characters in titles, and indeed 
U+2066 <https://en.wikipedia.org/wiki/⁦> is a working redirect on enwiki. I’m 
not sure how I feel about that tbh.)

TASK DETAIL
  https://phabricator.wikimedia.org/T327514
