Lucas_Werkmeister_WMDE added a comment.

  In T327514#8630437 <https://phabricator.wikimedia.org/T327514#8630437>, @Nikki wrote:
  
  > In T327514#8598366 <https://phabricator.wikimedia.org/T327514#8598366>, @ItamarWMDE wrote:
  >
  >> - In order to protect against malicious queries (these should never be in real sitelink URLs), don’t decode (or, re-encode after decoding) stuff like
  >>   - whitespace characters
  >>   - control characters
  >
  > Some of those characters are required by other scripts and therefore do appear in real sitelink URLs. Zero-width joiner <https://en.wikipedia.org/wiki/Zero-width_joiner> and zero-width non-joiner <https://en.wikipedia.org/wiki/Zero-width_non-joiner> in particular can be relatively common in Arabic and Indic scripts, and `select * { ?sitelink schema:isPartOf <https://fa.wikisource.org/> } limit 1000` includes quite a few zero-width non-joiners.
  
  Hm, true, this doesn’t look nice :/
  F36863674: image.png <https://phabricator.wikimedia.org/F36863674>
  Let me see if there’s a more restrictive Unicode category we can use.
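
  One possible shape for such a filter, as a rough Python sketch and not what the query service actually does: decode percent-escapes, but re-encode anything whose Unicode general category is control (Cc) or a separator (Zs, Zl, Zp), so that format characters (Cf) such as the zero-width joiner and non-joiner stay readable. The category set, the rule of leaving all ASCII escapes encoded, and the names `KEEP_ENCODED_CATEGORIES` and `prettify_sitelink` are assumptions made for illustration only.

```python
import re
import unicodedata
from urllib.parse import quote, unquote_to_bytes

# Assumed policy: characters in these general categories stay percent-encoded.
# Cc = control, Zs/Zl/Zp = space, line and paragraph separators.
# Cf (format), which contains ZWJ and ZWNJ, is deliberately not listed.
KEEP_ENCODED_CATEGORIES = {"Cc", "Zs", "Zl", "Zp"}

# One or more consecutive percent-escapes, e.g. "%E2%80%8C".
_ESCAPE_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def _stays_encoded(ch: str) -> bool:
    # ASCII goes back through quote(), which leaves only unreserved
    # characters (letters, digits, "-._~") decoded; non-ASCII characters
    # are decided by their general category.
    return ord(ch) < 0x80 or unicodedata.category(ch) in KEEP_ENCODED_CATEGORIES


def _decode_run(match: re.Match) -> str:
    raw = match.group(0)
    try:
        text = unquote_to_bytes(raw).decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: leave the escapes exactly as they were.
        return raw
    return "".join(
        quote(ch, safe="") if _stays_encoded(ch) else ch for ch in text
    )


def prettify_sitelink(url: str) -> str:
    """Decode percent-escapes for display, keeping risky characters encoded."""
    return _ESCAPE_RUN.sub(_decode_run, url)
```

  With this sketch, `prettify_sitelink("a%E2%80%8Cb")` yields the three characters `a`, U+200C, `b`, while `prettify_sitelink("a%20b")` is returned unchanged. Whether these four categories are the right cut-off is exactly the open question here.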

TASK DETAIL
  https://phabricator.wikimedia.org/T327514
