MichaelSchoenitzer created this task.
MichaelSchoenitzer added projects: Wikidata-Bridge, Lexicographical data.
Restricted Application added a project: Wikidata.
TASK DESCRIPTION
It is possible for a Lexeme to have a soft hyphen (U+00AD) in its lemma. This can happen by accident when copy-pasting from other sources (for example, some dictionaries).

This causes trouble: this Lexeme (L34775) <https://www.wikidata.org/w/index.php?title=Lexeme:L34775&oldid=944847560> was a duplicate of this one (L27002) <https://www.wikidata.org/wiki/Lexeme:L27002>. It was apparently created because the older one (L27002) couldn't be found. Merging them wasn't possible, since the merge page treated the lemmas as different even though they looked identical to the user.

Proposed solution: forbid or automatically remove soft hyphens and similar non-visible Unicode characters.

TASK DETAIL
https://phabricator.wikimedia.org/T234136

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: MichaelSchoenitzer
Cc: MichaelSchoenitzer, darthmon_wmde, Michael, DannyS712, Nandana, Mringgaard, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Wikidata-bugs, aude, Lydia_Pintscher, Darkdadaah, Mbch331
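To illustrate the proposed solution from the task description, here is a minimal Python sketch of both options (rejecting or stripping). It is not Wikibase code. The function names, the `INVISIBLE` set, and the example lemma are hypothetical, and which characters count as "similar non-visible characters" is an assumption. The set is deliberately small, because some invisible characters (for example U+200C ZERO WIDTH NON-JOINER in Persian) are legitimate parts of words and probably should not be removed blindly.

```python
# Assumption: a small, explicit deny-list of invisible characters.
# The task only names U+00AD; the other entries are illustrative guesses.
INVISIBLE = frozenset({
    "\u00ad",  # SOFT HYPHEN
    "\u200b",  # ZERO WIDTH SPACE
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (byte order mark)
})


def find_invisible(lemma: str) -> list[tuple[int, str]]:
    """Return (index, code point) pairs for each invisible character found.

    A non-empty result could be used to reject ("forbid") the input.
    """
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(lemma)
        if ch in INVISIBLE
    ]


def strip_invisible(lemma: str) -> str:
    """Return the lemma with all deny-listed characters removed."""
    return "".join(ch for ch in lemma if ch not in INVISIBLE)


# Hypothetical lemma containing a soft hyphen, as might come from a
# dictionary copy-paste. It renders like the clean string but differs.
pasted = "W\u00f6rter\u00adbuch"
clean = "W\u00f6rterbuch"

print(pasted == clean)                   # the raw strings differ
print(find_invisible(pasted))            # where the hidden character sits
print(strip_invisible(pasted) == clean)  # equal after stripping
```

Comparing lemmas only after such a normalization step would let two visually identical lemmas be recognized as the same string, which is the situation the merge attempt above ran into.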
_______________________________________________ Wikidata-bugs mailing list Wikidata-bugs@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs