On Thursday, 13 October 2022 at 08:48:49 UTC, rikki cattermole
wrote:
On 13/10/2022 9:42 PM, bauss wrote:
Oh, and to add to this: if you have to do it the hacky way,
prefer converting to uppercase rather than lowercase. A small
group of characters cannot make a round trip when converted to
lowercase, and using uppercase avoids that, so it's a
relatively easy fix. A round trip is basically converting
characters from one culture to another and then back. That is
impossible for some characters when converting to lowercase,
but should always be possible when converting to uppercase.
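To make the uppercase-versus-lowercase point concrete, here is a
minimal sketch in Python (picked purely for illustration, the
thread itself shows no code). The helper names are made up for
this example, and the results depend on the Unicode tables that
ship with your runtime, so treat it as a demonstration of the
idea rather than a complete solution.

```python
def equal_upper(a: str, b: str) -> bool:
    """Hypothetical helper: the 'hacky' comparison via uppercase."""
    return a.upper() == b.upper()


def equal_lower(a: str, b: str) -> bool:
    """Hypothetical helper: the same comparison via lowercase."""
    return a.lower() == b.lower()


# Greek has two lowercase sigmas (medial and final) that share a
# single uppercase form, so lowercasing keeps them distinct while
# uppercasing maps both to the same character.
SIGMA = "\u03c3"        # GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA

# The Latin long s is a second lowercase form of "s".
LONG_S = "\u017f"       # LATIN SMALL LETTER LONG S

if __name__ == "__main__":
    print(equal_upper(SIGMA, FINAL_SIGMA), equal_lower(SIGMA, FINAL_SIGMA))
    print(equal_upper(LONG_S, "s"), equal_lower(LONG_S, "s"))
```

In both pairs the uppercase comparison treats the strings as
equal and the lowercase comparison does not.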
You will want to repeat this process with normalization to NFKC
and to NFD before transforming. Otherwise you may miss some
transformations, because the simplified mappings are 1:1 per
character and not everything is representable as a single
character.
Yeah, text isn't easy :D
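A rough sketch of the normalization advice, again in Python for
illustration only. `comparison_key` is a hypothetical helper,
and applying NFKC, then NFD, then uppercasing is just one
reading of that advice, not a definitive recipe.

```python
import unicodedata


def comparison_key(s: str) -> str:
    """Hypothetical helper: normalize first, then uppercase.

    NFKC folds compatibility characters (ligatures, full-width
    letters, the Kelvin sign) into their ordinary equivalents.
    NFD then splits precomposed letters into base + combining
    marks, so both spellings of an accented letter look the same
    before the case mapping is applied.
    """
    s = unicodedata.normalize("NFKC", s)
    s = unicodedata.normalize("NFD", s)
    return s.upper()


PRECOMPOSED = "\u00e9"    # e with acute as one code point
DECOMPOSED = "e\u0301"    # e followed by COMBINING ACUTE ACCENT
FI_LIGATURE = "\ufb01"    # LATIN SMALL LIGATURE FI
KELVIN = "\u212a"         # KELVIN SIGN

if __name__ == "__main__":
    # Plain uppercasing leaves the two spellings different.
    print(PRECOMPOSED.upper() == DECOMPOSED.upper())
    # Normalizing first makes them compare equal.
    print(comparison_key(PRECOMPOSED) == comparison_key(DECOMPOSED))
    print(comparison_key(FI_LIGATURE) == comparison_key("FI"))
    print(comparison_key(KELVIN) == comparison_key("k"))
```

Without the normalization step the first comparison fails even
though both strings render identically, which is the kind of
missed transformation described above.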