On Wednesday, 27 November 2013 at 14:45:32 UTC, David Nadlinger
wrote:
> If you need to perform these kinds of operations on Unicode
> strings in D, you can call normalize (std.uni) on the string
> first to make sure it is in one of the Normalization Forms. For
> example, just appending .normalize to your strings (which
> defaults to NFC) would make the code produce the "expected"
> results.
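
For anyone following along, here is a minimal sketch of what that looks like. The example strings are my own (not from the original code), and it assumes std.uni's normalize with its default NFC form:

```d
import std.uni : normalize, NFD;

void main()
{
    // "é" as one precomposed code point (U+00E9).
    string precomposed = "caf\u00E9";
    // "é" as "e" followed by a combining acute accent (U+0301).
    string decomposed = "cafe\u0301";

    // The underlying code units differ, so a plain comparison fails.
    assert(precomposed != decomposed);

    // After normalizing both sides (NFC by default), they compare equal.
    assert(precomposed.normalize == decomposed.normalize);

    // Any other form works too, as long as both sides use the same one.
    assert(normalize!NFD(precomposed) == normalize!NFD(decomposed));
}
```

Both sides have to be normalized to the same form; normalizing only one of them is not enough in general.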
Seems like a pretty big "gotcha" from a usability standpoint;
it's not exactly intuitive. I understand WHY this decision was
made, but it feels like a source of code smell and weird string
comparison errors.
> As far as I'm aware, this behavior is the result of a
> deliberate decision, as normalizing strings on the fly isn't
> really cheap.
I don't remember if it was brought up before, but this makes me
wonder if something like an i18nString should exist for cases
where it IS important. Making i18n stuff as simple as it looks
like it "should" be has merit, IMO. (Maybe there's even room for
a std.string.i18n submodule?)
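
A very rough sketch of the kind of thing I mean. I18nString here is purely hypothetical (nothing like it exists in Phobos as far as I know); the only real library call is std.uni's normalize. The idea is to pay the normalization cost once at construction, so later comparisons are plain array compares:

```d
import std.uni : normalize;

/// Hypothetical wrapper type; not part of Phobos.
struct I18nString
{
    // Invariant (by construction): always stored in NFC.
    private string data;

    this(string s)
    {
        // Normalize once, up front, instead of on every comparison.
        data = normalize(s);
    }

    // Equality is a plain comparison of the already-normalized data.
    bool opEquals(const I18nString other) const
    {
        return data == other.data;
    }

    string toString() const
    {
        return data;
    }
}

void main()
{
    auto a = I18nString("caf\u00E9");   // precomposed "é"
    auto b = I18nString("cafe\u0301");  // "e" + combining acute accent
    assert(a == b);
}
```

This only covers equality; ordering, hashing, slicing by grapheme, and so on would all need the same treatment before it was actually useful.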
-Wyatt