Wait, I thought the recommended approach is to normalize first and do the string
processing afterwards? Normalizing first will eliminate inconsistencies of this
sort, and allow string-processing code to use a uniform approach to handling the
string. I don't think it's a good idea to manually deal with composed/decomposed
issues within every individual string function.


1. Problem: Normalized form is not preserved by most string operations. E.g. concatenating two normalized strings does not guarantee that the result is still in normalized form (demonstrated in the sketch after this list).

2. Problem: Some Unicode algorithms, e.g. string comparison, require a normalization step. It doesn't matter which form you use, but you have to pick one (also shown below).
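
To make both points concrete, here is a small sketch in Python (chosen only because its standard unicodedata module makes the behaviour easy to show; the same holds for any conformant Unicode implementation, including whatever D/Phobos ends up doing). Requires Python 3.8+ for unicodedata.is_normalized.

import unicodedata

# Point 1: concatenation does not preserve normalized form.
a = "e"        # U+0065, in NFC on its own
b = "\u0301"   # U+0301 COMBINING ACUTE ACCENT, also in NFC on its own
assert unicodedata.is_normalized("NFC", a)
assert unicodedata.is_normalized("NFC", b)
assert not unicodedata.is_normalized("NFC", a + b)       # "e" + accent is not NFC
assert unicodedata.normalize("NFC", a + b) == "\u00e9"   # NFC composes it into é

# Point 2: comparison needs a normalization step, but either form works.
s1 = "\u00e9"      # precomposed é
s2 = "e\u0301"     # decomposed e + combining accent
assert s1 != s2                                                # raw code-point comparison fails
assert unicodedata.normalize("NFC", s1) == unicodedata.normalize("NFC", s2)
assert unicodedata.normalize("NFD", s1) == unicodedata.normalize("NFD", s2)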

Now we could say that all strings passed to Phobos have to be normalized to (say) NFC, and that Phobos functions can thus skip the normalization step.
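
Under such a convention the normalization cost moves to the API boundary, and the precondition can be checked (e.g. in debug builds only). A rough sketch of that contract, again in Python, with a hypothetical find_substring routine standing in for a library function (not an actual Phobos API):

import unicodedata

def find_substring(haystack: str, needle: str) -> int:
    """Hypothetical routine that assumes NFC inputs instead of normalizing them itself."""
    # Precondition check (debug-only in spirit); callers are responsible for NFC.
    assert unicodedata.is_normalized("NFC", haystack), "haystack must be NFC"
    assert unicodedata.is_normalized("NFC", needle), "needle must be NFC"
    return haystack.find(needle)

# Callers normalize once at the boundary, then pass strings around freely.
user_input = unicodedata.normalize("NFC", "re\u0301sume\u0301")
print(find_substring(user_input, unicodedata.normalize("NFC", "e\u0301")))  # 1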
