On 18.12.19 16:00, Marvin W wrote:
> It's indeed a good question if anything in XMPP allows servers or
> in-between entities to do normalization. I was under the assumption that
> servers do not change the codepoints. In XML [1] Characters with
> multiple possible representations in ISO/IEC 10646 (e.g. characters with
> both precomposed and base+diacritic forms) match only if they have the
> same representation in both strings. Thus by XML specification,
> normalization is changing the body.
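
To make the quoted point concrete, here is a minimal Python sketch. The sample strings and the helper name `codepoint_offset` are made up for this illustration, and it only contrasts a precomposed with a decomposed representation; it does not reproduce a difference between two Unicode versions. It shows that the two representations do not compare equal code point by code point, and that an offset computed on one of them points elsewhere in the other.

```python
import unicodedata

# The same visible text in two representations (hypothetical sample data):
# "é" as one precomposed code point vs. "e" followed by a combining acute.
precomposed = "caf\u00e9 ok"
decomposed = "cafe\u0301 ok"


def codepoint_offset(text: str, needle: str) -> int:
    """Return the offset of `needle` in `text`, counted in code points."""
    return text.index(needle)


# Code point comparison: the strings differ although they render the same.
print(precomposed == decomposed)

# After normalizing to NFC, the decomposed form equals the precomposed one.
print(unicodedata.normalize("NFC", decomposed) == precomposed)

# An offset computed before normalization is off by one afterwards.
print(codepoint_offset(precomposed, "ok"), codepoint_offset(decomposed, "ok"))
```

If an intermediate entity converted one form into the other, an offset calculated by the sender would no longer point at the same character on the receiving side.
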
I wonder whether it is not a little far-fetched to deduce from the XML
"string match" definition that XMPP entities have no freedom at all to
transform the Unicode representation of a string within certain limits.
At least I am currently missing the link from the XML "string match"
definition to "XMPP entities must use this when
serializing/de-serializing XML". If we can make that link, then we do
not need normalization. And we probably want to state that requirement
clearly in rfc6120bis, because it is not obvious (at least to me).

> Also the main reason why we shouldn't ask for Unicode normalization to
> happen is that different Unicode version have different normalizations.
> Thus if the sender normalizes with Unicode version X and calculates
> offsets from that, then receiver normalizes with Unicode version Y and
> determines the offsets there, they can end up in pointing to different
> characters.

We need Unicode agility in XMPP anyway, and I do not believe that to be
a big issue, especially since Unicode is likely to introduce fewer
changes with each future version of the standard.

- Florian
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________