On 18.12.19 16:00, Marvin W wrote:
> It's indeed a good question if anything in XMPP allows servers or
> in-between entities to do normalization. I was under the assumption that
> servers do not change the codepoints. In XML [1], "characters with
> multiple possible representations in ISO/IEC 10646 (e.g. characters with
> both precomposed and base+diacritic forms) match only if they have the
> same representation in both strings". Thus, by the XML specification,
> normalization changes the body.

I am not sure whether it is a bit far-fetched to deduce from the XML
"string match" definition that XMPP entities have no freedom at all to
transform the Unicode representation of a string, even within limits.
At least, I am currently missing the link from the XML "string match"
definition to "XMPP entities must use this when
serializing/de-serializing XML".
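
For reference, the "string match" behaviour under discussion can be
sketched in a few lines of Python. This is only an illustration, not
anything taken from an XMPP implementation: the helper name
xml_string_match is hypothetical, and it simply assumes that "match"
means a plain code point comparison without any normalization.

```python
import unicodedata

# Two representations of the same visible character "é".
precomposed = "\u00e9"   # single code point U+00E9
decomposed = "e\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT


def xml_string_match(a: str, b: str) -> bool:
    """Hypothetical helper: compare code point by code point, no normalization."""
    return a == b


# The two forms do not match as raw code point sequences ...
print(xml_string_match(precomposed, decomposed))

# ... but they do match once both are brought into NFC.
print(xml_string_match(
    unicodedata.normalize("NFC", precomposed),
    unicodedata.normalize("NFC", decomposed),
))
```

Whether XMPP entities are actually bound by this comparison when
serializing or de-serializing XML is exactly the open question.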

If we can make that link, then we do not need normalization. In that
case we probably want to state that requirement clearly in rfc6120bis,
because it is not obvious (at least to me).

> Also the main reason why we shouldn't ask for Unicode normalization to
> happen is that different Unicode versions have different normalizations.
> Thus if the sender normalizes with Unicode version X and calculates
> offsets from that, then receiver normalizes with Unicode version Y and
> determines the offsets there, they can end up pointing to different
> characters.
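
The offset concern can be illustrated with a small Python sketch. Note
that this is a simplification: Python's unicodedata module ships a
single Unicode version, so the sketch does not compare two Unicode
versions. It only assumes that the body seen by the receiver was
normalized (NFC) while the sender computed code point offsets on the
unnormalized body. The example string is made up.

```python
import unicodedata

# Body as written by the sender, with a decomposed "é" (e + U+0301).
body_sent = "Cafe\u0301 menu"

# The sender computes the code point offset of the word "menu".
sent_offset = body_sent.index("menu")

# Assumption: the receiver sees an NFC-normalized body,
# in which "e" + U+0301 has collapsed into the single code point U+00E9.
body_received = unicodedata.normalize("NFC", body_sent)
received_offset = body_received.index("menu")

# The same word now starts at a different offset.
print(sent_offset, received_offset)

# Applying the sender's offset to the received body hits another character.
print(body_sent[sent_offset], body_received[sent_offset])
```

The same kind of shift would occur whenever two sides disagree on the
normalized form, whatever the reason for the disagreement.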

We need Unicode agility in XMPP anyway, which I do not believe to be a
big issue, especially since Unicode is likely to introduce fewer
changes with each future version of the standard.

- Florian

_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________