What do you mean by character?
- Glyph?
- Codepoint?

Do you have to perform some sort of canonicalization before counting?
Combining characters make this particularly difficult, which is why we settled on something easy to describe and understand for JIDs: octets.
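To make the ambiguity concrete, here is a minimal Python sketch (the helper name `counts` is made up for illustration, and NFC is only one possible choice of canonicalization). It shows the same visible "é" yielding different counts depending on whether you count code points, UTF-8 octets, or code points after normalization:

```python
import unicodedata

def counts(s):
    """Return (code points, UTF-8 octets, code points after NFC) for s."""
    code_points = len(s)
    octets = len(s.encode("utf-8"))
    nfc_code_points = len(unicodedata.normalize("NFC", s))
    return (code_points, octets, nfc_code_points)

composed = "\u00e9"      # é as a single precomposed code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

print(counts(composed))    # (1, 2, 1)
print(counts(decomposed))  # (2, 3, 1)
```

Both strings render as one glyph, yet only the octet count is unambiguous without first agreeing on a normalization form.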

On Jun 24, 2007, at 7:39 AM, Matthias Wimmer wrote:

Hi Joe!

Joe Hildebrand wrote:
+1 for limiting it.
However, 1024 octets please, rather than characters, like JIDs.

+1 for limiting it

... but please based on characters, not on octets. (I also voted against limiting JIDs based on octets.)

Reasons:
- Modern database systems, as well as modern programming languages, store characters, not bytes.
- XMPP is built on top of XML, and XML handles characters, not bytes. (E.g. you cannot store a NULL byte in XML, not even as an entity.)
- A limit based on characters is what a user will expect. (E.g. "Why can I enter the letter 'a' 1024 times here, but the character € only 341 times?")
- In GUI forms you can often already limit the number of characters a user can enter, but you mostly cannot limit the number of octets in the UTF-8 representation of the string the user has entered.
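The 1024 versus 341 figures in the third point follow from UTF-8 encoding 'a' as one octet and '€' as three. A small Python sketch of that arithmetic, assuming a 1024-octet limit (the names `LIMIT`, `fits_octet_limit` and `max_repeats` are hypothetical and not taken from any XMPP specification or library):

```python
LIMIT = 1024  # octet limit under discussion (assumed for this sketch)

def fits_octet_limit(s, limit=LIMIT):
    """True if the UTF-8 encoding of s is at most `limit` octets long."""
    return len(s.encode("utf-8")) <= limit

def max_repeats(ch, limit=LIMIT):
    """How many times the single character ch fits within `limit` octets."""
    return limit // len(ch.encode("utf-8"))

print(max_repeats("a"))   # 1024
print(max_repeats("€"))   # 341

print(fits_octet_limit("€" * 341))  # True  (1023 octets)
print(fits_octet_limit("€" * 342))  # False (1026 octets)
```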

... I'd even propose that the JID limitation should be changed to characters in RFC3920bis.


Matthias
