On 2/19/12 at 21:45, al...@wirfs-brock.com (Allen Wirfs-Brock) wrote:

I really don't think any Unicode semantics should be built into the basic string representation. We need to decide on a max element size and Unicode motivates 21 bits, but it could be 32 bits. Personally, I've lived through enough address space exhaustion episodes in my career to be skeptical of "small" values like 2^21 being good enough for the long term.

Can we future-proof any limit an implementation may choose by saying that all characters whose code point is too large for a particular implementation must be replaced by an "invalid character" code point (which fits into the implementation's representation size) on input? An implementation that chooses 21 bits as the size will become obsolete when Unicode characters that need 22 bits are defined. However, it will still work with characters that fit in 21 bits, and will do something rational with ones that do not. Users who need characters beyond the 21-bit range will be encouraged to upgrade.
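
A minimal sketch of the idea, assuming a 21-bit implementation limit and using U+FFFD (REPLACEMENT CHARACTER) as the stand-in "invalid character" code point; both values are just illustrative choices here, not proposals:

  // Illustrative only: the 21-bit limit and U+FFFD as the "invalid
  // character" are assumptions of this example, not part of any spec.
  const MAX_CODE_POINT = 0x1FFFFF;   // largest value representable in 21 bits
  const INVALID_CHARACTER = 0xFFFD;  // Unicode REPLACEMENT CHARACTER

  // On input, replace any code point too large for the implementation's
  // element size with the invalid-character marker.
  function sanitizeCodePoints(input: number[]): number[] {
    return input.map(cp => (cp > MAX_CODE_POINT ? INVALID_CHARACTER : cp));
  }

  // Example: a hypothetical 22-bit code point gets mapped to the marker.
  sanitizeCodePoints([0x41, 0x10FFFF, 0x200000]);  // => [0x41, 0x10FFFF, 0xFFFD]

Strings built this way still round-trip everything that fits in the chosen element size, and anything larger degrades to a recognizable placeholder rather than being silently mangled.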

Cheers - Bill

-----------------------------------------------------------------------
Bill Frantz        | If the site is supported by | Periwinkle
(408)356-8506      | ads, you are the product.   | 16345 Englewood Ave
www.pwpconsult.com |                             | Los Gatos, CA 95032

