Dan Sugalski writes:
: Have they changed that again? Last I checked, UTF-8 was capped at 4 bytes,
: but that's in the Unicode 3.0 standard.

Doesn't really matter where they install the artificial cap, because for philosophical reasons Perl is gonna support larger values anyway. It's just that 4 bytes of UTF-8 happens to be large enough to represent anything UTF-16 can represent with surrogates. So they refuse to believe in anything longer than 4 bytes, even though the representation can be extended much further. (Perl 5 extends it all the way to 64-bit values, represented in 13 bytes!)

They also arbitrarily define UTF-32 to not use higher values than 0x10ffff, but that doesn't mean we're gonna send in the high-bit Nazis if people want higher values for their own purposes.

But since the names UTF-8 and UTF-32 are becoming associated with those arbitrary restrictions, it's getting even more important to refer to Perl's looser style as utf8 (and, potentially, utf32). I don't know if Perl will have a utf16 that is distinguished from UTF-16.

Larry
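
As a rough illustration of the arithmetic behind the 4-byte point above, here is a minimal Python sketch. It assumes the classic uncapped UTF-8 bit layout of 1 to 6 bytes and is not Perl's own implementation; Perl's longer 13-byte form is not modeled. The function names are hypothetical, chosen only for this example.

```python
# Sketch only: models the classic 1-to-6-byte UTF-8 bit layout, not Perl's
# internal utf8 (whose longer extended forms are not covered here).
# All function names are hypothetical and exist only for this example.

def max_code_point(n_bytes: int) -> int:
    """Largest value that fits in an n-byte sequence of the classic layout."""
    if n_bytes == 1:
        return 0x7F  # a single byte carries 7 payload bits
    if not 2 <= n_bytes <= 6:
        raise ValueError("this sketch only covers 1 to 6 bytes")
    # The lead byte carries (7 - n) payload bits; each continuation byte carries 6.
    payload_bits = (7 - n_bytes) + 6 * (n_bytes - 1)
    return (1 << payload_bits) - 1

def encoded_length(code_point: int) -> int:
    """Number of bytes the classic layout needs for a code point."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    for n_bytes in range(1, 7):
        if code_point <= max_code_point(n_bytes):
            return n_bytes
    raise ValueError("value needs more than 6 bytes")

def combine_surrogates(high: int, low: int) -> int:
    """Decode a UTF-16 surrogate pair into the code point it represents."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

# The largest surrogate pair is the ceiling of what UTF-16 can express.
utf16_ceiling = combine_surrogates(0xDBFF, 0xDFFF)
```

Here utf16_ceiling comes out as 0x10ffff, and encoded_length(utf16_ceiling) is 4, which is why a 4-byte cap loses nothing relative to UTF-16. The first value past the 4-byte range, 0x200000, already needs a 5-byte sequence under the same layout.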