On Sun, Mar 13, 2011 at 10:27:05AM -0500, Craig A. Berry wrote:

> So the problems are all with 0xbf, 0xf7, and 0x88.  These don't look
> to me like they're in the portable character set:
> 
> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap06.html
> 
> so I think what happens to them in the C/POSIX locale is undefined.

"unspecified":

    The tables in Locale Definition describe the characteristics and
    behavior of the POSIX locale for data consisting entirely of characters
    from the portable character set and the control character set. For other
    characters, the behavior is unspecified.

http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html#tag_07_02

No, I didn't know that. Wonderful. So anything outside of ASCII can't be
trusted. (But at least it doesn't summon nasal daemons.)
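
As a rough sketch of what that means for the three bytes above, the
snippet below flags every byte that falls outside 7-bit ASCII. Treating
"7-bit ASCII" as a stand-in for the portable character set plus the
control character set is an assumption that follows the shorthand used
here, not the standard's exact tables, and the helper names are made up
for illustration.

```python
# Bytes reported as problematic earlier in the thread.
PROBLEM_BYTES = (0xBF, 0xF7, 0x88)


def is_ascii_byte(byte):
    """Return True if byte is in the 7-bit ASCII range (hypothetical helper).

    Assumption: 7-bit ASCII approximates the portable and control
    character sets; the standard's tables are the real authority.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value in 0..255")
    return byte < 0x80


def unspecified_offsets(data):
    """Return offsets of bytes whose POSIX-locale behaviour is unspecified
    under the ASCII approximation above (hypothetical helper)."""
    return [i for i, b in enumerate(data) if not is_ascii_byte(b)]


for b in PROBLEM_BYTES:
    print("0x%02x: ascii=%s" % (b, is_ascii_byte(b)))
```

All three bytes land on the "unspecified" side of that line, so a test
that depends on how they are classified in the C/POSIX locale is relying
on platform-specific behaviour.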

> I'm guessing these are continuation bytes of multi-byte sequences?  I
> didn't think the C locale knew anything about such things.

I thought that too. Seems that I'm wrong. It's merely "unspecified".

Nicholas Clark
