On Thu, 3 Dec 2009, Eric Blake wrote:

Thomas Dickey <dickey <at> his.com> writes:

This means that characters 0..127 have to be treated as ASCII, but

No, it means that portable characters and control characters must be < 128.
ASCII meets this characteristic, but so does EBCDIC, as well as UTF-8.  The C
locale also implies that you can manipulate bytes >= 128 in the naive manner,
so long as you don't care about characters embedded in those bytes.  And what
do you know - ASCII, EBCDIC, and UTF-8 all meet this property, too.

beyond that an implementation can do what it wants. And on Cygwin 1.7,
plain "C" actually does imply UTF-8, which happily is
backward-compatible with ASCII.
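
To make the "backward-compatible with ASCII" point concrete, here is a minimal Python sketch, not anything taken from Cygwin or POSIX. It checks two properties of UTF-8: characters below 128 encode to the same single byte, and every byte of a non-ASCII character is >= 128. The helper names is_ascii_transparent and split_on_slash, and the sample path, are made up for illustration.

```python
def is_ascii_transparent(text):
    """Check two UTF-8 properties for every character in text."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 128:
            # ASCII character: a single byte equal to the code point.
            if encoded != bytes([ord(ch)]):
                return False
        else:
            # Non-ASCII character: every byte has the high bit set.
            if any(b < 0x80 for b in encoded):
                return False
    return True


def split_on_slash(data):
    """Naive byte-wise split; it never interprets bytes >= 128."""
    return data.split(b"/")


# Hypothetical sample path mixing ASCII and non-ASCII characters.
sample = "dir/naïve/文件.txt"
parts = split_on_slash(sample.encode("utf-8"))
decoded = [p.decode("utf-8") for p in parts]
```

Because no byte of a multibyte sequence can be mistaken for '/', the byte-wise split leaves each component as valid UTF-8, which is the sense in which bytes >= 128 can be handled naively.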

That's an interpretation that so far hasn't been blessed by the standards
people.  Any discussion of this topic should mention that, as a caveat.

Actually, the standards people HAVE spoken - and they agreed with our
interpretation.  POSIX was INTENTIONALLY written so that a UTF-8
encoding is valid for the C locale, for the same reason that it permits
an EBCDIC encoding in the C locale.  These emails from the
Austin Group (the folks that write POSIX) are telling:


This is basically your email on the matter.


But they also admitted that there is still more work needed in POSIX to make
this intent clearly codified (for example, that control characters must be
single bytes < 128).

But they have not actually agreed with you yet.

Thomas E. Dickey
