On Thu, Oct 11, 2007 at 07:57:57PM -0400, [EMAIL PROTECTED] wrote:
> Heikki Linnakangas writes:
> >The only features in the printf-family of functions that depend on the
> >locale are the conversion with thousands grouping ("%'d"), and the glibc
> >extension of using the locale's alternative output digits ("%Id").
> 
> The entire parsing of the format string is affected by the multi-byte
> character encoding.  I don't know how GCC would be able to tell that a byte
> with the same value as '%' in the middle of a string would actually be
> interpreted as a '%' character rather than as part of an extended multibyte
> character.  This can easily happen with the ISO 2022-JP encoding.
> 
>                                       Ross Ridge

Yes, and the ISO standard for C says that the compiler must be told what locale
to use when parsing string constants anyway, since the compiler must behave as
if it had called mbtowc on the source file.  For example, when I was on the ANSI
X3J11 standards committee that eventually produced the C90 standard, one of
the considerations was that it might be possible to have a multibyte encoding
that used " or ' as the second byte.  ISO 2022-JP was certainly one of the
encodings that were talked about in the meetings.
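
To make the concern concrete, here is a minimal sketch in C of why a byte-wise
scan for '%' can disagree with an encoding-aware scan.  The function names are
hypothetical, and the tiny state tracker is an assumption made for illustration:
it recognizes only the four escape sequences ESC ( B, ESC ( J, ESC $ @ and
ESC $ B, and it is not how GCC or any C library actually parses format strings.

```c
#include <assert.h>
#include <stddef.h>

/* Naive scan: count every byte equal to '%' (0x25), ignoring the encoding. */
size_t count_percent_bytes(const unsigned char *s, size_t n)
{
    size_t count = 0;

    assert(s != NULL);
    for (size_t i = 0; i < n; i++)
        if (s[i] == '%')
            count++;
    return count;
}

/*
 * Encoding-aware scan (illustrative only): track the ISO 2022-JP shift
 * state and count '%' only while a single-byte character set is active.
 */
size_t count_percent_chars_iso2022jp(const unsigned char *s, size_t n)
{
    int two_byte = 0;   /* nonzero while a two-byte JIS X 0208 set is active */
    size_t count = 0;
    size_t i = 0;

    assert(s != NULL);
    while (i < n) {
        if (s[i] == 0x1B && i + 2 < n) {
            /* ESC $ @ or ESC $ B: switch to a two-byte character set. */
            if (s[i + 1] == '$' && (s[i + 2] == '@' || s[i + 2] == 'B')) {
                two_byte = 1;
                i += 3;
                continue;
            }
            /* ESC ( B or ESC ( J: switch back to a single-byte set. */
            if (s[i + 1] == '(' && (s[i + 2] == 'B' || s[i + 2] == 'J')) {
                two_byte = 0;
                i += 3;
                continue;
            }
        }
        if (two_byte) {
            i += 2;     /* skip one two-byte character as a unit */
        } else {
            if (s[i] == '%')
                count++;
            i++;
        }
    }
    return count;
}
```

As a usage example, take the ten bytes 1B 24 42 25 22 1B 28 42 25 64, which
encode the katakana character at JIS X 0208 code 0x2522 followed by the ASCII
text %d.  The naive scan reports two '%' bytes, while the state-tracking scan
reports one.  The second byte of that two-byte character is 0x22, the same
value as ", which is the string-constant case mentioned above.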

-- 
Michael Meissner, AMD
90 Central Street, MS 83-29, Boxborough, MA, 01719, USA
[EMAIL PROTECTED]

