Ross Ridge writes:
> The entire parsing of the format string is affected by the multi-byte
> character encoding.  I don't know how GCC would be able to tell that a byte
> with the same value as '%' in the middle of a string would actually be
> interpreted as a '%' character rather than a part of an extended multibyte
> character.  This can easily happen with the ISO 2022-JP encoding.

Michael Meissner writes:
> Yes, and the ISO standard for C says that the compiler must be told what
> locale to use when parsing string constants anyway, since the compiler
> must behave as if it did a mbtowc on the source file.

The compiler needs to know the source character set both to parse the
string literal and to translate it into the execution character set.
It doesn't need to know, nor can it generally know, the locale-dependent
character set that the standard library will use when parsing printf
format strings.
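
To make the ISO 2022-JP case concrete, here is a minimal sketch in C.
The katakana letter "a" is JIS X 0208 code 0x2522, so in ISO 2022-JP its
first byte has the same value as '%'.  The function names and the sample
buffer below are hypothetical, made up for illustration, and the second
scanner is a simplification that only knows the ESC $ B, ESC $ @,
ESC ( B and ESC ( J escape sequences.  It is not how GCC or any C library
actually parses format strings.

```c
#include <stddef.h>

/* Katakana "a" (JIS X 0208 code 0x2522) encoded in ISO 2022-JP:
   ESC $ B switches to two-byte mode, 0x25 0x22 is the character,
   ESC ( B switches back to ASCII.  Hypothetical sample data. */
static const char kana_a[] = "\x1b$B\x25\x22\x1b(B";

/* Count every byte equal to '%', ignoring the encoding entirely. */
static int count_percent_naive(const char *s, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (s[i] == '%')
            count++;
    return count;
}

/* Count '%' only while in a single-byte mode.  Simplified: handles
   just four escape sequences and assumes well-formed input. */
static int count_percent_shift_aware(const char *s, size_t n)
{
    int count = 0;
    int two_byte = 0;   /* nonzero while in JIS X 0208 two-byte mode */
    size_t i = 0;

    while (i < n) {
        if (s[i] == 0x1b && i + 2 < n) {
            if (s[i + 1] == '$' && (s[i + 2] == 'B' || s[i + 2] == '@'))
                two_byte = 1;
            else if (s[i + 1] == '(' && (s[i + 2] == 'B' || s[i + 2] == 'J'))
                two_byte = 0;
            i += 3;     /* skip the three-byte escape sequence */
        } else if (two_byte) {
            i += 2;     /* one two-byte character, never a '%' */
        } else {
            if (s[i] == '%')
                count++;
            i++;
        }
    }
    return count;
}
```

Called on kana_a with length sizeof kana_a - 1, the naive scan reports
one '%' and the shift-aware scan reports none.  Which answer matches
what printf does at run time depends on the locale in effect then, and
that is exactly what the compiler can't see.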

                                Ross Ridge
