On Thu, 16 Nov 2017 11:27:45 -0700, "Theo de Raadt" wrote:

> Yes, I already proposed that someone made a mistake a while ago.

This was added in NetBSD in 1995:

----------------------------
revision 1.17
date: 1995/05/02 19:52:41;  author: jtc;  state: Exp;  lines: +15 -8;
The C Standard says that printf's format string is a multi-byte
character string.  NA1 says that the 99 characters required by the
Standard have representations in the initial state which are one byte
long and do not alter the state.

Thus we can safely break apart the format string with mbtowc() until
we reach a '%' character, and the process format directive characters
one by one.

We really shouldn't be using mbtowc(), rather mbrtowc() (which takes a
mbstate-t argument) but we don't have the NA1 functions implemented
yet.  This is safe, because even when we do we're not likely to
support multi-byte character encodings that use shift states.
----------------------------

The change was never adopted by FreeBSD, and modern NetBSD doesn't
include it either.  There is nothing in C99 that I can find to
indicate that the format string is multi-byte.  Either this part
of NA1 was not adopted as part of C99 or jtc misread the standard.

I've done a brief survey using the test program at the end of
this message.  Here are the results:

OpenBSD:
    mbrtowc fail (expected)
    printf fail (unexpected)
Linux:
    mbrtowc fail (expected)
    printf OK, ret 1 (expected)
macOS:
    mbrtowc fail (expected)
    printf OK, ret 1 (expected)
Solaris 11:
    mbrtowc fail (expected)
    printf OK, ret 1 (expected)

OpenBSD is the outlier here.  If everyone else interprets the
standard differently than we do, I think it is reasonable to say
that our interpretation is incorrect.  Furthermore, the source of
that change (NetBSD) no longer includes it.

 - todd
