On Oct 3, 2012, at 09:53, Richard Frith-Macdonald <rich...@tiptree.demon.co.uk> wrote:

> So I'm not sure what to do ... the C standards have changed from working with
> characters to working with bytes (which is good),

Well, no. In the C standard, "character" generally means the same thing as "byte" (i.e., a value that can fit in a char). In point of fact, the standard provides two conflicting normative definitions of "character" (one marked <abstract>, the other <C>), but in the specification of [f]printf() it seems that character = byte is what is meant. Both the C99 and C11 final drafts have a footnote saying "No special provisions are made for multibyte characters."

The sentence "In no case is a partial multibyte character written." applies only to the %ls format, i.e. when converting a wchar_t* string into a possibly multibyte sequence for a char* string.

The closest analogue to NSString formatting is using %s in [f]wprintf(). In this case, characters (i.e., bytes) from the string are converted "as if by repeated calls to the mbrtowc function" (with sane initial state), and the precision limits the number of wide characters to be written. This is unproblematic because a wchar_t is required to hold a complete character, but Foundation unichars can be UTF-16 surrogates, i.e. only half of a character, so this still doesn't resolve the issue.

In summary, "figure out what Cocoa does." :-)

-- 
Jens Ayton

_______________________________________________
Gnustep-dev mailing list
Gnustep-dev@gnu.org
https://lists.gnu.org/mailman/listinfo/gnustep-dev