Your message dated Tue, 14 Feb 2012 17:01:36 +0100
with message-id <[email protected]>
and subject line printf %s in UTF-8 is not POSIX-compliant
has caused the Debian Bug report #469235,
regarding ksh: printf %s in UTF-8 is not POSIX-compliant
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
--
469235: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=469235
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: ksh
Version: 93s+20080202-1
Severity: normal
Under UTF-8 locales:
vin:~> ksh93
$ printf ".%2s.\n" é
. é.
$ /usr/bin/printf ".%2s.\n" é
.é.
$ printf ".%.2s.\n" éabc
.éa.
$ /usr/bin/printf ".%.2s.\n" éabc
.é.
$
As you can see, the ksh93 printf builtin doesn't behave like the
coreutils printf, and it is ksh93 that is wrong: the field width
and the precision are numbers of bytes, not numbers of characters.
http://www.opengroup.org/onlinepubs/009695399/utilities/printf.html
says (in the extended description) that the "file format notation"
shall be used for the format (and %s isn't an exception).
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap05.html
(file format notation) says:
  field width
      An optional string of decimal digits to specify a minimum field
      width. For an output field, if the converted value has fewer
      bytes than the field width, [...]
      ^^^^^
and
  s
      The argument shall be taken to be a string and bytes from the
      string shall be written until the end of the string or the number
      of bytes indicated by the precision specification of the argument
      is reached. If the precision is omitted from the argument, it
      shall be taken to be infinite, so all bytes up to the end of the
      string shall be written.
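
For illustration, the two rules quoted above can be sketched in plain sh
without relying on how any particular printf counts. This is only a rough
sketch: the helper names byte_len, pad_bytes and cut_bytes are hypothetical
(they are not part of POSIX or of any shell), and "é" is assumed to be
encoded in UTF-8, where it occupies the two bytes 0xC3 0xA9.

```sh
#!/bin/sh
# Hypothetical helpers emulating the byte-based field width and precision
# rules of the POSIX file format notation. Only %s without width or
# precision is used, so the shell's own counting never comes into play.

# byte_len STRING: number of bytes in STRING (wc -c counts bytes).
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' \t'
}

# pad_bytes WIDTH STRING: right-justify STRING in a field of WIDTH bytes,
# i.e. pad with spaces only if STRING has fewer bytes than WIDTH.
pad_bytes() {
    width=$1
    len=$(byte_len "$2")
    while [ "$len" -lt "$width" ]; do
        printf ' '
        len=$((len + 1))
    done
    printf '%s' "$2"
}

# cut_bytes PRECISION STRING: write at most PRECISION bytes of STRING.
cut_bytes() {
    printf '%s' "$2" | dd bs=1 count="$1" 2>/dev/null
}

# "é" in UTF-8, written with octal escapes so that the example does not
# depend on the encoding of this file.
e_acute=$(printf '\303\251')

printf '.%s.\n' "$(pad_bytes 2 "$e_acute")"        # like ".%2s.": no padding
printf '.%s.\n' "$(cut_bytes 2 "${e_acute}abc")"   # like ".%.2s.": only "é"
```

On a UTF-8 terminal both lines should come out as ".é.", which is what
/usr/bin/printf gives in the transcript above: the argument is already
2 bytes long, so a width of 2 adds no padding and a precision of 2 stops
right after "é".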
Note: zsh has the same bug, but pdksh and bash do not.
Some information for bash:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=459413
-- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.24.3-20080226 (SMP w/2 CPU cores; PREEMPT)
Locale: LANG=POSIX, LC_CTYPE=en_US.ISO8859-1 (charmap=ISO-8859-1)
Shell: /bin/sh linked to /bin/bash
Versions of packages ksh depends on:
ii libc6 2.7-9 GNU C Library: Shared libraries
ksh recommends no packages.
-- no debconf information
--- End Message ---
--- Begin Message ---
This bug appears to be fixed in the latest release of ksh. It now
supports the L modifier to work on characters instead of bytes.
Oliver
--- End Message ---