Re: printf, binary data, leading single quote and non ASCII codesets

alex xmb sw ratchev Mon, 24 Jun 2024 06:15:09 -0700

On Termux:

printf "$b" | { while LC_ALL=C read -N1 c; do \
     LC_ALL=C.UTF-8 printf "%d %q\n" "'$c" "$c"; done; }
255 ÿ
190 $'\276'
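
For anyone who wants to compare on their own system, here is a minimal sketch of the same conversion without the read loop. The charval helper is a hypothetical name made up for this illustration, and the availability of a C.UTF-8 locale is an assumption. The values for the non-ASCII character may differ between shells, bash versions and libc implementations, so none are listed.

```shell
# charval is a hypothetical helper, not part of bash: it prints the numeric
# value printf assigns to the first character of its argument.
charval() {
    printf '%d\n' "'$1"
}

charval A    # plain ASCII: 65 in any ASCII-compatible codeset

# Same conversion for a non-ASCII character under two locales.
# Assumption: a C.UTF-8 locale is installed; if not, the shell may warn
# and fall back to its current locale.
ch='ÿ'
for loc in C C.UTF-8; do
    LC_ALL=$loc printf '%s: %d\n' "$loc" "'$ch"
done
```

If the two lines printed by the loop differ, the conversion is following the locale's codeset, as in the POSIX wording, and not plain ASCII.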


On Mon, Jun 24, 2024, 1:37 PM Gioele Barabucci <gio...@svario.it> wrote:

> Hi,
>
> The bash manpage for printf contains the following statement:
>
> > if the leading character is a single or double quote, the value is
> > the ASCII value of the following character.
>
> POSIX uses a different wording:
>
> > If the leading character is a single-quote or double-quote, the value
> > shall be the numeric value in the underlying codeset of the character
> > following the single-quote or double-quote.
>
> Bash says that ASCII will always be used, POSIX says that the conversion
> is codeset-dependent.
>
> Bash code seems to agree with POSIX and contradict its manpage:
>
> $ printf -v b "\xc3\xbf\xbe"
>
> $ printf "$b" | { while LC_ALL=C read -N1 c; do \
>      LC_ALL=C printf "%d %q\n" "'$c" "$c"; done; }
> 195 $'\303'
> 191 $'\277'
> 190 $'\276'
>
> $ printf "$b" | { while LC_ALL=C read -N1 c; do \
>      LC_ALL=C.UTF-8 printf "%d %q\n" "'$c" "$c"; done; }
> 195 $'\303'
> 255 $'\277'
> 190 $'\276'
>
> Should the manpage be changed? Or the code modified to always use ASCII
> as the reference codeset?
>
> Regards,
>
> --
> Gioele Barabucci
>
>
