On Termux:

$ printf "$b" | { while LC_ALL=C read -N1 c; do LC_ALL=C.UTF-8 printf "%d %q\n" "'$c" "$c"; done; }
255 ÿ
190 $'\276'

On Mon, Jun 24, 2024, 1:37 PM Gioele Barabucci <gio...@svario.it> wrote:

> Hi,
>
> The bash manpage entry for printf contains the following statement:
>
> > if the leading character is a single or double quote, the value is
> > the ASCII value of the following character.
>
> POSIX uses a different wording:
>
> > If the leading character is a single-quote or double-quote, the value
> > shall be the numeric value in the underlying codeset of the character
> > following the single-quote or double-quote.
>
> Bash says that ASCII will always be used, POSIX says that the conversion
> is codeset-dependent.
>
> Bash code seems to agree with POSIX and contradict its manpage:
>
> $ printf -v b "\xc3\xbf\xbe"
>
> $ printf "$b" | { while LC_ALL=C read -N1 c; do \
>   LC_ALL=C printf "%d %q\n" "'$c" "$c"; done; }
> 195 $'\303'
> 191 $'\277'
> 190 $'\276'
>
> $ printf "$b" | { while LC_ALL=C read -N1 c; do \
>   LC_ALL=C.UTF-8 printf "%d %q\n" "'$c" "$c"; done; }
> 195 $'\303'
> 255 $'\277'
> 190 $'\276'
>
> Should the manpage be changed? Or the code modified to always use ASCII
> as the reference codeset?
>
> Regards,
>
> --
> Gioele Barabucci
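
The locale dependence discussed in this thread can be probed with a small helper. This is only a minimal sketch: char_value is a hypothetical name introduced here, not something provided by bash, and it assumes the locale passed as the second argument (for example C.UTF-8) is installed on the system. If it is not, bash may print a setlocale warning and fall back to the current locale.

```shell
# char_value CHAR [LOCALE]
# Print the numeric value that printf assigns to the first character of
# CHAR when run under LOCALE (default: C). Hypothetical helper, for
# illustration only.
char_value() {
  # The temporary LC_ALL assignment applies to this printf call only.
  LC_ALL=${2:-C} printf '%d\n' "'$1"
}

# ASCII characters have the same value in ASCII-compatible codesets.
char_value A
char_value A C.UTF-8

# A non-ASCII character is where the result may differ by locale
# (no expected value is given here, since it depends on the platform).
char_value 'ÿ'
char_value 'ÿ' C.UTF-8
```

Comparing the last two calls on a given system shows whether printf there follows the codeset-dependent behaviour described by POSIX.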