Hi!

Bison uses gnulib's unicodeio module to emit bullets (•) portably,
with a fallback to '.'.  It's implemented this way (src/gram.h):

> /* Fallback in case we can't print "•".  */
> static inline long
> print_dot_fallback (unsigned int code _GL_UNUSED,
>                     const char *msg _GL_UNUSED,
>                     void *callback_arg)
> {
>   FILE *out = (FILE *) callback_arg;
>   putc ('.', out);
>   return -1;
> }
> 
> /* Print "•", the symbol used to represent a point in an item (aka, a
>    dotted rule).  */
> static inline void
> print_dot (FILE *out)
> {
>   unicode_to_mb (0x2022, fwrite_success_callback, print_dot_fallback, out);
> }

Unfortunately, in Kiyoshi's environment (SunOS hidden 5.11 11.3 i86pc i386 i86pc,
GCC 9.3.0) we get '?' instead of '.' in the C locale.  It is a genuine ASCII
'?' (byte 0x3f), not a substitute drawn by a terminal that cannot display the
character.  And we do get the bullet with en_US.UTF-8.

Kiyoshi can reproduce the problem with GNU Coreutils' printf, where he
also gets a '?', although the fallback there displays the escape sequence
(i.e., it should print '\u2022'):

> /* Simple failure callback that displays a fallback representation in plain
>    ASCII, using the same notation as ISO C99 strings.  */
> static long
> fallback_failure_callback (unsigned int code,
>                            const char *msg _GL_UNUSED,
>                            void *callback_arg)
> {
>   FILE *stream = (FILE *) callback_arg;
> 
>   if (code < 0x10000)
>     fprintf (stream, "\\u%04X", code);
>   else
>     fprintf (stream, "\\U%08X", code);
>   return -1;
> }
> 
> /* Outputs the Unicode character CODE to the output stream STREAM.
>    Upon failure, exit if exit_on_error is true, otherwise output a fallback
>    notation.  */
> void
> print_unicode_char (FILE *stream, unsigned int code, int exit_on_error)
> {
>   unicode_to_mb (code, fwrite_success_callback,
>                  exit_on_error
>                  ? exit_failure_callback
>                  : fallback_failure_callback,
>                  stream);
> }



Kiyoshi's messages start here:

https://lists.gnu.org/r/bug-bison/2020-07/msg00001.html

The latest:

> On 6 Jul 2020, at 22:35, Kiyoshi KANAZAWA <yoi_no_myou...@yahoo.co.jp>
> wrote:
> 
> Hi Akim,
> 
> $ LC_ALL=C $coreutilsbin/printf '\u2022\n' | od -t x1
> 0000000 3f 0a
> 0000002
> 
> $ LC_ALL=en_US.UTF-8 $coreutilsbin/printf '\u2022\n' | od -t x1
> 0000000 e2 80 a2 0a
> 0000004
> 
> 
> FYI, I have very limited locale.
> $ locale -a
> C
> POSIX
> en_US.ISO8859-1
> en_US.ISO8859-15
> en_US.ISO8859-15@euro
> en_US.UTF-8
> ja_JP.PCK
> ja_JP.UTF-8
> ja_JP.UTF-8@cldr
> ja_JP.eucJP

I'm unsure what the next steps would be from here.

Thanks in advance!
