Hi! Bison uses gnulib's unicodeio module to emit bullets (•) portably, with a fallback to '.'. It's implemented this way (src/gram.h):
> /* Fallback in case we can't print "•". */
> static inline long
> print_dot_fallback (unsigned int code _GL_UNUSED,
>                     const char *msg _GL_UNUSED,
>                     void *callback_arg)
> {
>   FILE *out = (FILE *) callback_arg;
>   putc ('.', out);
>   return -1;
> }
>
> /* Print "•", the symbol used to represent a point in an item (aka, a
>    dotted rule). */
> static inline void
> print_dot (FILE *out)
> {
>   unicode_to_mb (0x2022, fwrite_success_callback, print_dot_fallback, out);
> }

Unfortunately, in Kiyoshi's environment (SunOS hidden 5.11 11.3 i86pc i386 i86pc, GCC 9.3.0) we get '?' instead of '.' in the C locale. It is a genuine ASCII '?' (byte 0x3f), not a replacement glyph shown by a terminal that cannot display the character. With en_US.UTF-8 we do get the bullet, as expected.

Kiyoshi can reproduce the problem with GNU Coreutils' printf, where he also gets a '?', although the fallback there should display the escape sequence (i.e., it should print '\u2022' back):

> /* Simple failure callback that displays a fallback representation in plain
>    ASCII, using the same notation as ISO C99 strings. */
> static long
> fallback_failure_callback (unsigned int code,
>                            const char *msg _GL_UNUSED,
>                            void *callback_arg)
> {
>   FILE *stream = (FILE *) callback_arg;
>
>   if (code < 0x10000)
>     fprintf (stream, "\\u%04X", code);
>   else
>     fprintf (stream, "\\U%08X", code);
>   return -1;
> }
>
> /* Outputs the Unicode character CODE to the output stream STREAM.
>    Upon failure, exit if exit_on_error is true, otherwise output a fallback
>    notation. */
> void
> print_unicode_char (FILE *stream, unsigned int code, int exit_on_error)
> {
>   unicode_to_mb (code, fwrite_success_callback,
>                  exit_on_error
>                  ? exit_failure_callback
>                  : fallback_failure_callback,
>                  stream);
> }

In both cases the output is '?', which neither failure callback produces, so it looks as if the failure callback is never invoked.

Kiyoshi's messages start here:

https://lists.gnu.org/r/bug-bison/2020-07/msg00001.html

The latest:

> On 6 Jul 2020, at 22:35, Kiyoshi KANAZAWA <yoi_no_myou...@yahoo.co.jp> wrote:
>
> Hi Akim,
>
> $ LC_ALL=C $coreutilsbin/printf '\u2022\n' | od -t x1
> 0000000 3f 0a
> 0000002
>
> $ LC_ALL=en_US.UTF-8 $coreutilsbin/printf '\u2022\n' | od -t x1
> 0000000 e2 80 a2 0a
> 0000004
>
>
> FYI, I have very limited locale.
> $ locale -a
> C
> POSIX
> en_US.ISO8859-1
> en_US.ISO8859-15
> en_US.ISO8859-15@euro
> en_US.UTF-8
> ja_JP.PCK
> ja_JP.UTF-8
> ja_JP.UTF-8@cldr
> ja_JP.eucJP

I'm unsure what the next steps would be from here.

Thanks in advance!