On Mon, Oct 09, 2023 at 11:32:49PM +0200, Bruno Haible wrote:
> Gavin Smith wrote:
> > It is supposed to attempt to force the locale to a UTF-8 locale. You
> > can see the code in xspara_init that attempts to change the locale. There
> > is also a comment before xspara_add_text:
> >
> > "This function relies on there being a UTF-8 locale in LC_CTYPE for
> > mbrtowc to work correctly."
>
> That's an inherently unportable thing. You can't just force an UTF-8
> locale if the system does not have it.

The module shouldn't load if it can't switch to a UTF-8 locale. xspara_init
returns a different value if these attempts fail, leading the code that loads
the module (in Texinfo::XSLoader) to fall back to the pure Perl version.

> In summary: On mingw there is no UTF-8 locale, and you cannot force it.
>
> The only portable way out is to use iconv() instead of setlocale(), mbrtowc(),
> etc. This is how e.g. gettext's PO parser does it:
> https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gettext-tools/src/po-lex.c;h=22d08849206b812b18ace9de7629bb95a9d71c3c;hb=c9af3e4eeccc178a0833754e3d8c7083591e75ba
> lines 127..595. You will also find a function mb_width in there.
>
> Note that switching from locales to hand-written encoding support is a
> major change; I wouldn't recommend to do it before Texinfo 7.1.
>
> Bruno

It would be good to get away from the attempts to switch to a UTF-8 locale,
but I doubt it is urgent to do so before the release, as the current approach,
however flawed, has been in place and has worked fairly well for a long time
(since the XS paragraph module was written). At the time it seemed to be
the only way to get the information from wcwidth.
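
To make the load-or-fall-back scheme concrete, here is a minimal sketch of
the idea in standard C. It is not the actual xspara_init code: the function
names init_utf8_ctype and ctype_is_utf8 and the list of candidate locale
names are hypothetical, and the check for "is this locale UTF-8?" is only a
heuristic (it asks mbrtowc to decode a three-byte UTF-8 sequence).

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical candidate names; which of them exist depends on the system. */
static const char *const utf8_candidates[] = {
    "C.UTF-8", "en_US.UTF-8", "en_US.utf8", "C.utf8"
};

/* Heuristic: return 1 if mbrtowc, under the current LC_CTYPE locale,
   decodes the three bytes of U+20AC (EURO SIGN) as a single character. */
static int ctype_is_utf8(void)
{
    static const char euro[] = "\xe2\x82\xac";
    mbstate_t state;
    wchar_t wc;

    memset(&state, 0, sizeof state);   /* start in the initial shift state */
    return mbrtowc(&wc, euro, 3, &state) == (size_t) 3;
}

/* Hypothetical stand-in for an init function.
   Returns 1 if LC_CTYPE is now a UTF-8 locale, 0 if the caller
   should fall back to another implementation. */
static int init_utf8_ctype(void)
{
    size_t i;

    /* First try the locale requested by the user's environment. */
    if (setlocale(LC_CTYPE, "") != NULL && ctype_is_utf8())
        return 1;

    /* Otherwise try each candidate; setlocale returns NULL if it is absent. */
    for (i = 0; i < sizeof utf8_candidates / sizeof utf8_candidates[0]; i++) {
        if (setlocale(LC_CTYPE, utf8_candidates[i]) != NULL && ctype_is_utf8())
            return 1;
    }
    return 0;
}
```

A loader would call init_utf8_ctype() once and only use the C code path
(mbrtowc, and wcwidth on the decoded characters) when it returns 1. On a
system with no UTF-8 locale at all, such as the mingw case Bruno describes,
every attempt fails and the function returns 0, which corresponds to the
fall-back to pure Perl.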