Eli Zaretskii <e...@gnu.org> writes:

> Thanks. This improves things, but unfortunately not enough. The
> reason is that call to setlocale at the beginning of locale_charset,
> which is always made. If I replace that call with a constant string,
> I get a 20-fold speedup, and the display of large nodes in the Info
> reader becomes instantaneous, as it was in v5.2.
> Here are my timings, measured on Windows XPSP3, after enlarging the
> loop count to 2000000, to get the fastest version out of the
> quantization error of the system clock:

Thanks for testing.

IMO, the locale_charset call in wcwidth (and mbrtowc) and the
setlocale call in locale_charset are unavoidable if we are to support
per-thread locales, unless we are going to add a new API.

However, I missed the fact that setlocale (LC_ALL, NULL) returns a
concatenation of all category values on Windows.  Perhaps the overhead
could be halved by omitting the second call of setlocale (i.e. moving
the cache lookup above setlocale (LC_CTYPE, ...))?

Regards,
-- 
Daiki Ueno
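
P.S. To make the "move the cache lookup above the second setlocale"
idea concrete, here is a rough sketch.  It is not the gnulib code: the
function name charset_with_early_lookup, the one-entry cache, the
ctype_queries counter and the "text after the dot" charset rule are all
hypothetical, and the call order (LC_ALL first, then LC_CTYPE) is my
assumption about the current code.  It only shows that a cache hit on
the first setlocale result lets us return before the second call.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical one-entry cache; not the gnulib data structure.  */
static char cached_key[256];     /* last setlocale (LC_ALL, NULL) value */
static char cached_charset[64];  /* charset derived for that value */
static int have_cache = 0;
static int ctype_queries = 0;    /* counts second-stage setlocale calls */

/* Simplified stand-in for the real name-to-charset mapping:
   take the text after '.' in the locale name, else "ASCII".  */
static void
derive_charset (const char *ctype_name, char *out, size_t outsize)
{
  const char *dot = strchr (ctype_name, '.');
  snprintf (out, outsize, "%s", dot != NULL ? dot + 1 : "ASCII");
}

const char *
charset_with_early_lookup (void)
{
  /* First query: still made on every call.  */
  const char *all = setlocale (LC_ALL, NULL);
  if (all == NULL)
    all = "";

  /* Cache lookup moved up: on a hit the second setlocale is skipped.  */
  if (have_cache && strcmp (all, cached_key) == 0)
    return cached_charset;

  /* Miss.  Copy the key now, since a later setlocale call may
     overwrite the returned string.  A key too long for the buffer is
     truncated and therefore never matches, which is slow but safe.  */
  snprintf (cached_key, sizeof cached_key, "%s", all);

  /* Second query: only reached on a miss.  */
  const char *ctype = setlocale (LC_CTYPE, NULL);
  ctype_queries++;
  derive_charset (ctype != NULL ? ctype : "", cached_charset,
                  sizeof cached_charset);
  have_cache = 1;
  return cached_charset;
}
```

Calling charset_with_early_lookup twice without changing the locale
runs the LC_CTYPE query once.  The sketch is not thread-safe; with
per-thread locales the cache would presumably have to be per-thread as
well.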