> From: Gavin Smith <[email protected]>
> Date: Mon, 20 Apr 2026 17:35:18 +0100
> Cc: [email protected], [email protected], [email protected], [email protected]
>
> AFAIK strcmp on UTF-8 encoded strings will return a value with the same
> sign as wcscmp called on an equivalent wide character string

I meant wcscoll, not wcscmp, sorry. IOW, since we are talking about
sorting, the right solution for locale-specific sorting is to use
collation, and that depends on the locale. For example, in some
locales B is between a and b, as you probably well know.

> so there is likely no bug here in gawk. (This is assuming that the
> possible character values are represented in codepoint order in
> wchar_t, e.g. as an integer giving the codepoint value.)

But locale-specific sorting is not necessarily in the Unicode
codepoint order, is it? And a manual written in a given language
probably assumes the sorting used in that language's locale, right?
