On Sun, Feb 04, 2024 at 12:17:16PM +0100, Patrice Dumas wrote:
> Here is my updated thinking on the possibilities
>
> 1) lexicographic sorting on unicode strings (corresponds to
> USE_UNICODE_COLLATION=0 currently)
> 2) unicode default sorting obtained by Unicode::Collate in Perl and
> strxfrm_l in C with "en_US.utf-8", the current default ("en_US.utf-8"
> could be different on different platforms, a list instead of only one
> possibility if "en_US.utf-8" is not always available...)
> 3) sorting based on @documentlanguage using, in perl
> Unicode::Collate::Locale with locale @documentlanguage and in C
> strxfrm_l with "@documentlanguage.utf-8" (at least on GNU/Linux,
> the locale name setup for strxfrm_l could be different on other platforms).
> 4) sorting based on a customization variable, such as COLLATION_LANGUAGE.
> it would be the same as the previous one, with @documentlanguage
> replaced by COLLATION_LANGUAGE.
> 5) sorting based on the user locale, using strxfrm in C and
> "use locale" and regular sorting on unicode (internal perl encoded) strings
> in Perl.
I forgot one possibility. Until Non-ignorable Weighting is available in
C, it could make sense to offer one more option for C: calling Perl code
to obtain 2). That would give
6) in C, use the Perl sorting corresponding to 2).
It could be named 'perldefault'.
Using the Perl sorting corresponding to 2) from C is already
implemented, and it is currently used if TEST=1.
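
For reference, the C side of possibility 2) could be sketched roughly as
below. This is not the texinfo code: sort_with_collation and keyed_string
are made-up names, and the sketch assumes a POSIX.1-2008 libc providing
newlocale and strxfrm_l. If the requested locale is not installed, it falls
back to byte order on the strings, which for UTF-8 is codepoint order, as
in possibility 1).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type: a string paired with its sort key. */
typedef struct {
  const char *text;
  char *key;
} keyed_string;

/* Sort keys are compared bytewise; this is how strxfrm output is
   meant to be compared. */
static int
compare_keys (const void *a, const void *b)
{
  const keyed_string *ka = a;
  const keyed_string *kb = b;
  return strcmp (ka->key, kb->key);
}

/* Sort STRINGS in place.  Returns 1 if LOCALE_NAME collation was used,
   0 if the locale was unavailable and byte order was used instead,
   -1 on allocation failure (STRINGS is then left unchanged). */
int
sort_with_collation (const char **strings, size_t n,
                     const char *locale_name)
{
  keyed_string *items;
  locale_t loc;
  size_t i;
  int used_locale;

  assert (strings != NULL && locale_name != NULL);
  if (n == 0)
    return 0;

  items = calloc (n, sizeof *items);
  if (!items)
    return -1;

  /* newlocale returns (locale_t) 0 if the locale is not installed. */
  loc = newlocale (LC_COLLATE_MASK, locale_name, (locale_t) 0);
  used_locale = (loc != (locale_t) 0);

  for (i = 0; i < n; i++)
    {
      items[i].text = strings[i];
      if (used_locale)
        {
          /* First call gets the key length, second call fills it. */
          size_t len = strxfrm_l (NULL, strings[i], 0, loc);
          items[i].key = malloc (len + 1);
          if (items[i].key)
            strxfrm_l (items[i].key, strings[i], len + 1, loc);
        }
      else
        items[i].key = strdup (strings[i]);

      if (!items[i].key)
        {
          /* calloc zeroed the array, so unset keys are NULL. */
          for (i = 0; i < n; i++)
            free (items[i].key);
          free (items);
          if (used_locale)
            freelocale (loc);
          return -1;
        }
    }

  qsort (items, n, sizeof *items, compare_keys);

  for (i = 0; i < n; i++)
    {
      strings[i] = items[i].text;
      free (items[i].key);
    }
  free (items);
  if (used_locale)
    freelocale (loc);
  return used_locale;
}
```

A caller would pass something like "en_US.UTF-8" as LOCALE_NAME. The exact
locale name is platform dependent, which is the reason a list of candidate
names may be needed.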
--
Pat