On 11/11/2025 04:52, Grisha Levit wrote:
On Mon, Nov 10, 2025, 19:41 Pádraig Brady <[email protected]> wrote:

* tests/sort/sort-locale.sh: Check sort,ls have the same (non C) order.
* tests/local.mk: Reference the new test.
---
  tests/local.mk            |  1 +
  tests/sort/sort-locale.sh | 51 +++++++++++++++++++++++++++++++++++++++
  2 files changed, 52 insertions(+)
  create mode 100755 tests/sort/sort-locale.sh

diff --git a/tests/local.mk b/tests/local.mk
index 53fc53e00..73060f10c 100644
--- a/tests/local.mk
+++ b/tests/local.mk
@@ -413,6 +413,7 @@ all_tests =                                 \
    tests/sort/sort-files0-from.pl               \
    tests/sort/sort-float.sh                     \
    tests/sort/sort-h-thousands-sep.sh           \
+  tests/sort/sort-locale.sh                    \
    tests/sort/sort-merge.pl                     \
    tests/sort/sort-merge-fdlimit.sh             \
    tests/sort/sort-month.sh                     \
diff --git a/tests/sort/sort-locale.sh b/tests/sort/sort-locale.sh
new file mode 100755
index 000000000..f082a1737
--- /dev/null
+++ b/tests/sort/sort-locale.sh
@@ -0,0 +1,51 @@
+#!/bin/sh
+# Test sort locale ordering
+
+# Copyright (C) 2025 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ sort ls
+
+check_different_from_C() {
+  test "$(printf '%s\n' "$1" "$2" | LC_ALL=C sort)" != \
+       "$(printf '%s\n' "$1" "$2" | sort)"
+}
+
+check_hard_collate() {
+  # Correlate with ls
+  touch "$1" "$2" || framework_failure_
+  test "$(printf '%s\n' "$1" "$2" | sort)" = "$(ls -1 "$1" "$2")" || fail=1
+
+  if test "$fail" != 1; then
+    check_different_from_C "$1" "$2" ||
+      skip_ "Strings '$1' '$1' sort the same in C and $LC_ALL locales"

Should probably be "Strings '$1' '$2' ..."

ack


+  fi
+}
+
+export LC_ALL=en_US.iso8859-1  # only lowercase form works on macOS 10.15.7
+if test "$(locale charmap 2>/dev/null | sed 's/iso/ISO-/')" = ISO-8859-1; then
+  check_hard_collate 'a_a' 'a b'  # underscore and space considered equal
+  check_hard_collate 'aaa' 'BBB'  # case insensitive ordering
+  check_hard_collate "$(printf 'aa\xe9')" 'aaF'  # é comes before f

Should 'aaF' be 'aaf' to match the comment?

ack
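
For reference, a minimal sketch of the C-locale baseline this case is compared against, assuming only a POSIX sh with printf, sort and head. It writes the Latin-1 é as the octal escape \351 (an assumption made here for portability; the patch itself uses \xe9). In the C locale comparison is byte-wise, so 'aaf' (0x66) sorts ahead of 'aa' plus 0xE9, which is the reverse of the order the test expects from the en_US locale.

```shell
# Byte-wise (C locale) order: 'f' is 0x66, Latin-1 é is 0xE9 (octal 351),
# so 'aaf' is expected to come out first.
first=$(printf 'aa\351\naaf\n' | LC_ALL=C sort | head -n 1)
test "$first" = aaf && echo "C locale: aaf sorts before aa<0xE9>"
```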


+fi
+
+export LC_ALL=$LOCALE_FR_UTF8

Should there be a `test "$LOCALE_FR_UTF8" != "none"' first?

That was on purpose.
With "none" the following charmap==UTF-8 check doesn't pass on glibc.
I.e. that charmap check is enough to validate we're in a UTF-8 locale,
which might happen even with "none" on systems that are exclusively UTF-8.
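
The guard described above can be sketched as follows. The in_utf8_locale helper is hypothetical and not part of the patch; it only wraps the same charmap comparison the test uses. Which branch is taken depends on the system: on glibc an unknown locale such as "none" is expected to report a non-UTF-8 charmap, while a UTF-8-only system may still report UTF-8.

```shell
# Hypothetical helper (not in the patch): succeeds only when the current
# locale's charmap is UTF-8, mirroring the guard in the test.
in_utf8_locale() {
  test "$(locale charmap 2>/dev/null)" = UTF-8
}

# "none" stands in for the placeholder value of $LOCALE_FR_UTF8.
LC_ALL=none; export LC_ALL
if in_utf8_locale; then
  echo "charmap is UTF-8: collation checks run"
else
  echo "charmap is not UTF-8: collation checks are skipped"
fi
```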

+if test "$(locale charmap 2>/dev/null)" = UTF-8; then
+  check_hard_collate 'aaé' 'aaf'  # é comes before f
+  check_hard_collate 'aéY' "$(printf 'ae\u0301Z')"  # NFD/NFC é are equal
+fi

I've pushed with your two tweaks.

thanks for the review.
Padraig
