Pádraig Brady <[email protected]> writes:

> On 28/11/2025 20:05, Collin Funk wrote:
>> Pádraig Brady <[email protected]> writes:
>>
>>> On 28/11/2025 00:57, Collin Funk wrote:
>>>> +print_ver_ tac
>>>
>>> I'd also rely on printf
>> Right. Don't we also need to use 'env' to make sure a shell builtin
>> doesn't get used?
>>
>>>> +  tac --separator=$(printf "$1") inp > out && printf '\n' >> out \
>>>> +    || framework_failure_
>>>
>>> Since we're testing tac, I'd explicitly check its return value.
>>> Also this looks under quoted, so I'd change to:
>>>
>>>   tac --separator="$(printf -- "$1")" inp > out || fail=1
>>>   printf '\n' >> out || framework_failure_
>> Good catch.
>> I'll probably add another test to make sure invalid UTF-8 is treated
>> as bytes.
>> Okay to move bad_unicode() from tests/fold/fold-characters.sh to
>> init.cfg? I'm sure it will be useful for other tests as well, since it
>> checks a few different ways UTF-8 can be bad.
>
> All sounds good.

Actually, that idea doesn't work, since bad_unicode emits a NUL. Using
it with the NUL removed, i.e.,
'\xC3|\xED\xBA\xAD|\u0089|\xED\xA6\xBF\xED\xBF\xBF\n', doesn't seem to
work either, so I'll have to look into that.

I pushed the attached without the bad Unicode check regardless, since
these cases are more likely to be used.

Collin

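As a rough illustration of the NUL problem (a sketch, not part of the pushed patch): the test passes the separator through "$(...)" and then as a command-line argument, and neither can carry a NUL byte. POSIX leaves the behavior of command substitution unspecified when the output contains a NUL; the assumption here is a shell such as bash or dash, which does not keep it. The variable names are made up for the example.

```shell
#!/bin/sh
# Sketch only: shows that a NUL byte does not survive "$(...)".
# Assumes a shell such as bash or dash; bash may also print a warning
# about the ignored null byte on stderr.

# Through a pipe, all three bytes (a, NUL, b) arrive intact.
piped_len=$(( $(printf 'a\000b' | wc -c) ))

# Through command substitution, the NUL is lost before the value is stored.
stored=$(printf 'a\000b')
stored_len=$(( $(printf '%s' "$stored" | wc -c) ))

printf 'piped: %d bytes, stored: %d bytes\n' "$piped_len" "$stored_len"
```

The piped count is 3, while the stored count comes out smaller, so any separator built this way silently differs from the bytes that bad_unicode printed.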
From 7d94684f2cc0e09aefeb505bfb171f7b7a21b4d5 Mon Sep 17 00:00:00 2001
Message-ID: <7d94684f2cc0e09aefeb505bfb171f7b7a21b4d5.1764365406.git.collin.fu...@gmail.com>
From: Collin Funk <[email protected]>
Date: Thu, 27 Nov 2025 16:55:18 -0800
Subject: [PATCH v2] test: tac: test with non-ASCII values for --separator

* tests/tac/tac-locale.sh: New test.
* tests/local.mk (all_tests): Add it.
---
 tests/local.mk          |  1 +
 tests/tac/tac-locale.sh | 43 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)
 create mode 100755 tests/tac/tac-locale.sh

diff --git a/tests/local.mk b/tests/local.mk
index 26d140dcc..4ae003719 100644
--- a/tests/local.mk
+++ b/tests/local.mk
@@ -458,6 +458,7 @@ all_tests = \
   tests/misc/sync.sh \
   tests/tac/tac.pl \
   tests/tac/tac-continue.sh \
+  tests/tac/tac-locale.sh \
   tests/tac/tac-2-nonseekable.sh \
   tests/tail/tail.pl \
   tests/misc/tee.sh \
diff --git a/tests/tac/tac-locale.sh b/tests/tac/tac-locale.sh
new file mode 100755
index 000000000..2bb6e404c
--- /dev/null
+++ b/tests/tac/tac-locale.sh
@@ -0,0 +1,43 @@
+#!/bin/sh
+# Test that tac --separator=SEP works if SEP is not ASCII.
+
+# Copyright (C) 2025 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ tac printf
+
+check_separator ()
+{
+  env printf "1$12$13$1" > inp || framework_failure_
+  env printf "3$12$11$1\n" > exp || framework_failure_
+  tac --separator="$(env printf -- "$1")" inp > out || fail=1
+  env printf '\n' >> out || framework_failure_
+  compare exp out || fail=1
+}
+
+export LC_ALL=en_US.iso8859-1  # only lowercase form works on macOS 10.15.7
+if test "$(locale charmap 2>/dev/null | sed 's/iso/ISO-/')" = ISO-8859-1; then
+  check_separator '\xe9'  # é
+  check_separator '\xe9\xea'  # éê
+fi
+
+export LC_ALL=$LOCALE_FR_UTF8
+if test "$(locale charmap 2>/dev/null)" = UTF-8; then
+  check_separator '\u0434'  # д
+  check_separator '\u0434\u0436'  # дж
+fi
+
+Exit $fail
-- 
2.52.0
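
For readers puzzling over the "1$12$13$1" strings in check_separator: an unbraced positional parameter is a single digit, so "$12" expands as "${1}2". A minimal sketch with a plain ASCII separator follows. The helper names build_input and build_expected are hypothetical and not part of the patch, and unlike the test it passes the separator literally through '%s' rather than as a printf format with escapes.

```shell
#!/bin/sh
# Sketch only; build_input and build_expected are illustrative names.

# "$12" is "${1}2", so with $1=X this yields 1X2X3X.
build_input ()
{
  printf '%s' "1$12$13$1"
}

# The same records in reverse order, each still followed by its separator.
build_expected ()
{
  printf '%s\n' "3$12$11$1"
}

inp_text=$(build_input X)
exp_text=$(build_expected X)
printf '%s\n' "$inp_text"   # 1X2X3X
printf '%s\n' "$exp_text"   # 3X2X1X

# Only when GNU tac is available: reverse the records directly.
if tac --version 2>/dev/null | grep -q 'GNU coreutils'; then
  printf '%s' "$inp_text" | tac --separator=X   # 3X2X1X, no trailing newline
  printf '\n'
fi
```

Since tac attaches each separator to the record before it, its output has no trailing newline, which is why the test appends one to out before comparing with exp.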
