On 27/08/2025 14:54, Pádraig Brady wrote:
On 23/08/2025 06:28, Collin Funk wrote:
Pádraig Brady <[email protected]> writes:
On the subject of testing, this is one area where the i18n patch was lacking,
though it does include some adjustments for testing fold.
It would be good to incorporate its tests/.../fold.pl adjustments.
I've just done this in the attached 2 patches,
which I'll push in a bit.
Also it would be good to add tests for invalid multi-byte characters
to see that they're handled appropriately.
It looks like we'll need to adjust the code to handle invalid characters
appropriately (and add tests). The following shows how the upstream fold
and the i18n-patched fold treat the invalid UTF-8 byte \xC3:
$ for fold in src/fold /bin/fold; do
    for locale in C en_US.UTF-8; do
      echo "LC_ALL=$locale $fold"
      printf '\xC3' | LC_ALL=$locale $fold -w1 | od -Ax -tx1z -v | head -n1
    done
  done
LC_ALL=C src/fold
000000
LC_ALL=en_US.UTF-8 src/fold
000000
LC_ALL=C /bin/fold
000000 c3 >.<
LC_ALL=en_US.UTF-8 /bin/fold
000000 c3 >.<
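
For context on why a lone \xC3 counts as invalid: in UTF-8 the bit pattern
of the lead byte announces how many bytes the sequence should span, and
0xC3 (110xxxxx) announces two, so it is incomplete without a continuation
byte. A minimal sketch of that rule follows. The helper name
utf8_expected_len is hypothetical (it is not part of coreutils), and it
only inspects the lead-byte pattern; it does not reject overlong forms,
surrogates, or out-of-range values.

```shell
# Hypothetical helper (not from coreutils): print the number of bytes a
# UTF-8 sequence is expected to span, given its lead byte in hex (e.g. C3).
# Prints 0 for a byte that cannot start a sequence (e.g. a continuation byte).
utf8_expected_len() {
  b=$((0x$1))
  if   [ $((b & 0x80)) -eq 0 ];   then echo 1   # 0xxxxxxx: ASCII
  elif [ $((b & 0xE0)) -eq 192 ]; then echo 2   # 110xxxxx
  elif [ $((b & 0xF0)) -eq 224 ]; then echo 3   # 1110xxxx
  elif [ $((b & 0xF8)) -eq 240 ]; then echo 4   # 11110xxx
  else echo 0                                   # 10xxxxxx or 11111xxx
  fi
}
```

For example, `utf8_expected_len C3` prints 2, whereas the input above
supplies only that single byte.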
I suppose a concrete way to test that might be:
# https://datatracker.ietf.org/doc/rfc9839/
bad_unicode() { printf '\xC3|\u0000|\u0089|\uDEAD|\uD9BF\uDFFF\n'; }
test $({ bad_unicode | fold; bad_unicode; } | uniq | wc -l) = 1 || fail=1
cheers,
Padraig