On 23/09/2025 18:41, Collin Funk wrote:
> Pádraig Brady <[email protected]> writes:
>
>> I know I'm like a broken record, but testing is especially
>> important for any multi-byte changes, even as a way
>> to document what we don't support.
>
> Demanding more testing isn't a bad thing. :)
>
>> In this case (pardon the pun) the tests could be expanded
>> to cover a sampling of cases from
>> https://unicode.org/Public/UNIDATA/SpecialCasing.txt
>
> Ah, I did not know about that document. That is helpful, thanks.
>
>> It's worth looking at the old discussion re join
>> (where it was mentioned that join/sort/uniq should be treated as a unit
>> so that there is consistent interaction between them):
>> https://crashcourse.housegordon.org/coreutils-multibyte-support.html
>> https://lists.gnu.org/archive/html/bug-coreutils/2009-03/msg00102.html
>> https://lists.gnu.org/archive/html/coreutils/2010-09/msg00029.html
>
> My thinking was that it is easier to follow changes when they only
> affect one program. If we find that part of uniq, for example, can be
> used again in join, then we can move it to a module in gl/ once that
> program is addressed.
>
> That seems easier to review than a single massive patch, IMO.

Agreed.
Also, it's better to improve incrementally than
to try to be perfect the first time around.

cheers,
Padraig