Bug#1068889: coreutils: join -t refuses single-character delimiters as "multi-character tab"s

2024-04-12 Thread Pádraig Brady

On 12/04/2024 23:42, наб wrote:

Package: coreutils
Version: 9.1-1
Version: 9.4-1
Severity: normal

Dear Maintainer,

Yes good:
$ cat f1
row1    f1  1
urow1   f1  2
$ cat f2
row1    f2  1
urow2   f2  2
$ join f? -t '  '
row1    f1  1   f2  1

Not good:
$ cat f1
row1ąf1ą1
urow1ąf1ą2
$ cat f2
row1ąf2ą1
urow2ąf2ą2
$ join f? -t 'ą'
join: multi-character tab ‘ą’

Compare manual:
-t CHAR use CHAR as input and output field separator

Compare POSIX.1-202x/D3, XCU, join, OPTIONS:
−t char
Use character char as a separator, for both input and output.
Every appearance of char in a line shall be significant.
When this option is specified, the collating sequence shall be
the same as sort without the −b option.


Please try coreutils 9.5, which has improved multi-byte character support in join.
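
Where upgrading is not immediately possible, one possible workaround is to swap the multi-byte delimiter for a single-byte one before calling join, then swap it back afterwards. The following is only a sketch under stated assumptions: join_mb is a hypothetical helper (not part of coreutils), the data is assumed never to contain the ASCII unit separator (0x1F), and the delimiter is assumed to contain no characters special to sed. As with plain join, both inputs must already be sorted on the join field.

```shell
#!/bin/sh
# join_mb DELIM FILE1 FILE2
# Hypothetical helper, not part of coreutils: join two files whose field
# separator is a multi-byte character by temporarily replacing it with a
# single-byte separator that older join versions accept.
# Assumptions: 0x1F never occurs in the data, and DELIM has no sed
# metacharacters in it.
join_mb() {
    delim=$1 file1=$2 file2=$3
    us=$(printf '\037')                       # ASCII unit separator
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    sed "s/$delim/$us/g" "$file1" > "$tmp1"   # multi-byte -> single byte
    sed "s/$delim/$us/g" "$file2" > "$tmp2"
    # Join on the single-byte separator, then restore the original one.
    join -t "$us" "$tmp1" "$tmp2" | sed "s/$us/$delim/g"
    rm -f "$tmp1" "$tmp2"
}
```

With the f1 and f2 from the report, this would be called as join_mb 'ą' f1 f2.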



Bug#1068889: coreutils: join -t refuses single-character delimiters as "multi-character tab"s

2024-04-12 Thread наб
Package: coreutils
Version: 9.1-1
Version: 9.4-1
Severity: normal

Dear Maintainer,

Yes good:
$ cat f1
row1    f1  1
urow1   f1  2
$ cat f2
row1    f2  1
urow2   f2  2
$ join f? -t '  '
row1    f1  1   f2  1

Not good:
$ cat f1
row1ąf1ą1
urow1ąf1ą2
$ cat f2
row1ąf2ą1
urow2ąf2ą2
$ join f? -t 'ą'
join: multi-character tab ‘ą’

Compare manual:
-t CHAR use CHAR as input and output field separator

Compare POSIX.1-202x/D3, XCU, join, OPTIONS:
−t char
Use character char as a separator, for both input and output.
Every appearance of char in a line shall be significant.
When this option is specified, the collating sequence shall be
the same as sort without the −b option.
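
The gap between "character" in the wording above and the "multi-character tab" error can be illustrated by counting bytes: in UTF-8, ą (U+0105) is one character but two bytes, and the error is consistent with a check on the byte length of the -t argument (an assumption about the cause, not something verified against the join source here). A minimal sketch, with the delimiter written as octal escapes so that it does not depend on the encoding of the script:

```shell
#!/bin/sh
# U+0105 (ą) as its two UTF-8 bytes, 0xC4 0x85.
delim=$(printf '\304\205')

# wc -c counts bytes regardless of locale; tr strips the padding that some
# wc implementations add around the number.
nbytes=$(printf '%s' "$delim" | wc -c | tr -d ' ')
echo "bytes: $nbytes"

# In a UTF-8 locale, "wc -m" on the same input should count characters
# rather than bytes; the result depends on the locale in effect.
```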

Best,
наб

-- System Information:
Debian Release: 12.4
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 
'stable-debug'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.1.0-12-amd64 (SMP w/24 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_FIRMWARE_WORKAROUND, 
TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_GB:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages coreutils depends on:
ii  libacl1      2.3.1-3
ii  libattr1     1:2.5.1-4
ii  libc6        2.36-9+deb12u4
ii  libgmp10     2:6.2.1+dfsg1-1.1
ii  libselinux1  3.4-1+b6

coreutils recommends no packages.

coreutils suggests no packages.

-- no debconf information

