from:"Pádraig Brady"

bug#73418: ls (GNU coreutils) 9.4 is extremely slower than ls (GNU coreutils) 8.32 listing files on a cifs mounted share

2024-11-09 Thread Pádraig Brady


On 03/10/2024 15:28, Pádraig Brady wrote:

On 03/10/2024 07:39, Paul Eggert wrote:

On 2024-10-02 15:14, Pádraig Brady wrote:


I notice that tests/ls/getxattr-speedup.sh is failing now,
which I've not looked into yet.


I'm not seeing that after the changes I committed this evening. If you
can still reproduce it please let us know the details.


I can still repro.
I'm on BTRFS though I don't think that matters for this test.
I suspect the caching is being bypassed as we don't return ENOTSUP
in all cases where the underlying lgetxattr() returns ENOTSUP.
Actually in cases where the caching in file_has_aclinfo_cache()
is effective, and file_has_aclinfo() is bypassed, we get UMR
due to the uninitialized aclinfo struct.
So the whole interaction with file_has_aclinfo_cache() needs looking at.


I can confirm that the following commit fixes both issues above:
https://github.com/coreutils/coreutils/commit/b857d66b5


I also see issues with symlink handling,
where security contexts aren't read for symlinks.
Comparing system ls, with latest:
$ ls -lZ INSTALL
lrwxrwxrwx. 1 padraig padraig unconfined_u:object_r:user_home_t:s0 ...

$ src/ls -lZ INSTALL
lrwxrwxrwx 1 padraig padraig ? ...


This is still an issue here.

thanks,
Pádraig

bug#74107: Spurious ".IP" macros in a few coreutils 9.5 man pages

2024-10-30 Thread Pádraig Brady


On 30/10/2024 14:14, Glenn Golden wrote:

A few man pages from coreutils 9.5 seem to have spurious ".IP" troff macros
in several places.  Noticed in tail.1, timeout.1.  Not present in cat.1, df.1,
du.1, ls.1, rm.1, tr.1. Did not check others.

See attached screenshots of rendered pages for tail and timeout. The extra
vertical spacings seem to be due to what appear to be spurious .IP macros.

Platform/version info:
Arch linux, coreutils package 9.5-2 (latest), commodity x86_64 laptop.


This comes from the manual indentation present in --help output
to help delineate option descriptions.  help2man will insert a .IP
when it sees such indentation.  I'm not sure how to easily address this.
We previously discussed more programmatic alignment in --help output,
which may implicitly solve the issue.

cheers,
Pádraig

bug#74106: Find Bug in split util

2024-10-30 Thread Pádraig Brady


tag 74106 notabug
close 74106
stop

On 30/10/2024 11:33, Andrey S wrote:

Hello!
I've used split util to devide a 220G file into files of 1G size. I've used
option '-d' like shown below.

*split* -b 1G -d big-file.zip big-file.zip.part_

After part number 89 I've got part number 9000 and it goes with 9001, 9002,
9003 etc. next.

This is unexpected.


It's done like this to aid recombination with shell sorting.
You can get a sequential set of numbers by using an appropriately large -a 
option.
See https://www.pixelbeat.org/docs/coreutils-gotchas.html#split

cheers,
Pádraig

bug#74103: Characters order in tr affects results

2024-10-30 Thread Pádraig Brady


tag 74103 notabug
close 74103
stop

On 30/10/2024 09:42, Jakub Filipiuk wrote:

Hi

I stumble upon situation, when characters order in tr affects it result.
For example:

$ echo "some: 123 fa-ncy string, " | tr -d ',-:'
some  fancy string

When colon is moved before hyphen, result is correct

$ echo "some: 123 fa-ncy string, " | tr -d ',:-'
some 123 fancy string


This is working as expected as the hyphen denotes a range of characters.
Placing a hyphen at the end is the most portable way to avoid this.
As a GNU extension, you can also backslash escape the hyphen like \-

cheers,
Pádraig

bug#73875: cp: there isn't even an option to quit on errors

2024-10-20 Thread Pádraig Brady


On 20/10/2024 15:21, Dan Jacobson wrote:

Well OK, it is creating an unusable copy, and probably doesn't
even return an error to the calling shell.

So Makefiles using it will go on to the next step... with the final
result being missing files, discovered just minutes away from the boss's
big ceremony...


Any error is propagated to the exit status,
so one can always determine if any part of the copy fails.

cheers,
Pádraig.

bug#73875: cp: there isn't even an option to quit on errors

2024-10-20 Thread Pádraig Brady


On 19/10/2024 08:45, Dan Jacobson wrote:

$ cp -a /usr/share/doc/qpdf/manual-html/ ~/Downloads/
cp: cannot create symbolic link 
'/home/jidanni/Downloads/manual-html/_static/css/badge_only.css': Permission 
denied
cp: cannot create symbolic link 
'/home/jidanni/Downloads/manual-html/_static/css/theme.css': Permission 
denied

It just keeps rolling on. No way to quit except ^C.

$ cp --version
cp (GNU coreutils) 9.4


Well POSIX states that cp should continue upon failure to create a symlink.
An exit early option may be useful.
I don't see such an option with rsync either though, so I'm not sure how 
generally useful it is.

cheers,
Pádraig

bug#73784: [PATCH] cp: new option --nocache-source

2024-10-13 Thread Pádraig Brady


On 13/10/2024 05:56, Masatake YAMATO wrote:

When copying files, the system data cache are consumed, the system
data cache is utilized for both the source and destination files. In
scenarios such as creating backup files for old, unused files, it is
clear to users that these files will not be needed in the near
future. In such cases, retaining the data for these files in the cache
constitutes a waste of computer resources, especially when running
applications that require significant memory in the foreground.

With the new option, users will have the ability to request the
discarding of the system data cache, thereby avoiding the unwanted
swapping out of data from foreground processes.

I evaluated cache consumption using a script called
run.bash. Initially, run.bash creates many small files, each 8 KB in
size. It then copies these files using the cp command, both with and
without the specified option. Finally, it reports the difference in
the total size of the caches before and after the copying process.

run.bash:

 #!/bin/bash
 CP=$1
 shift

 [[ -e "$CP" ]] || {
echo "no file found: $CP" 1>&2
exit 1
 }

 N=8
 S=drop-src
 D=${HOME}/drop-dst

 mkdir -p $S
 mkdir -p $D

 start=
 end=
 print_cached()
 {
grep ^Cached: /proc/meminfo
 }

 start()
 {
start=$(print_cached | awk '{print $2}')
 }

 end()
 {
end=$(print_cached | awk '{print $2}')
 }

 report()
 {
echo -n "delta[$N:$1/$2]: "
expr "$end" - "$start"
 }

 cleanup()
 {
local i
local j
for ((i = 0; i < 10; i++)); do
for ((j = 0; j < 10; j++)); do
rm -f $S/F-${i}${j}*
rm -f $D/F-${i}${j}*
done
done
rm -f $S/F-*
rm -f $D/F-*
 }

 prep()
 {
local i
for ((i = 0; i < 1024 * $N; i++ )); do
if ! dd if=/dev/zero of=$S/F-$i bs=4096 count=2 \
status=none; then
echo "failed in dd of=$S/F-$F" 1>&2
exit 1
fi
done
sync
 }

 run_cp()
 {
start

local i
time for ((i = 0; i < 1024 * $N; i++ )); do
if ! "${CP}" "$@" "$S/F-$i" "$D/F-$i"; then
echo "failed in cp " "$@" "$S/F-$i" " $D/F-$i" 1>&2
exit 1
fi
done

end
report "$1" $2
 }

 cleanup
 sync

 prep
 run_cp "$@"

running:

 ~/coreutils/nocache$  ./run.bash ../src/cp

 real   0m16.051s
 user   0m4.249s
 sys0m12.437s
 delta[8:/]: 65548
 ~/coreutils/nocache$  ./run.bash ../src/cp --nocache-source

 real   0m17.109s
 user   0m4.492s
 sys0m13.317s
 delta[8:--nocache-source/]: 620

--nocache-source option suppresses the consumption of the cache
massively.


Thanks for the patch.
I have some reservations/notes though...

There is nothing particularly special about cp, that it might need this option.
I.e. it would be nice to be able to wrap any program so that it streamed
data through the cache, rather than aggressively cached.  I'm not sure how to 
do that,
but also I'd be reluctant to start adding such options to individual commands 
though.
Perhaps Linux' open() may gain an O_STREAM flag in future that might be
more generally applied with a wrapper or something.

For single (large) files, one already has this functionality in dd.

On the write side, you'd also have to worry about syncing, to make the
drop cache advisory effective, and this could impact performance.

Might this drop caches for already cached files,
which cp may just happen to be copying,
thus potentially impacting performance for other programs.

If reflinking we probably would not want to do this operation,
since we're not reading the source.

thanks,
Pádraig

bug#73772: [PATCH] cp,mv: align the descriptions of long options

2024-10-12 Thread Pádraig Brady


On 12/10/2024 17:28, Masatake YAMATO wrote:

* src/cp.c (usage): Adjust white spaces for --update.
* src/mv.c (usage): Ditto.
---
  src/cp.c | 2 +-
  src/mv.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/cp.c b/src/cp.c
index 1c4fd5c20..127b5603f 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -227,7 +227,7 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
-T, --no-target-directorytreat DEST as a normal file\n\
  "), stdout);
fputs (_("\
-  --update[=UPDATE]control which existing files are updated;\n\
+  --update[=UPDATE]control which existing files are updated;\n\
   UPDATE={all,none,none-fail,older(default)}\n\
-u   equivalent to --update[=older].  See below\n\
  "), stdout);
diff --git a/src/mv.c b/src/mv.c
index 806cb3a11..490c99c23 100644
--- a/src/mv.c
+++ b/src/mv.c
@@ -294,7 +294,7 @@ If you specify more than one of -i, -f, -n, only the final 
one takes effect.\n\
-T, --no-target-directorytreat DEST as a normal file\n\
  "), stdout);
fputs (_("\
-  --update[=UPDATE]control which existing files are updated;\n\
+  --update[=UPDATE]control which existing files are updated;\n\
   UPDATE={all,none,none-fail,older(default)}\n\
-u   equivalent to --update[=older].  See below\n\
  "), stdout);


Pushed.

thanks!
Pádraig

bug#73418: ls (GNU coreutils) 9.4 is extremely slower than ls (GNU coreutils) 8.32 listing files on a cifs mounted share

2024-10-06 Thread Pádraig Brady


On 06/10/2024 18:53, Bernhard Voelker wrote:

On 10/3/24 08:04, Paul Eggert wrote:

I installed the attached (plus some other refactorings) to do that.


I don't see test-suite failures anymore, thanks.

  > [PATCH] ls: tune indicator_name

This change makes 'check-ls-dircolors' and therefore 'distcheck' fail.
The attached fixes the check.


+1

thanks!

Pádraig

bug#73418: ls (GNU coreutils) 9.4 is extremely slower than ls (GNU coreutils) 8.32 listing files on a cifs mounted share

2024-10-03 Thread Pádraig Brady


On 03/10/2024 07:39, Paul Eggert wrote:

On 2024-10-02 15:14, Pádraig Brady wrote:


I notice that tests/ls/getxattr-speedup.sh is failing now,
which I've not looked into yet.


I'm not seeing that after the changes I committed this evening. If you
can still reproduce it please let us know the details.


I can still repro.
I'm on BTRFS though I don't think that matters for this test.
I suspect the caching is being bypassed as we don't return ENOTSUP
in all cases where the underlying lgetxattr() returns ENOTSUP.
Actually in cases where the caching in file_has_aclinfo_cache()
is effective, and file_has_aclinfo() is bypassed, we get UMR
due to the uninitialized aclinfo struct.
So the whole interaction with file_has_aclinfo_cache() needs looking at.

If I remove the caching I also see issues with symlink handling,
where security contexts aren't read for symlinks.
Comparing system ls, with latest:
  $ ls -lZ INSTALL
  lrwxrwxrwx. 1 padraig padraig unconfined_u:object_r:user_home_t:s0 ...

  $ src/ls -lZ INSTALL
  lrwxrwxrwx 1 padraig padraig ? ...

I pushed two other fixes:

1. Since tests/ls/no-cap.sh is now always skipped.
I adjusted the test to avoid that.

2. ls -Z always output an error, so fixed up the
condition around the error() call in that case.

cheers,
Pádraig

bug#73418: ls (GNU coreutils) 9.4 is extremely slower than ls (GNU coreutils) 8.32 listing files on a cifs mounted share

2024-10-02 Thread Pádraig Brady


On 30/09/2024 06:24, Paul Eggert wrote:

Looking at the current coreutils source, I noticed that 'ls' called
getxattr when it didn't need to. I installed the attached patch to fix
some of the issue; more could be done and perhaps I'll find the time.
Among other things this patch should cause GNU ls to use llistxattr
instead of listxattr which may make a difference.


I notice that tests/ls/getxattr-speedup.sh is failing now,
which I've not looked into yet.

cheers,
Pádraig

bug#73418: ls (GNU coreutils) 9.4 is extremely slower than ls (GNU coreutils) 8.32 listing files on a cifs mounted share

2024-10-02 Thread Pádraig Brady


On 30/09/2024 09:26, Bernhard Voelker wrote:

On 9/30/24 7:24 AM, Paul Eggert wrote:

I installed the attached patch to fix some of the issue


There's a new test failure for a require_root_ test here:

FAIL: tests/ls/capability
=


That should be fixed by the attached.

thanks,
PádraigFrom 58df92259fac2a8613bbd8df03c61ef5c37dc436 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Wed, 2 Oct 2024 22:57:16 +0100
Subject: [PATCH] ls: reinstate capability checking in more cases

The recent commit v9.5-119-g4ce432ad8 restricted capability checking
to only files with XATTR_NAME_CAPS set.  If this is done then we need
to adjust tests/ls/no-cap.sh so that it doesn't always skip.  More
problematically XATTR_NAME_CAPS was only determined in long listing
mode, thus breaking capability coloring in short listing mode
as evidenced by the failing tests/ls/capability.sh test.

Note capability checking does have a large overhead, but we've
disabled capability checking by default anyway through the default
color configuration since v9.0-187-g6b5134770

So for these reasons revert to checking capabilities as before.

* src/ls.c (gobble_file): Check for capabilities in all modes
if enabled in color config.
---
 src/ls.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/src/ls.c b/src/ls.c
index 099893f86..8d0ae11f4 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -3484,6 +3484,12 @@ gobble_file (char const *name, enum filetype type, ino_t inode,
 
   f->stat_ok = true;
 
+  /* has_capability adds around 30% runtime to 'ls --color',
+  so call it only if really needed.  */
+  if ((type == normal || S_ISREG (f->stat.st_mode))
+  && print_with_color && is_colored (C_CAP))
+f->has_capability = has_capability_cache (full_name, f);
+
   if (format == long_format || print_scontext)
 {
   struct aclinfo ai;
@@ -3512,14 +3518,6 @@ gobble_file (char const *name, enum filetype type, ino_t inode,
 error (0, ai.scontext_err, "%s", quotef (full_name));
 }
 
-  /* has_capability adds around 30% runtime to 'ls --color',
- so call it only if really needed.  */
-  if (0 < ai.size
-  && (type == normal || S_ISREG (f->stat.st_mode))
-  && print_with_color && is_colored (C_CAP)
-  && aclinfo_has_xattr (&ai, XATTR_NAME_CAPS))
-f->has_capability = has_capability_cache (full_name, f);
-
   f->scontext = ai.scontext;
   ai.scontext = nullptr;
   aclinfo_free (&ai);
-- 
2.46.0

bug#73510: Questions on date command.

2024-09-27 Thread Pádraig Brady


On 27/09/2024 08:18, Benjamin Vargin via GNU coreutils Bug Reports wrote:

Hello,

First of all, I would like to thank you for all the works accomplished by your 
teams !!

I'm currently implementing functions in bash which are using the command "date".
I have noticed something strange when I realize manipulation during the 
daylight saving from Winter to Summer times in Europe.

Let me know, if it's the good place to submit my question or where I could do 
it.

In France, this year (2024), we have changed time during the night between the 
30th and 31st of March.
At 2.00 AM (31st march) -- it was --> 3.00 AM (31st march)

So if I do the date operation 30th march 2.30AM + 1 day, I'm expecting 
something like 31st march 3.30AM.

Observation using containers: bash - date (GNU coreutils) 9.5
$ docker run --rm -it docker.io/bash:5.2.32
$ apk update
$ apk add --no-cache coreutils tzdata
$ date --version
date (GNU coreutils) 9.5
$ export TZ="Europe/Paris"
$ date -d "2024-03-30T02:30:00.123+0100 + 1 days" "+%FT%T.%3N%z"
2024-03-31T01:30:00.123+0100

$ date -d "2024-03-30T02:30:00.123+0100 + 24 hours" "+%FT%T.%3N%z"
2024-03-31T03:30:00.123+0200

# Minus looks fine
$ date -d "2024-03-31T03:30:00.123+0200 - 1 days" "+%FT%T.%3N%z"

2024-03-30T02:30:00.123+0100

$ date -d "2024-03-31T03:30:00.123+0200 - 24 hours" "+%FT%T.%3N%z"
2024-03-30T02:30:00.123+0100

The logic behind this is that the 31st is a day with 23 hours only, so 1 day 
was considered as 23 hours even if we didn't include all this day.
$ date -d "2024-03-30T02:30:00.123+0100 + 23 hours" "+%FT%T.%3N%z"
2024-03-31T01:30:00.123+0100

This behavior could be understood when we are taking a more simple case (that's 
the current result):
2024-03-30T13:30:00.123+0100 + 1 day => 2024-03-30T14:30:00.123+0200
But I understand less when we fall in the time shift window.

Question 1: Should 3.30AM be the good result to the operation 
"2024-03-30T02:30:00.123+0100 + 1 days" ?

In addition, I have tested on other virtual environments (sorry I don't have a 
native one :( ) and results are again differents.
OS | Core Utils | +1 day | +24 hours |
docker.io/bash:5.2.26-alpine3.19 | 9.4 | 2024-03-31T01:30:00.123+ | 
2024-03-31T03:30:00.123+0200 |
VM Ubuntu 24.04 | 9.4 | 2024-03-31T04:30:00.123+0200 | 
2024-03-31T03:30:00.123+0200 |

Debian 12.7 | 9.1 | 2024-03-31T04:30:00.123+0200 | 2024-03-31T03:30:00.123+0200 
|
Git Bash | 8.32 | 2024-03-31T01:30:00.123+ | 2024-03-31T01:30:00.123+ |

Question 2: Is there something else than Core Utils which could have an effect 
on what the command date returns ?

The Winter daylight saving looks fine:

$ date -d "2023-10-28T03:30:00.123+0200 +1 days" "+%FT%T.%3N%z"
2023-10-29T02:30:00.123+0100
$ date -d "2023-10-28T03:30:00.123+0200 +24 hours" "+%FT%T.%3N%z"
2023-10-29T02:30:00.123+0100

$ date -d "2023-10-29T03:30:00.123+0100 -1 days" "+%FT%T.%3N%z"
2023-10-28T04:30:00.123+0200
$ date -d "2023-10-29T03:30:00.123+0100 -24 hours" "+%FT%T.%3N%z"
2023-10-28T04:30:00.123+0200

May, I misleading and I did'nt used the good format or others, let me know if 
it's the case.

Thanks in advance for your feedbacks.
Best Regards.


This is a common gotcha which is discussed at:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-date-command-is-not-working-right_002e

In summary currently one needs to anchor the relative adjustment
to it midpoint of the period being adjusted.
So for a +1 day adjustment, you would need to anchor to 12:00.

We may change things in future so the midpoint anchor is automatically chosen.

cheers,
Pádraig

bug#73474: Bug in factor utility of coreutils

2024-09-25 Thread Pádraig Brady


On 25/09/2024 20:55, Paul Eggert wrote:

On 2024-09-25 08:14, Pádraig Brady wrote:

It seems better to move the checks within the macro to avoid any future
issues.
I've also applied the same check to rsh2 in the attached.


That patch doesn't suffice (nor does Artem's) because the count can be
negative. I'll take a further look at this.


Indeed. I had already added asserts just in case, and hit:

  $ src/factor 79228162514264337593543941441  factor: src/factor.c:427: mod2: 
Assertion `0 < cnt' failed.
  Aborted (core dumped)

Note factor does give the correct result on amd64 at least
without the asserts:

  $ factor 79228162514264337593543941441 | sed 's/.*: //; s/ /*/g' | bc
  79228162514264337593543941441

cheers,
Pádraig

bug#73474: Bug in factor utility of coreutils

2024-09-25 Thread Pádraig Brady


On 25/09/2024 11:26, Артем Насонов wrote:

To whom it may concern,

I’m writing to let you know that I found an issue by fuzzing in
coreutils in *factor* utility and want to report it. Here are some details:

1. Host architecture: Host it Debian x86_64 architecture
2. factor version: factor (GNU coreutils) 9.5.94-5cecd
3. Affected code area: src/factor.c:425
4. Steps to reproduce:
      Working on commit: 5cecd703e57b2e1301767d82cbe5bb01cae88472

|    export CC="clang-17"
      export CXX="clang++-17"
      export CFLAGS="-fsanitize=address,undefined -g"
      export LDFLAGS="-fsanitize=address,undefined -g"
      export
UBSAN_OPTIONS=halt_on_error=1,abort_on_error=1,print_stacktrace=true,symbolize=true,print_stacktrace=1,report_error_type=1,symbolize=1
      ./bootstrap
      ./configure
      make
      ./src/factor 22022|

5. Bug details: during fuzzing with Undefined-Behaviour sanitizer we've
got the following report:

|//src/factor.c:425:3: runtime error: shift exponent 64 is too large for
64-bit type 'uintmax_t' (aka 'unsigned long')
      #0 0x56332975f862 in mod2
/home/artemiin/Work/crash_confirmation/coreutils/src/factor.c:425:3
      #1 0x56332975ae54 in factor_using_pollard_rho2
coreutils/src/factor.c:1665:12
      #2 0x563329750ab5 in factor  coreutils/src/factor.c:2246:9
      #3 0x56332974eed6 in print_factors_single coreutils/src/factor.c:2454:3
      #4 0x56332974dd4c in print_factors coreutils/src/factor.c:2506:11
      #5 0x56332974d20d in main  coreutils/src/factor.c:2647:15
      #6 0x7fa1933eb249 in __libc_start_call_main
csu/../sysdeps/nptl/libc_start_call_main.h:58:16
      #7 0x7fa1933eb304 in __libc_start_main csu/../csu/libc-start.c:360:3
      #8 0x563329673630 in _start ( coreutils/src/factor+0x59630)
(BuildId: 28b6f8b912cd8e99b886145a922476ec873a438b)/
/|
      During analysis of this report, I found that there are some
numbers, like 22022, which fall into the
function mod2 and produce cnt=0.
      I suppose that you expect lsh2(number, 0) == number (shift by zero
should not change the number). But inside the realization of that macro,
with cnt=0 we have an operation (d0) >> (64 - (cnt)) which stands for
d0>>64. This is generally undefined behavior - on some systems, the
number d0 remains unchanged (but is expected to be 0) and on other
systems it can be zero.
      So, generally, the result of this operation is undefined. Although
the result is correct for the number 22022, it
may not be true for other numbers or architectures.

6. Patch suggestion: I suggest just not to call lsh2 when cnt=0 to avoid
this bug. My patch is in the attachment.

Waiting for your reply,
Best regards,
Nasonov Artem


Nice one!
It seems better to move the checks within the macro to avoid any future issues.
I've also applied the same check to rsh2 in the attached.

I'll apply the attached later.

Marking this as done.

thanks!
Pádraigdiff --git a/src/factor.c b/src/factor.c
index 2649e9fc6..802c05e4f 100644
--- a/src/factor.c
+++ b/src/factor.c
@@ -318,16 +318,16 @@ static void factor (uintmax_t, uintmax_t, struct factors *);
 #endif
 
 #define rsh2(rh, rl, ah, al, cnt)   \
-  do {  \
+  if (cnt) {\
 (rl) = ((ah) << (W_TYPE_SIZE - (cnt))) | ((al) >> (cnt));   \
 (rh) = (ah) >> (cnt);   \
-  } while (0)
+  } else (void) 0
 
 #define lsh2(rh, rl, ah, al, cnt)   \
-  do {  \
+  if (cnt) {\
 (rh) = ((ah) << cnt) | ((al) >> (W_TYPE_SIZE - (cnt))); \
 (rl) = (al) << (cnt);   \
-  } while (0)
+  } else (void) 0
 
 #define ge2(ah, al, bh, bl) \
   ((ah) > (bh) || ((ah) == (bh) && (al) >= (bl)))

bug#72914: Issues in man pages of GNU Coreutils

2024-09-23 Thread Pádraig Brady

On 31/08/2024 12:19, Helge Kreutzmann wrote:

Dear Coreutils maintainer,
the manpage-l10n project maintains a large number of translations of
man pages both from a large variety of sources (including coreutils) as
well for a large variety of target languages.

During their work translators notice different possible issues in the
original (english) man pages. Sometimes this is a straightforward
typo, sometimes a hard to read sentence, sometimes this is a
convention not held up and sometimes we simply do not understand the
original.

We use several distributions as sources and update regularly (at
least every 2 month). This means we are fairly recent (some
distributions like archlinux also update frequently) but might miss
the latest upstream version once in a while, so the error might be
already fixed. We apologize and ask you to close the issue immediately
if this should be the case, but given the huge volume of projects and
the very limited number of volunteers we are not able to double check
each and every issue.

Secondly we translators see the manpages in the neutral po format,
i.e. converted and harmonized, but not the original source (be it man,
groff, xml or other). So we cannot provide a true patch (where
possible), but only an approximation which you need to convert into
your source format.

Finally the issues I'm reporting have accumulated over time and are
not always discovered by me, so sometimes my description of the
problem my be a bit limited - do not hesitate to ask so we can clarify
them.

I'm now reporting the errors for your project. If future reports
should use another channel, please let me know.

Man page: chgrp.1
Issue:values → values.

"use RFILE's ownership rather than specifying values RFILE is always"
"dereferenced if a symbolic link."

> --
> Man page: chown.1
> Issue:values → values.
>
> "use RFILE's ownership rather than specifying values RFILE is always"
> "dereferenced if a symbolic link."
> --

Above already fixed (but not yet released) in:
https://github.com/coreutils/coreutils/commit/e8c2b921c

--
Man page: chgrp.1
Issue:'-P' → B<-P>

"The following options modify how a hierarchy is traversed when the B<-R>"
"option is also specified.  If more than one is specified, only the final one"
"takes effect. '-P' is the default."
--
Man page: chmod.1
Issue:'-H' → B<-H>

"The following options modify how a hierarchy is traversed when the B<-R>"
"option is also specified.  If more than one is specified, only the final one"
"takes effect. '-H' is the default."
--
Man page: chown.1
Issue:'-P' → B<-P>

"The following options modify how a hierarchy is traversed when the B<-R>"
"option is also specified.  If more than one is specified, only the final one"
"takes effect. '-P' is the default."

Fixed in the just pushed:
https://github.com/coreutils/coreutils/commit/afab48f23

--
Man page: echo.1
Issue:'printf' → B(1)

"Consider using the 'printf' command instead, as it avoids problems when"
"outputting option-like strings."

Fixed in the just pushed:
https://github.com/coreutils/coreutils/commit/7337076ec

--
Man page: env.1
Issue:Missing line break before "usage"

"-S/--split-string usage in scripts"

I think this is OK, however,
the confusion probably stems from the section header [OPTIONS].
I've adjusted this to the more descriptive: [SCRIPT OPTION HANDLING],
and also removed some overly detailed info from the man page.
https://github.com/coreutils/coreutils/commit/4da7daa01

--
Man page: expand.1
Issue:   Missing full stop

"use comma separated list of tab positions.  The last specified position can"
"be prefixed with '/' to specify a tab size to use after the last explicitly"
"specified tab stop.  Also a prefix of '+' can be used to align remaining tab"
"stops relative to the last specified tab stop instead of the first column"
--
Man page: unexpand.1
Issue:Missing full stop

"use comma separated list of tab positions.  The last specified position can"
"be prefixed with '/' to specify a tab size to use after the last explicitly"
"specified tab stop.  Also a prefix of '+' can be used to align remaining tab"
"stops relative to the last specified tab stop instead of the first column"

These follows the pattern of the last sentence in an option description
_not_ having a full stop.  There were a few cases where we weren't
consistent with that pattern actually, which are now fixed up in:
https://github.com/coreutils/coreutils/commit/ac5213acb

> --
> Man page: sort.1.po
> Issue:"manual" is very vague, this is the manual? Be more precise, e.g. 
B(1)
>
> "compare according to string numerical value; see manual for which strings"
> "are supported"

Fixed in the just pushed:
https://github.com/coreutils/coreutils/commit/9e5274cd1be

Marking this as done.

Many thanks!

Pádraig.

bug#73418: ls (GNU coreutils) 9.4 is extremely slower than ls (GNU coreutils) 8.32 listing files on a cifs mounted share

2024-09-22 Thread Pádraig Brady


On 22/09/2024 07:21, Gian Domenico Bonazzoli wrote:

Hi all,

During an upgrade from Ubuntu 22.04 to Ubuntu 24.04 we discovered that "ls
(GNU coreutils) 9.4" is extremely slower than "ls (GNU coreutils) 8.32"
listing files on a cifs mounted share.

Facts to help you in troubleshooting the problem.

In the same Ubuntu 24.04 machine I have the vanilla executable in
/usr/bin/ls and a copy of Ubuntu 22.04 in /tmp/ls

root@kube-node04:~# /usr/bin/ls --version
ls (GNU coreutils) 9.4

root@kube-node04:~# /tmp/ls --version
ls (GNU coreutils) 8.32

I have also a cifs share mounted by:

//172.16.0.100/export /cifs/colom/export cifs
vers=2.0,uid=5,username=teledba,password=* 0 1

This is what happens:

root@kube-node04:~# time /tmp/ls -l  /cifs/colom/export >/dev/null
real 0m0.232s
user 0m0.027s
sys 0m0.027s


root@kube-node04:~# time /usr/bin/ls -l  /cifs/colom/export >/dev/null
real 0m5.744s
user 0m0.068s
sys 0m0.349s

So "ls -l" in 9.4 is 50 time slower then "ls -l" in 8.32

If I launch the ls command without the -l parameter the computing times are
instead quite similar:

root@kube-node04:~# time /tmp/ls  /cifs/colom/export >/dev/null
real 0m0.198s
user 0m0.015s
sys 0m0.021s

root@kube-node04:~# time /usr/bin/ls  /cifs/colom/export >/dev/null
real 0m0.174s
user 0m0.017s
sys 0m0.016s

looking at the strace output I saw that the new version is doing for each
file processed the adjunctive call listxattr (in my case the directory
contains 14500 files):

...
statx(AT_FDCWD, "/cifs/colom/export/ARTP000.CSV",
AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT,
STATX_MODE|STATX_NLINK|STATX_UID|STATX_GID|STATX_MTIME|STATX_SIZE,
{stx_mask=STATX_TYPE|STATX_MODE|STATX_NLINK|STATX_UID|STATX_GID|STATX_MTIME|STATX_CTIME|STATX_INO|STATX_SIZE|STATX_BLOCKS|STATX_BTIME|STATX_MNT_ID,
stx_attributes=0, stx_mode=S_IFREG|0755, stx_size=840, ...}) = 0
listxattr("/cifs/colom/export/ARTP000.CSV", "", 152) = 0
...

maybe it is the root cause of the elapsed time when the ls command had the
"-l" option fired ?

Regards


You listxattr() above doesn't seem to be returning an error,
but perhaps this is another manifestation of
https://bugs.gnu.org/68283 ?

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-13 Thread Pádraig Brady


On 13/09/2024 13:55, Pádraig Brady wrote:

On 12/09/2024 20:33, Paul Eggert wrote:

On 2024-09-12 12:03, Pádraig Brady wrote:



This is tricky enough, that we should be as restrictive as possible here,
so I may resort to strspn(f, "0123456789") to parse instead.
I'll think a bit about it.


The code's also assuming INT_MAX < INTMAX_MAX, which POSIX doesn't
require. You could put in a static_assert to that effect, I suppose, to
document the assumption.


Indeed. We would have incorrectly taken the INTMAX_MAX arg in the
(albeit unlikely) case where INT_MAX >= INTMAX_MAX,
and the provided number overflowed INTMAX_MAX.
To be explicit, strtol() doesn't return 0 in that case,
so we need to check overflow (like Colin suggested).


More important, though, if you're not in the C locale all bets are off
as far as what strtoimax will also parse.

When I ran into this problem with GNU tar, I ended by giving up on
strtoimax and did my own little integer parser. It does exactly what I
want and I don't have to fire up the strtoimax complexity+locale engine.


Right, it's best to preparse for the above reason,
and to avoid any confusion re leading spaces etc. like I previously mentioned.

The attached adjustment does the preparse with strspn(),
and only does the strtoimax() for appropriate strings.


I adjusted and pushed a further simplification that
clamps %$ to %INT_MAX$, which is equivalent
as argc can practically only be <= INT_MAX - 2.
That simplifies the strtoimax() error handling,
and removes any limits on valid decimal numbers.

cheers,
Pádraig

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-13 Thread Pádraig Brady


On 12/09/2024 20:33, Paul Eggert wrote:

On 2024-09-12 12:03, Pádraig Brady wrote:



This is tricky enough, that we should be as restrictive as possible here,
so I may resort to strspn(f, "0123456789") to parse instead.
I'll think a bit about it.


The code's also assuming INT_MAX < INTMAX_MAX, which POSIX doesn't
require. You could put in a static_assert to that effect, I suppose, to
document the assumption.


Indeed. We would have incorrectly taken the INTMAX_MAX arg in the
(albeit unlikely) case where INT_MAX >= INTMAX_MAX,
and the provided number overflowed INTMAX_MAX.
To be explicit, strtol() doesn't return 0 in that case,
so we need to check overflow (like Colin suggested).


More important, though, if you're not in the C locale all bets are off
as far as what strtoimax will also parse.

When I ran into this problem with GNU tar, I ended by giving up on
strtoimax and did my own little integer parser. It does exactly what I
want and I don't have to fire up the strtoimax complexity+locale engine.


Right, it's best to preparse for the above reason,
and to avoid any confusion re leading spaces etc. like I previously mentioned.

The attached adjustment does the preparse with strspn(),
and only does the strtoimax() for appropriate strings.

cheers,
Pádraigdiff --git a/src/printf.c b/src/printf.c
index 37844f14a..442935705 100644
--- a/src/printf.c
+++ b/src/printf.c
@@ -454,14 +454,23 @@ print_formatted (char const *format, int argc, char **argv)
 
 #define GET_CURR_ARG(POS)\
 do {			\
-  char *arge;		\
-  intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);	\
-  if (0 < arg && arg <= INT_MAX && *arge == '$')	\
+  intmax_t arg = 0;	\
+  size_t argl;		\
+  /* Check with strspn() first to avoid spaces etc.	\
+ This also avoids any locale ambiguities.  */	\
+  if (POS != 3 && (argl = strspn (f, "0123456789"))	\
+  && f[argl] == '$')\
+{			\
+  errno = 0;	\
+  arg = strtoimax (f, NULL, 10);			\
+  if (errno)	\
+arg = 0;	\
+}			\
+  if (1 <= arg && arg <= INT_MAX)			\
 /* Process indexed %i$ format.  */			\
-/* Note '$' comes before any flags.  */		\
 {			\
   SET_CURR_ARG (arg - 1);\
-  f = arge + 1;	\
+  f += argl + 1;	\
   if (POS == 0)	\
 direc_arg = arg - 1;\
 }			\
diff --git a/tests/printf/printf-indexed.sh b/tests/printf/printf-indexed.sh
index d0f799ab1..5d63d68f8 100755
--- a/tests/printf/printf-indexed.sh
+++ b/tests/printf/printf-indexed.sh
@@ -75,19 +75,28 @@ printf_check '   1' '%2$*1$d\n' 4 1
 # Indexed arg, and sequential width
 printf_check '   1' '%2$*d\n' 4 1
 
-# Flags come after $ (0 is not a flag here):
+# Flags come after $ (0 is not a flag here but allowed):
 printf_check '   1' '%01$4d\n' 1
 # Flags come after $ (0 is a flag here):
 printf_check '0001' '%1$0*2$d\n' 1 4
 # Flags come after $ (-2 not taken as a valid index here):
 printf_check_err 'printf: %-2$: invalid conversion specification' \
  '%-2$s %1$s\n' A B
+# Flags come after $ (' ' is not allowed as part of number here)
+printf_check_err 'printf: % 2$: invalid conversion specification' \
+ '% 2$s %1$s\n' A B
 
 # Ensure only base 10 numbers are accepted
 printf_check_err "printf: 'A': expected a numeric value" \
  '%0x2$s %2$s\n' A B
+# Ensure empty numbers are rejected
+printf_check_err 'printf: %$: invalid conversion specification' \
+ '%$d\n' 1
 # Verify int limits (avoiding comparisons with argc etc.)
 printf_check_err "printf: %${INT_OFLOW}\$: invalid conversion specification" \
  "%${INT_OFLOW}\$d\n" 1
+# Verify intmax limits, ensuring overflow detected
+printf_check_err "printf: %${INTMAX_OFLOW}\$: invalid conversion specification" \
+ "%${INTMAX_OFLOW}\$d\n" 1
 
 Exit $fail

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-12 Thread Pádraig Brady


On 12/09/2024 18:40, Collin Funk wrote:

Hi Pádraig,

Pádraig Brady  writes:


I'll apply the attached sometime tomorrow.

Marking this as done.


Patch looks good, thanks.

One small comment, though.


+#define GET_CURR_ARG(POS)  \
+do {   \
+  char *arge;  \
+  intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);\
+  if (0 < arg && arg <= INT_MAX && *arge == '$') \
+/* Process indexed %i$ format.  */ \
+/* Note '$' comes before any flags.  */\


Shouldn't you check errno here, like:

   char *arge;
   errno = 0;
   intmax_t arg = POS==3 ? 0 : strtoimax (f, &arge, 10);
   if (errno == 0 && 0 < arg && arg <= INT_MAX && *arge == '$')
   [...]

I think that would handle all bad cases.

For example, I think "%$" might return 0 but set errno to EINVAL.


A fair point, but note we only accept 1 ... INT_MAX,
so that implicitly excludes any of the possible error returns.
I should at least add a comment.

Though it got me thinking that strtol() may be too lenient
in what it accepts, resulting in possible confusion for users.
For example some printf flags like ' ' or '+' would be accepted
as part of a number, when ideally they should not be.

For example, the user might do:

$ printf '[% 1$d]\n' 1234
[1234]

When they really intended:

$ printf '[%1$ d]\n' 1234
[ 1234]

This is tricky enough, that we should be as restrictive as possible here,
so I may resort to strspn(f, "0123456789") to parse instead.
I'll think a bit about it.

thanks!
Pádraig

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-12 Thread Pádraig Brady


On 12/09/2024 18:06, Bruno Haible wrote:

Pádraig Brady wrote:

I'll apply the attached sometime tomorrow.


Nice! Thank you.

There seems to be a typo in the unit test, though: It defines a shell
function 'printf_checki_err' but the function it then invokes is
'printf_check_err'.


Hah, good catch.
That hid other errors in the test.
Fixed up with the attached.

thanks for the review,
Pádraigdiff --git a/tests/printf/printf-indexed.sh b/tests/printf/printf-indexed.sh
index 1c3a6c380..d0f799ab1 100755
--- a/tests/printf/printf-indexed.sh
+++ b/tests/printf/printf-indexed.sh
@@ -34,14 +34,14 @@ EOF
   compare exp out || fail=1
 }
 
-printf_checki_err() {
+printf_check_err() {
   cat < exp || framework_failure_
 $1
 EOF
 
   shift
 
-  returns_1 $prog "$@" 2> out || fail=1
+  returns_ 1 $prog "$@" 2> out || fail=1
   compare exp out || fail=1
 }
 
@@ -85,9 +85,9 @@ printf_check_err 'printf: %-2$: invalid conversion specification' \
 
 # Ensure only base 10 numbers are accepted
 printf_check_err "printf: 'A': expected a numeric value" \
- '%0x2$s %1$s\n' A B
+ '%0x2$s %2$s\n' A B
 # Verify int limits (avoiding comparisons with argc etc.)
-printf_check_err "printf: %${INT_OFLOW}$: invalid conversion specification" \
- "%${INT_OFLOW}$d\n" 1
+printf_check_err "printf: %${INT_OFLOW}\$: invalid conversion specification" \
+ "%${INT_OFLOW}\$d\n" 1
 
 Exit $fail

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-12 Thread Pádraig Brady


On 09/09/2024 19:30, Pádraig Brady wrote:

On 06/09/2024 15:06, Bruno Haible wrote:

Hi,

POSIX:2024 specifies that printf(1) should support numbered conversion
specifications:
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html
https://austingroupbugs.net/view.php?id=1592

Could this support please be added to GNU coreutils? As of coreutils 9.5,
I still get:

$ /usr/bin/printf 'abc%2$sdef%1$sxxx\n' 1 2
abc/usr/bin/printf: %2$: invalid conversion specification



This make sense to implement.
I see ksh and FreeBSD at least, already have.
I'll have a look at doing this.


I'll apply the attached sometime tomorrow.

Marking this as done.

cheers,
PádraigFrom 97e55c7ace9e9e46a32faa0d592870983a14367b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Wed, 11 Sep 2024 16:07:48 +0100
Subject: [PATCH] printf: add indexed argument support

* src/printf.c (print_formatted): Add support for %i$ indexed args.
* tests/printf/printf-indexed.sh: Add a new file of test cases.
* tests/local.mk: Reference the new test file.
* doc/coreutils.texi (printf invocation): Mention how mixed
processing of indexed and sequential references are supported,
unlike the printf(2) library function.
* NEWS: Mention the new feature.
These are specified in POSIX:2024.
Addresses https://bugs.gnu.org/73068
---
 NEWS   |   4 +
 doc/coreutils.texi |   6 ++
 src/printf.c   | 179 -
 tests/local.mk |   1 +
 tests/printf/printf-indexed.sh |  93 +
 5 files changed, 215 insertions(+), 68 deletions(-)
 create mode 100755 tests/printf/printf-indexed.sh

diff --git a/NEWS b/NEWS
index e1d3f82d1..6094de8d2 100644
--- a/NEWS
+++ b/NEWS
@@ -40,6 +40,10 @@ GNU coreutils NEWS-*- outline -*-
   ls now supports the --sort=name option,
   to explicitly select the default operation of sorting by file name.
 
+  printf now supports indexed arguments, using the POSIX:2024 specified
+  %i$ format, where 'i' is an integer referencing a particular argument,
+  thus allowing repetition or reordering of printf arguments.
+
 ** Improvements
 
   'head -c NUM', 'head -n NUM', 'nl -l NUM', 'nproc --ignore NUM',
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 58b425779..9fe953587 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -13429,6 +13429,12 @@ Missing @var{argument}s are treated as null strings or as zeros,
 depending on whether the context expects a string or a number.  For
 example, the command @samp{printf %sx%d} prints @samp{x0}.
 
+@item
+Indexed arguments referenced with @samp{%i$} formats, can be
+mixed with standard sequential argument references,
+in which case both index types are independent.
+For example, the command @samp{printf '%1$s%s' A} prints @samp{AA},
+
 @item
 @kindex \c
 An additional escape, @samp{\c}, causes @command{printf} to produce no
diff --git a/src/printf.c b/src/printf.c
index de3507925..b1c7dd561 100644
--- a/src/printf.c
+++ b/src/printf.c
@@ -291,15 +291,13 @@ print_esc_string (char const *str)
 }
 
 /* Evaluate a printf conversion specification.  START is the start of
-   the directive, LENGTH is its length, and CONVERSION specifies the
-   type of conversion.  LENGTH does not include any length modifier or
-   the conversion specifier itself.  FIELD_WIDTH and PRECISION are the
-   field width and precision for '*' values, if HAVE_FIELD_WIDTH and
-   HAVE_PRECISION are true, respectively.  ARGUMENT is the argument to
-   be formatted.  */
+   the directive, and CONVERSION specifies the type of conversion.
+   FIELD_WIDTH and PRECISION are the field width and precision for '*'
+   values, if HAVE_FIELD_WIDTH and HAVE_PRECISION are true, respectively.
+   ARGUMENT is the argument to be formatted.  */
 
 static void
-print_direc (char const *start, size_t length, char conversion,
+print_direc (char const *start, char conversion,
  bool have_field_width, int field_width,
  bool have_precision, int precision,
  char const *argument)
@@ -333,6 +331,7 @@ print_direc (char const *start, size_t length, char conversion,
 break;
   }
 
+size_t length = strlen (start);
 p = xmalloc (length + length_modifier_len + 2);
 q = mempcpy (p, start, length);
 q = mempcpy (q, length_modifier, length_modifier_len);
@@ -448,50 +447,92 @@ print_direc (char const *start, size_t length, char conversion,
 static int
 print_formatted (char const *format, int argc, char **argv)
 {
-  int save_argc = argc;		/* Preserve original value.  */
+
+/* Set curr_arg from indexed %i$ or otherwise next in sequence.
+   POS can be 0,1,2,3 corresponding to
+   [%][width][.precision][conversion] respectively.  */
+
+#define GET_CURR_ARG(POS)\
+do {			\
+  char *arge;		\

bug#73194: ls command converts utf-8 character into escape sequences

2024-09-12 Thread Pádraig Brady


On 12/09/2024 11:16, Simon Wolfe wrote:

I have one file name that uses Unicode character U+318DF, which is in the 
tertiary pane, more precisely CJK Unified Ideographs Extension H.

touch 𱣟
ls

returns:

''$'\360\261\243\237'

Extension H was introduced in Unicode 15.0 in 2022.

I also notice that this bug occurs with any character with Extension I 
(introduced in 2023).

Extension G seems to works okay.


ls 9.4 works as expected for me with glibc-2.39 in a UTF-8 locale.
I.e. that file is displayed directly.
Now if I set the locale to non UTF-8 it will display the form above
(which works on all locales BTW).

  $ touch ''$'\360\261\243\237'
  $ ls ''$'\360\261\243\237'
  𱣟
  $ LC_ALL=C ls ''$'\360\261\243\237'
  ''$'\360\261\243\237'

So I suspect your system libs are not updated to recognize this character,
hence the fallback format is used.

cheers,
Pádraig.

bug#73068: printf: please implement POSIX:2024 argument reordering

2024-09-09 Thread Pádraig Brady


On 06/09/2024 15:06, Bruno Haible wrote:

Hi,

POSIX:2024 specifies that printf(1) should support numbered conversion
specifications:
https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html
https://austingroupbugs.net/view.php?id=1592

Could this support please be added to GNU coreutils? As of coreutils 9.5,
I still get:

   $ /usr/bin/printf 'abc%2$sdef%1$sxxx\n' 1 2
   abc/usr/bin/printf: %2$: invalid conversion specification



This make sense to implement.
I see ksh and FreeBSD at least, already have.
I'll have a look at doing this.

thank you,
Pádraig

bug#72842: Compile error due to lto warning on Arch Linux -Werror=maybe-uninitialized

2024-08-28 Thread Pádraig Brady


On 27/08/2024 22:34, Paul Eggert wrote:

I'm not seeing that false alarm when building coreutils v9.5 on Ubuntu
24.04 LTS with "./configure --enable-gcc-warnings CC=gcc-14".

What GCC version are you using? If it's not GCC 14, try upgrading. I'm
using gcc-14 (Ubuntu 14-20240412-0ubuntu1) 14.0.1 20240412
(experimental) [master r14-9935-g67e1433a94f].


I think LTO is key here, as GCC has more context then to produce warnings.
With `-march=native -O3 -flto` on GCC 14.2.1 I see,
2 false warnings in gnulib (patch attached).
1 correct warning in coreutils (patch attached).

Marking this as done.
I'll apply the patches later today.

cheers,
Pádraig.From 264edcf263bff02aebfe67f1d7343fbf0e96b5de Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Wed, 28 Aug 2024 12:10:43 +0100
Subject: [PATCH] avoid GCC -Wmaybe-uninitialized false positives with LTO

Avoids false warnings with GCC 14.2.1 with -flto

* lib/canonicalize.c: Initialize END_IDX.
* lib/getndelim2.c: Initicalise C.
---
 ChangeLog  | 8 
 lib/canonicalize.c | 9 -
 lib/getndelim2.c   | 8 +---
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8a7f812b67..fdcc79134e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2024-08-28  Pádraig Brady  
+
+	avoid GCC -Wmaybe-uninitialized false positives with LTO
+	Avoids false warnings with GCC 14.2.1 with -flto
+
+	* lib/canonicalize.c: Initialize END_IDX.
+	* lib/getndelim2.c: Initicalise C.
+
 2024-08-28  Bruno Haible  
 
 	doc: Add more details about O_EXEC and O_SEARCH.
diff --git a/lib/canonicalize.c b/lib/canonicalize.c
index 7d2a629024..2572b40558 100644
--- a/lib/canonicalize.c
+++ b/lib/canonicalize.c
@@ -34,6 +34,13 @@
 #include "hash-triple.h"
 #include "xalloc.h"
 
+/* Suppress bogus GCC -Wmaybe-uninitialized warnings.  */
+#if defined GCC_LINT || defined lint
+# define IF_LINT(Code) Code
+#else
+# define IF_LINT(Code) /* empty */
+#endif
+
 #ifndef DOUBLE_SLASH_IS_DISTINCT_ROOT
 # define DOUBLE_SLASH_IS_DISTINCT_ROOT false
 #endif
@@ -367,7 +374,7 @@ canonicalize_filename_mode_stk (const char *name, canonicalize_mode_t can_mode,
   buf[n] = '\0';
 
   char *extra_buf = bufs->extra.data;
-  idx_t end_idx;
+  idx_t end_idx IF_LINT (= 0);
   if (end_in_extra_buffer)
 end_idx = end - extra_buf;
   size_t len = strlen (end);
diff --git a/lib/getndelim2.c b/lib/getndelim2.c
index 89989aefdd..db61e2a5e6 100644
--- a/lib/getndelim2.c
+++ b/lib/getndelim2.c
@@ -47,8 +47,10 @@
 #include "memchr2.h"
 
 /* Avoid false GCC warning "'c' may be used uninitialized".  */
-#if _GL_GNUC_PREREQ (4, 7)
-# pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#if defined GCC_LINT || defined lint
+# define IF_LINT(Code) Code
+#else
+# define IF_LINT(Code) /* empty */
 #endif
 
 /* The maximum value that getndelim2 can return without suffering from
@@ -102,7 +104,7 @@ getndelim2 (char **lineptr, size_t *linesize, size_t offset, size_t nmax,
   /* Here always ptr + size == read_pos + nbytes_avail.
  Also nbytes_avail > 0 || size < nmax.  */
 
-  int c;
+  int c IF_LINT (= EOF);
   const char *buffer;
   size_t buffer_len;
 
-- 
2.46.0

From 9b9763e6a7499d78373fb42bac3dc2ee68a14b69 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Wed, 28 Aug 2024 12:33:17 +0100
Subject: [PATCH] all: fix error checking in gl/lib/xdectoint.c

This issue was noticed with -flto on GCC 14.2.1

* gl/lib/xdectoint.c (__xnumtoint): Only inspect the
returned value if LONGINT_INVALID is not set,
as the returned value is uninitialized in that case.
Fixes https://bugs.gnu.org/72842
---
 gl/lib/xdectoint.c | 37 -
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/gl/lib/xdectoint.c b/gl/lib/xdectoint.c
index bb38431f3..05534701c 100644
--- a/gl/lib/xdectoint.c
+++ b/gl/lib/xdectoint.c
@@ -49,24 +49,27 @@ __xnumtoint (char const *n_str, int base, __xdectoint_t min, __xdectoint_t max,
   /* Errno value to report if there is an overflow.  */
   int overflow_errno;
 
-  if (tnum < min)
+  if (s_err != LONGINT_INVALID)
 {
-  r = min;
-  overflow_errno = flags & XTOINT_MIN_RANGE ? ERANGE : EOVERFLOW;
-  if (s_err == LONGINT_OK)
-s_err = LONGINT_OVERFLOW;
-}
-  else if (max < tnum)
-{
-  r = max;
-  overflow_errno = flags & XTOINT_MAX_RANGE ? ERANGE : EOVERFLOW;
-  if (s_err == LONGINT_OK)
-s_err = LONGINT_OVERFLOW;
-}
-  else
-{
-  r = tnum;
-  overflow_errno = EOVERFLOW;
+  if (tnum < min)
+{
+  r = min;
+  overflow_errno = flags & XTOINT_MIN_RANGE ? ERANGE : EOVERFLOW;
+  if (s_err == LONGINT_OK)
+s_err = LONGINT_OVERFLOW;
+}
+

bug#72770: stat: filesystem ID regression in 9.5 with musl libc

2024-08-23 Thread Pádraig Brady


On 23/08/2024 12:39, Natanael Copa wrote:

On Fri, 23 Aug 2024 11:41:28 +0100
Pádraig Brady  wrote:


On 23/08/2024 08:14, Natanael Copa wrote:

With coreutils 9.5 stat -f -c %i returns wrong value with musl libc.

$ docker run --rm -it -v /tmp:/tmp alpine:3.20 sh -c "apk add --quiet coreutils && stat --version 
&& stat -f -c %i /tmp && busybox stat -f -c %i /tmp"
stat (GNU coreutils) 9.5
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Michael Meskes.
4dc59e89
4dc59e89680d9b7c

Coreutils returns 4dc59e89 instead of the expected 4dc59e89680d9b7c.

With 9.4 this works as expected:

$ docker run --rm -it -v /tmp:/tmp alpine:3.19 sh -c "apk add --quiet coreutils && stat --version 
&& stat -f -c %i /tmp && busybox stat -f -c %i /tmp"
stat (GNU coreutils) 9.4
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Michael Meskes.
4dc59e89680d9b7c
4dc59e89680d9b7c

Downstream report: https://gitlab.alpinelinux.org/alpine/aports/-/issues/16386


I can't see what might have changed on the coreutils side with a quick look.
It would help eliminate cases if you provided strace output from 9.4 and 9.5


I think the difference is that with 9.4 it ends up using statfs(2), and
with 9.5 it ends up using statvfs(3).

musl only uses 32 bits of the returned 64 bits from statfs(2) to match the 
POSIX interface.

https://git.musl-libc.org/cgit/musl/tree/src/stat/statvfs.c?id=dd1e63c3638d5f9afb857fccf6ce1415ca5f1b8b#n39

I believe they do so to have consistent fsid on 32 bit and 64 bit architectures.


Interesting. I see musl has been doing this truncation since 2011,
but I guess the coreutils build picked up statvfs() since musl
started returning f_type in it:
https://git.musl-libc.org/cgit/musl/commit/src/stat/statvfs.c?id=7291c6c6
I suppose we could adjust m4/stat-prog.m4 in coreutils to
avoid statvfs detection on musl

It would be worth suggesting to the musl folks to avoid this truncation.

thanks,
Pádraig

bug#72770: stat: filesystem ID regression in 9.5 with musl libc

2024-08-23 Thread Pádraig Brady


On 23/08/2024 08:14, Natanael Copa wrote:

With coreutils 9.5 stat -f -c %i returns wrong value with musl libc.

$ docker run --rm -it -v /tmp:/tmp alpine:3.20 sh -c "apk add --quiet coreutils && stat --version 
&& stat -f -c %i /tmp && busybox stat -f -c %i /tmp"
stat (GNU coreutils) 9.5
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later .
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Michael Meskes.
4dc59e89
4dc59e89680d9b7c

Coreutils returns 4dc59e89 instead of the expected 4dc59e89680d9b7c.

With 9.4 this works as expected:

$ docker run --rm -it -v /tmp:/tmp alpine:3.19 sh -c "apk add --quiet coreutils && stat --version 
&& stat -f -c %i /tmp && busybox stat -f -c %i /tmp"
stat (GNU coreutils) 9.4
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later .
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Michael Meskes.
4dc59e89680d9b7c
4dc59e89680d9b7c

Downstream report: https://gitlab.alpinelinux.org/alpine/aports/-/issues/16386


I can't see what might have changed on the coreutils side with a quick look.
It would help eliminate cases if you provided strace output from 9.4 and 9.5

thanks,
Pádraig

bug#72707: install: need_copy (for "-C") should use stat instead of lstat

2024-08-19 Thread Pádraig Brady


On 19/08/2024 03:44, Frank Heckenbach wrote:

install dereferences symlinks given as sources (and I think that's
good). The "-C" option, however, uses lstat rather than
stat in need_copy and returns true if !S_ISREG.

So the (dereferenced) file will always be copied despite "-C".

Even worse, since "-p" cannot be combined with "-C", the installed
file will always get a new timestamp which may trigger rebuilds
(e.g. if the file is a header) even if nothing was changed, e.g.:

% touch a
% ln -s a b
% install -C b c
% ls --time-style=+%T -l
total 0
-rw--- 1 frank frank 0 23:46:51 a
lrwxrwxrwx 1 frank frank 1 23:46:54 b -> a
-rwxr-xr-x 1 frank frank 0 23:46:58 c
% install -C b c
% ls --time-style=+%T -l
total 0
-rw--- 1 frank frank 0 23:46:51 a
lrwxrwxrwx 1 frank frank 1 23:46:54 b -> a
-rwxr-xr-x 1 frank frank 0 23:47:06 c
% install -C b c
% ls --time-style=+%T -l
total 0
-rw--- 1 frank frank 0 23:46:51 a
lrwxrwxrwx 1 frank frank 1 23:46:54 b -> a
-rwxr-xr-x 1 frank frank 0 23:47:08 c
%

I asked Kamil Dudka who implemented "-C" originally and he pointed
out that requiring both the files to be regular files was suggested
by Jim Meyering in the review:
https://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00106.html

However, I don't see a real reason there, only his question:
"Have you considered requiring that both files be `regular', too?"

I don't know if install didn't dereference symlinks back then, but
now that it does, it seems consistent to dereference them in the
check as well, i.e. use stat rather than lstat.


I agree with your arguments to change this.

I'm not sure if the suggestion above about regular files,
was meant to preclude symlinks.

Marking this as done,
and I'll apply the attached later.

thanks,
PádraigFrom 692ae740939cb356d7a16cb1e17d959e68d1fa91 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Mon, 19 Aug 2024 12:43:09 +0100
Subject: [PATCH] install: dereference source symlinks when comparing

* NEWS: Mention the change in behavior.
* src/install.c (need_copy): s/lstat/stat/ for the source.
* tests/install/install-C.sh: Add test cases
(and improve existing test case which wan't valid
due to the existing non standard modes on test files).
Addresses https://bugs.gnu.org/72707
---
 NEWS   |  3 +++
 src/install.c  |  2 +-
 tests/install/install-C.sh | 12 +++-
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 65cc3fde0..e1d3f82d1 100644
--- a/NEWS
+++ b/NEWS
@@ -24,6 +24,9 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Changes in behavior
 
+  install -C now dereferences symlink sources when comparing,
+  rather than always treating as different and performing the copy.
+
   ls's -f option now simply acts like -aU, instead of also ignoring
   some earlier options.  For example 'ls -fl' and 'ls -lf' are now
   equivalent because -f no longer ignores an earlier -l.  The new
diff --git a/src/install.c b/src/install.c
index 939db9cd4..85f2d9e11 100644
--- a/src/install.c
+++ b/src/install.c
@@ -177,7 +177,7 @@ need_copy (char const *src_name, char const *dest_name,
 return true;
 
   /* compare files using stat */
-  if (lstat (src_name, &src_sb) != 0)
+  if (stat (src_name, &src_sb) != 0)
 return true;
 
   if (fstatat (dest_dirfd, dest_relname, &dest_sb, AT_SYMLINK_NOFOLLOW) != 0)
diff --git a/tests/install/install-C.sh b/tests/install/install-C.sh
index a6b332e14..ee3ee2467 100755
--- a/tests/install/install-C.sh
+++ b/tests/install/install-C.sh
@@ -79,8 +79,18 @@ ginstall -Cv -m$mode3 a b > out || fail=1
 compare out out_installed_second || fail=1
 
 # files are not regular files
+ginstall -v -m$mode1 a b > out || fail=1  # reset to regular mode
+compare out out_installed_second || fail=1
+# symlink source is always dereferenced (and so regular here)
+ginstall -v -m$mode1 a d > out || fail=1  # create regular dest
+echo "'a' -> 'd'" > out_installed_first_ad || framework_failure_
+compare out out_installed_first_ad || fail=1
 ln -s a c || framework_failure_
-ln -s b d || framework_failure_
+ginstall -Cv -m$mode1 c d > out || fail=1
+compare out out_empty || fail=1
+# symlink (non regular) dest is always replaced
+ln -nsf b d || framework_failure_
+ls -l a b c d > /tmp/pb.out
 ginstall -Cv -m$mode1 c d > out || fail=1
 echo "removed 'd'
 'c' -> 'd'" > out_installed_second_cd
-- 
2.45.2

bug#72617: sort -n loses lines.

2024-08-14 Thread Pádraig Brady


On 14/08/2024 11:04, Simon B wrote:

Hi Pádraig

I am largely satisfied by your great explanation,

I am still confused why lines go "missing" though.




Even if the dot is being interpreted, it still should not lose the
line containing 172.169.6.164


This is actually well explained in the online documentation,
so I won't repeat here.  See the --unique description at:
https://www.gnu.org/software/coreutils/sort

cheers,
Pádraig

bug#72617: sort -n loses lines.

2024-08-14 Thread Pádraig Brady


tag 72617 notabug
close 72617
stop

On 14/08/2024 09:43, Simon B wrote:

Hallo,

The output of my grep command is:

# grep -i "sshd" /root/access.report | egrep -o
  
'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-
9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
64.227.127.122
172.169.5.249
172.169.6.164


Adding the --debug option shows the issue.
I.e. the '.' being considered as part of a number:

$ sort --debug -rbn -s ips
sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
sort: note numbers use ‘.’ as a decimal point in this locale
178.128.44.128
___
172.169.6.164
___
172.169.5.249
___


Taking the example for sorting IPv4 addresses from the manual,
shows the desired comparisons being performed:

$ sort --debug -t '.' -k 1,1rn -k 2,2rn -k 3,3rn -k 4,4rn -u ips
sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
sort: numbers use ‘.’ as a decimal point in this locale
178.128.44.128
___
___
__
   ___
172.169.6.164
___
___
_
  ___
172.169.5.249
___
___
_
  ___


cheers,
Pádraig

bug#72567: printf: Incorrect description of "%b"

2024-08-11 Thread Pádraig Brady


forcemerge 72567 72568
close 72567
stop

On 11/08/2024 04:05, Keith Thompson wrote:

There are three different descriptions of the printf(1) "%b" format:

- printf --help
- man printf.1
- info coreutils printf

As of release 9.5 and the latest version in git (Thu 2024-08-08
c5725c8c4), the first two incorrectly say that octal escapes are
of the form \0 or \0NNN. In fact the \0 can be followed by 0 to 3
octal digits.


Right. The main point there really is that a leading 0 is the exception.
(A leading 0 is required by ksh for example, but optional elsewhere).
I've pushed a change along those lines at:
https://github.com/coreutils/coreutils/commit/296cc3ed9

thanks,
Pádraig

bug#72446: [PATCH] stat: Fix memleak

2024-08-03 Thread Pádraig Brady


tag 72446 notabug
close 72446
stop

On 03/08/2024 17:10, Dmitry Chestnykh wrote:

format and format2 strings are allocated
by `malloc()` inside `xasprintf` so the memory
should be freed

* src/stat.c: Call `free()` on `format` and `format2`
---
  src/stat.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/stat.c b/src/stat.c
index 1513abfaa..47f3b5052 100644
--- a/src/stat.c
+++ b/src/stat.c
@@ -1974,5 +1974,7 @@ main (int argc, char *argv[])
 ? do_statfs (argv[i], format)
 : do_stat (argv[i], format, format2));
  
+  free(format);

+  free(format2);
main_exit (ok ? EXIT_SUCCESS : EXIT_FAILURE);
  }


Hi,

There are a could of problems with this.
1. It's redundant to free just before exit
2. It's invalid to free this memory with the -c format specified.

valgrind currently shows no leaks with or without -c
(it does show reachable blocks, but that's fine).

thanks,
Pádraig

bug#72232: "make dist" is not reproducible

2024-07-21 Thread Pádraig Brady


On 21/07/2024 18:58, Paul Eggert wrote:

On 2024-07-21 10:44, Pádraig Brady wrote:

We can just rely on the timestamp of the .tarball-version
to support reproducible _tarballs_.


Although this makes the distributed file contents reproducible, the
tarballs themselves are still not reproducible, since they contain
random timestamps and (I imagine) other metadata. I vaguely recall Bruno
having a general solution for that sort of problem but don't recall
where it got put. (I do something more complicated in TZDB.)


Fair enough.  That issue is discussed at:
https://www.gnu.org/software/tar/manual/html_section/Reproducibility.html

For now at least I'll adjust the summary
to say "tarball contents" rather than "tarballs".

thanks,
Pádraig

bug#72232: "make dist" is not reproducible

2024-07-21 Thread Pádraig Brady


On 21/07/2024 17:18, Bruno Haible wrote:

Hi,

Subsequent runs of "make dist" in the same environment produce tarballs with
different contents.

How to reproduce:

In a git checkout of coreutils, do:

$ echo snapshot > .tarball-version
$ ./configure; make -k maintainer-clean
$ ./bootstrap
$ ./configure; make dist
$ mkdir 1; (cd 1 && tar xf ../*.tar.gz)
$ make; make dist
$ mkdir 2; (cd 2 && tar xf ../*.tar.gz)
$ diff -r -q 1 2

The last command produces a difference:

$ diff -r -q 1 2
Files 1/coreutils-snapshot/.timestamp and 2/coreutils-snapshot/.timestamp differ
$ cat 1/coreutils-snapshot/.timestamp
1721577765
$ cat 2/coreutils-snapshot/.timestamp
1721577832

Can this be avoided? Can the contents of '.timestamp' always be the same?
(Wouldn't it be enough to give it a different modification time, each time?)


Right, we should be able to adjust this.
The .timestamp file was added to support reproducible _builds_:
https://github.com/coreutils/coreutils/commit/c1b3d6587
https://reproducible-builds.org/docs/source-date-epoch/

We can just rely on the timestamp of the .tarball-version
to support reproducible _tarballs_.
That's done in the attached, which I'll apply later.

Marking this as done.

thanks,
PádraigFrom ed1273b00b127cad9bb9867e0fd2be70d60bee1a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 21 Jul 2024 18:25:05 +0100
Subject: [PATCH] build: support reproducible tarball creation

We already support reproducible builds since commit v8.24-99-gc1b3d6587,
and this adjusts that change to also support reproducible tarballs with
subsequent runs of `make dist`.

* Makefile.am: Don't create a varying .timestamp file, instead ...
* man/local.mk: Rely on the timestamp of the .tarball-version file.
Fixes https://bugs.gnu.org/72232
---
 Makefile.am  | 2 --
 man/local.mk | 3 ++-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index bbbdc7829..3fd546599 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -82,14 +82,12 @@ BUILT_SOURCES = .version
 
 # Have no read-only files in the tarball to allow easy removal.
 # Have .tarball-version based versions only in tarball builds.
-# Have .timestamp based dates only in tarball builds.
 # The perl substitution is to change some key uses of "rm" to "/bin/rm".
 # See the rm_subst comment for details.
 # The touch avoids a subtle, spurious "make distcheck" failure.
 dist-hook: gen-ChangeLog
 	$(AM_V_GEN)chmod -R +rw $(distdir)
 	$(AM_V_GEN)echo $(VERSION) > $(distdir)/.tarball-version
-	$(AM_V_GEN)date +%s > $(distdir)/.timestamp
 	$(AM_V_at)perl -pi -e '$(rm_subst)' $(distdir)/Makefile.in
 	$(AM_V_at)touch $(distdir)/doc/constants.texi \
 	  $(distdir)/doc/coreutils.info
diff --git a/man/local.mk b/man/local.mk
index 24f88745a..1acd73fe7 100644
--- a/man/local.mk
+++ b/man/local.mk
@@ -194,7 +194,8 @@ endif
 	  && $(MKDIR_P) $$t		\
 	  && (cd $$t && $(LN_S) '$(abs_top_builddir)/src/'$$prog$(EXEEXT) \
 $$argv$(EXEEXT))			\
-	&& : $${SOURCE_DATE_EPOCH=`cat $(srcdir)/.timestamp 2>/dev/null || :`} \
+	&& : $${SOURCE_DATE_EPOCH=`date -r $(srcdir)/.tarball-version +%s \
+   2>/dev/null || :`} \
 	&& : $${TZ=UTC0} && export TZ	\
 	&& export SOURCE_DATE_EPOCH && $(run_help2man)			\
 		 --source='$(PACKAGE_STRING)'			\
-- 
2.45.2

bug#72024: documentation suggestion: make old versions of docs easily available

2024-07-10 Thread Pádraig Brady


On 09/07/2024 22:18, mark.yagnatinsky--- via GNU coreutils Bug Reports wrote:

If go to the website of say, Perl or Python, and I see there's a cool function I want to 
use, then often there will be a note "new in version X.Y".
Then I know that if I need to support versions older than that, I can't use it, 
and otherwise I can.
If there is no such note, then I also have the option of easily pulling up old 
versions of the docs.
So I can quickly grab the docs for the oldest version I care about, and see if 
the function exists there.

Most GNU software (e.g., bash, coreutils, etc.) makes the latest version easily 
available on the website, but not older versions.
(The older versions are indeed available, but not require downloading and 
unzipping the full source distribution, or a similar level of annoyance.)
This is sometimes fine, since if I need to support some ancient version, then 
presumably I have access to at least one machine where that version is 
installed, and that machine will often have docs installed too.
But it's still nice to know whether I'm "skating close to the edge" or whether 
I'm using stuff that was already supported last century.

Just my two cents.
Mark.


This is a fair point.

Now there should be less interface churn from version to version,
than a larger API surface like python etc., but it's still a valid point.
Currently one way to glean this info would be from the NEWS file:
https://raw.githubusercontent.com/coreutils/coreutils/master/NEWS
But that is awkward to search, and you're not sure did you miss
an item, or that it was always present.

Presenting this version info inline in the docs would be
too invasive I think, given it's not usually required,
but presenting versions to select would be a useful service.
Along the lines of the FreeBSD project for example:
https://man.freebsd.org/cgi/man.cgi?query=timeout

Now the above is more useful as it's a complete distro,
which suggests it may be more appropriate for each distro
to provide complete versioned online man pages.
Debian does in fact do this with selectable distro versions:
https://manpages.debian.org/

Actually I now see a distro agnostic versioned man page repo,
which would be the most general way to determine portability constraints:
https://manned.org/

cheers,
Pádraig.

bug#72012: [PATCH] id: Warn if user in more groups than `id` can reliably print

2024-07-10 Thread Pádraig Brady


tag 72012 notabug
close 72012
stop

On 10/07/2024 00:31, Paul Eggert wrote:

On 7/10/24 00:35, Pádraig Brady wrote:

OK so id(1) will always show all groups it knows about.
Then the warning message might be along the lines of:

    "warning: User '%s' is a member of more groups than the current
system limit"


I also am not seeing the point of the proposed diagnostic. I daresay
most users would be more annoyed than usefully warned by the diagnostic;
I know I would.

The rare user concerned about being in "too many" groups can run
'getconf NGROUPS_MAX' and 'id -G | wc -w' and compare.


Right.
Given that id can display all the groups,
it's not its responsibility to display potential limits from elsewhere.

cheers,
Pádraig

bug#72012: [PATCH] id: Warn if user in more groups than `id` can reliably print

2024-07-09 Thread Pádraig Brady


On 09/07/2024 21:15, Otto Kekäläinen wrote:

Hi!

The point is just to emit a warning when this happens. Sure it is rare but the 
fix is pretty safe to apply.


OK so id(1) will always show all groups it knows about.
Then the warning message might be along the lines of:

  "warning: User '%s' is a member of more groups than the current system limit"

For reference I see a summary of limits of various systems at:
https://www.j3e.de/ngroups.html

cheers,
Pádraig

bug#72012: [PATCH] id: Warn if user in more groups than `id` can reliably print

2024-07-09 Thread Pádraig Brady


On 09/07/2024 05:22, Otto Kekäläinen wrote:

While rare, it is possible for a user to be a member in more groups than
what the system limit allows (on Linux typically NGROUPS_MAX=65536) and
if that is the case, running `id` or `id user` will not print all of
them. This is a minor bug, but easily fixable by emitting a warning if
it happens.
---
  src/id.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/src/id.c b/src/id.c
index 38d5517bd..c572b2d99 100644
--- a/src/id.c
+++ b/src/id.c
@@ -401,6 +401,13 @@ print_full_info (char const *username)
  ok &= false;
  return;
}
+else if (sysconf(_SC_NGROUPS_MAX) > 0 && n_groups > 
sysconf(_SC_NGROUPS_MAX))
+  {
+fprintf (stderr,
+ _("Warning: User '%s' may be member of more groups than "\
+   "the system allows\n"),
+ (username != NULL) ? username : "");
+  }
  
  if (n_groups > 0)

fputs (_(" groups="), stdout);


I'm a bit confused with this patch.
If the n_groups is larger than NGROUPS_MAX what consequence will it have?
I.e. is there any point to id(1) warning about this edge case?
id will be able to show all of the n_groups in this case right?
I interpret NGROUPS_MAX as a static limit(ation),
which more dynamic interfaces (like getgrouplist) are not constrained to.

cheers,
Pádraig

bug#71986: RFC: date @ to support ms.

2024-07-08 Thread Pádraig Brady


On 07/07/2024 20:46, Richard Neill wrote:

Hello,

I've noticed a lot of systems now return the timestamp in milliseconds
since the epoch, rather than seconds. This means that e.g.

date --date='@1720378861258'

will do something rather unexpected!

May I suggest that it would be nice if date had an input format that
would let me specify that the value is in ms?

I know we can bodge it with bc, or by injecting the decimal, or trimming
off the last 3 chars, but that seems inelegant, and requires extra
thinking (and hence bugs) from the programmer.
date --date='@1720378861.258'

Perhaps one of these syntaxes might be suitable?

   date --date='@ms1720378861258'
   date --date='@@1720378861258'
   date --epoch-ms --date='@1720378861258'


Yes this has some merit, but given we can leverage numfmt
to convert / round, I'm not sure it's warranted.  Consider for e.g.:

  $ ms2date() { date --date=@$(numfmt --to-unit=1K --round=nearest "$1"); }  $ 
ms2date 1720378861999
  Sun 07 Jul 2024 20:01:02 IST

cheers,
Pádraig

bug#71895: Test failure in Coreutils 9.4 [During LFS Build]

2024-07-02 Thread Pádraig Brady


On 02/07/2024 02:05, 0xn00dle via GNU coreutils Bug Reports wrote:

Hello,

While running the test for Coreutils during my LFS build (Chapter 8.57 
Coreutils-9.4: 
https://www.linuxfromscratch.org/lfs/view/12.1/chapter08/coreutils.html), I 
encountered a potential bug and the test recommended I report it.

LFS 12.1 Build, on Pop_OS! Host System
Coreutils Verson: 9.4
Kernel: 6.8.0-76060800daily20240311-generic
Architecture: x86_64

It occurred after running:

su tester -c "PATH=$PATH make RUN_EXPENSIVE_TESTS=yes check"

The fail details:

FAIL: tests/tty/tty
===




--
tty: extra operand 'a'
Try 'tty --help' for more information.
+ returns 2 tty -s a
tty: extra operand 'a'
Try 'tty --help' for more information.
+ test -w /dev/full
+ test -c /dev/full
+ test -t 0
+ returns 3 tty
tty: write error: No space left on device
+ returns 3 tty
tty: write error: No space left on device
+ Exit 1
+ set +e
+ exit 1
+ exit 1
+ removetmp
+ _st=1
+ cleanup
+ :
+ test '' = yes
+ cd /sources/coreutils-9.4
+ chmod -R u+rwx /sources/coreutils-9.4/gt-tty.sh.xcFx
+ rm -rf /sources/coreutils-9.4/gt-tty.sh.xcFx
+ exit 1FAIL tests/tty/tty.sh (exit status: 1)

I applied coreutils-9.4-il8n-1.patch and a fix for a vulnerability in the split 
utility prior to the test.

Best,

Noelle



I can't repro.
Can you repro this test in isolation with:

  su tester -c 'make check VERBOSE=yes TESTS=tests/tty/tty.sh SUBDIRS=.'

Note some of the output of the test seemed to be elided,
so I wasn't able to see exactly the line that failed
(No space left on device is an expected case and not the issue here).

thanks,
Pádraig

bug#71803: ls --time=mtime is sorting by name instead of mtime

2024-06-27 Thread Pádraig Brady


On 27/06/2024 16:05, Dave wrote:

The ls command without the -l option and with the --time=mtime option,
is incorrectly sorting by the name rather than by the modification
time.

ls   # sorts by name (ok)
ls --time=mtime  # sorts by name (should sort by mtime)



// The current statement in ls.c (lines 2383-2387)
   sort_type = (0 <= sort_opt ? sort_opt
: (format != long_format
   && (time_type == time_ctime || time_type == time_atime
   || time_type == time_btime))
? sort_time : sort_name);

// Proposed correction (untested)
   sort_type = (0 <= sort_opt ? sort_opt
: (format != long_format
   && (time_type == time_ctime || time_type == time_atime
   || time_type == time_btime || time_type == time_mtime))
? sort_time : sort_name);


Right, we should be applying this GNU extension to --time=mtime also.
I.e. sorting when not displaying time (-l not specified),
and no other sorting specific option is used.

The proposed fix wouldn't work as time_mtime is the default,
so we'd be sorting by mtime rather than name by default.

I'll apply the attached later to address this.

Marking this as done.

thanks,
PádraigFrom b95c6ac7d20084d9167bdb9d759bcebcb0c2b766 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Thu, 27 Jun 2024 18:15:02 +0100
Subject: [PATCH] ls: treat --time=mtime consistently with other time selectors

* src/ls.c: Track if --time=mtime is explicitly specified,
so that we can apply the GNU extension of sorting by the
specified time, when not displaying (-l not specified),
and not explicitly sorting (-t not specified).
* tests/ls/ls-time.sh: Add / Update test cases.
Fixes https://bugs.gnu.org/71803
---
 src/ls.c| 18 --
 tests/ls/ls-time.sh | 30 --
 2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/src/ls.c b/src/ls.c
index 5e2f7c3e7..f7ffe1086 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -468,6 +468,7 @@ enum time_type
   };
 
 static enum time_type time_type;
+static bool explicit_time;
 
 /* The file characteristic to sort by.  Controlled by -t, -S, -U, -X, -v.
The values of each item of this enum are important since they are
@@ -1943,6 +1944,7 @@ decode_switches (int argc, char **argv)
 
 case 'c':
   time_type = time_ctime;
+  explicit_time = true;
   break;
 
 case 'd':
@@ -2017,6 +2019,7 @@ decode_switches (int argc, char **argv)
 
 case 'u':
   time_type = time_atime;
+  explicit_time = true;
   break;
 
 case 'v':
@@ -2146,6 +2149,7 @@ decode_switches (int argc, char **argv)
 
 case TIME_OPTION:
   time_type = XARGMATCH ("--time", optarg, time_args, time_types);
+  explicit_time = true;
   break;
 
 case FORMAT_OPTION:
@@ -2372,18 +2376,12 @@ decode_switches (int argc, char **argv)
   if (eolbyte < dired)
 error (LS_FAILURE, 0, _("--dired and --zero are incompatible"));
 
-  /* If -c or -u is specified and not -l (or any other option that implies -l),
- and no sort-type was specified, then sort by the ctime (-c) or atime (-u).
- The behavior of ls when using either -c or -u but with neither -l nor -t
- appears to be unspecified by POSIX.  So, with GNU ls, '-u' alone means
- sort by atime (this is the one that's not specified by the POSIX spec),
- -lu means show atime and sort by name, -lut means show atime and sort
- by atime.  */
+  /* If a time type is explicitly specified (with -c, -u, or --time=)
+ and we're not showing a time (-l not specified), then sort by that time,
+ rather than by name.  Note this behavior is unspecified by POSIX.  */
 
   sort_type = (0 <= sort_opt ? sort_opt
-   : (format != long_format
-  && (time_type == time_ctime || time_type == time_atime
-  || time_type == time_btime))
+   : (format != long_format && explicit_time)
? sort_time : sort_name);
 
   if (format == long_format)
diff --git a/tests/ls/ls-time.sh b/tests/ls/ls-time.sh
index f27a5dbbe..be8a83556 100755
--- a/tests/ls/ls-time.sh
+++ b/tests/ls/ls-time.sh
@@ -33,11 +33,15 @@ u2='1998-01-14 12:00'
 u3='1998-01-14 13:00'
 
 touch -m -d "$t3" a || framework_failure_
-touch -m -d "$t2" b || framework_failure_
+touch -m -d "$t2" B || framework_failure_  # Capital to distinguish name sort
 touch -m -d "$t1" c || framework_failure_
 
+# Check default name sorting works
+set $(ls a B c)
+test "$*" = 'B a c' || fail=1
+
 touch -a -d "$u3" c || framework_failure_
-touch -a -d "$u2" b || framework_failure_
+touch -a -d "$u2" B || framework_failure_
 # Make sure A has ctime at least 1 second more recent than C's.
 sleep 2
 touch -a -d "$u1" a || framework_failure_
@@ -47,7 +51,9 @@ touch -a -d "$u1" a || framework_failure_
 
 
 # A has ctime more recent

bug#71242: PR: src/stat.c - add magic for bcachefs

2024-05-28 Thread Pádraig Brady


On 28/05/2024 12:08, Paul Hedderly via GNU coreutils Bug Reports wrote:

Currently stat does not know bcachefs

$:~/gits/coreutils$ sudo stat -f -c %T /
UNKNOWN (0xca451a4e)

But it just needs to know that magic

$:~/gits/coreutils$ sudo ./src/stat -f -c %T /
bcachefs


Please pull this small commit:

https://github.com/coreutils/coreutils/commit/6da16c99ff0c6ec35943dd2db5f51c25efb8b508


I pushed the following before receiving this bug report:
https://github.com/coreutils/coreutils/commit/3ce31e6f1

Marking this as done.

thanks,
Pádraig

bug#71191: od does unnecessary reads instead of seeking

2024-05-25 Thread Pádraig Brady


On 25/05/2024 08:05, Joseph C. Sible wrote:

Consider running the following command, for looking at the pagemap
bits of a given memory page:

od -t x8 -N 8 -j 34359738368 /proc/PID/pagemap

That file supports seeking, but od will unnecessarily read and discard
32GB worth of data instead of doing so.

Looking at the skip function in od.c, it looks like this happens
because the file has a bogus st_size of 0 (as is typical for files in
/proc) despite usable_st_size returning true for it, which results in
the calls to fseeko never even being tried.


This is related to https://bugs.gnu.org/36291
Note the workaround with dd mentioned there ¹.
In that report I detailed four cases of /proc /sys and /dev files
that od needs to handle, and suggested we try the lseek() for
the non regular file case (where we don't worry about seeking past EOF).

As an aside we can actually _seek_ in /proc and /sys files
¹ dd used to report an error when seeking in such files,
when in fact seeking is actually supported. The warning was removed
(for all empty files) in https://github.com/coreutils/coreutils/commit/ac6b8d822
(as part of https://bugs.gnu.org/70231).
But this case is more complicated by the fact that od supports combining files,
so you need to avoid seeking beyond EOF.
I.e. `od --skip-bytes 1 empty nonempty` would give different results
if we blindly tried an lseek() if st_stize==0

I wonder could you efficiently seek to offset, avoiding seeking past EOF,
while ignoring st_size with something like:
  if (lseek(N-1))
if read(1)
  skip -= N
else
  lseek(-(N-1))
I'm not sure how general that would be.

cheers,
Pádraig

bug#71186: email address for patch

2024-05-25 Thread Pádraig Brady


tab 71186 notabug
close 71186
stop

On 24/05/2024 19:15, Данила Скачедубов wrote:

Hi! I have a few questions. Please tell me, I want to compile the main
files so that they are output by the man command. As far as I
understand, this is done by the help2man-1.48.5 command. But at the
same time, I can't compile correctly, let's say chown.x > chown.1, What
could be the problem? And the second question: I carefully studied the
documentation on the rules for preparing patches and it said that I
could get the address by running --help, but I did not see the email
address. Maybe it's him- [1]coreut...@gnu.org ? Thank you very much in
advance for the answers!

--
Danila Skachedubov

References

1. https://e.mail.ru/compose?To=coreut...@gnu.org


Proposed bug fixes go to bug-gnu...@gnu.org
General discussions go to coreut...@gnu.org

Usage for help2man is:

  help2man chown -i chown.x > chown.1

cheers,
Pádraig

bug#71029: single-binary build broken on systems with separate libintl

2024-05-21 Thread Pádraig Brady


On 18/05/2024 18:09, Audrey Dutcher wrote:

I presume you have the same issue with coreutils 9.4 ?


Correct.


I presume the problematic flags are propagated through LIBINTL or MBRTOWC_LIB. 
What are those set to for reference in your Makefile?


On plain FreeBSD it is set to /usr/local/lib/libintl.so -Wl,-rpath
-Wl,/usr/local/lib. Within a nix jail on FreeBSD (which is where I
encountered the issue first) it is simply -lintl.

Thanks for getting back to me!
- Audrey


Could you test with the attached,
which should reenable automake protections in this area.

cheers,
PádraigFrom 5691ff399e824c79090f2c7f3d08218fde4f9560 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Tue, 21 May 2024 13:08:45 +0100
Subject: [PATCH] build: fix build failure in --enable-single-binary mode

* src/local.mk: Avoid overriding automake generated DEPENDENCIES,
so that it applies its adjustments to LDADD to avoid propagating
flags (like -Wl,-rpath) into make targets.  This was seen on FreeBSD
where LIBINTL is set to:
/usr/local/lib/libintl.so -Wl,-rpath -Wl,/usr/local/lib
Instead let automake generate a sanitized src_coreutils_DEPENDENCIES
(based on LDADD), which we then augment with the EXTRA_... variable.
---
 src/local.mk | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/local.mk b/src/local.mk
index 3356f8a2a..8133925ac 100644
--- a/src/local.mk
+++ b/src/local.mk
@@ -482,13 +482,13 @@ if SINGLE_BINARY
 src_coreutils_CFLAGS = -DSINGLE_BINARY $(AM_CFLAGS)
 #src_coreutils_LDFLAGS = $(AM_LDFLAGS)
 src_coreutils_LDADD = $(single_binary_deps) $(LDADD) $(single_binary_libs)
-src_coreutils_DEPENDENCIES = $(LDADD) $(single_binary_deps)
+EXTRA_src_coreutils_DEPENDENCIES = $(single_binary_deps)
 
 include $(top_srcdir)/src/single-binary.mk
 
 # Creates symlinks or shebangs to the installed programs when building
 # coreutils single binary.
-EXTRA_src_coreutils_DEPENDENCIES = src/coreutils_$(single_binary_install_type)
+EXTRA_src_coreutils_DEPENDENCIES += src/coreutils_$(single_binary_install_type)
 endif SINGLE_BINARY
 
 CLEANFILES += src/coreutils_symlinks
-- 
2.44.0

bug#71029: single-binary build broken on systems with separate libintl

2024-05-18 Thread Pádraig Brady


On 17/05/2024 17:11, Audrey Dutcher wrote:

On my FreeBSD system, downloading coreutils-9.5.tar.xz, then building
with `./configure --enable-single-binary && make` does not succeed,
with the error message `don't know how to make -Wl,-rpath. Stop`

I believe the root cause of this is
https://github.com/coreutils/coreutils/blob/52e024b/src/local.mk#L485,
which mixes DEPENDENCIES and LDADD. This seems bad, for very relevant
reasons!


I presume you have the same issue with coreutils 9.4 ?

I presume the problematic flags are propagated through LIBINTL or MBRTOWC_LIB.
What are those set to for reference in your Makefile?

I'll have a look at cleaning this up.

thanks,
Pádraig

bug#70946: [PATCH] doc: improve the man page for sleep

2024-05-15 Thread Pádraig Brady


On 15/05/2024 03:46, Nikolaos Chatzikonstantinou wrote:

On Tue, May 14, 2024 at 4:03 PM Nikolaos Chatzikonstantinou
 wrote:


On Tue, May 14, 2024, 3:59 PM Pádraig Brady  wrote:


On 14/05/2024 17:36, Nikolaos Chatzikonstantinou wrote:

See attachment.


Well just above your new mention of floating-point, we have:
"NUMBER need not be an integer".
How about I adjust your patch to adjust that text to say:
"NUMBER can be an integer or floating-point".

Sounds good! Either I missed it or my system had an older man page.


If I may suggest, instead write it like this:

 Pause for NUMBER seconds, where NUMBER is an integer or floating-point.

Then the sentence "NUMBER need not be an integer" can be removed. Of
course the info manual shows more details, but I think that note in
the man page will do. Let me know if you'd rather that I write and
submit a patch myself instead.


OK I pushed a change in your name, changing from:

Pause for NUMBER seconds.  SUFFIX may be 's' for seconds (the default),'m' for 
minutes, 'h' for hours or 'd' for days.  NUMBER need not be an
integer.  Given two or more arguments, pause for the amount of time
specified by the sum of their values.

to:

Pause for NUMBER seconds, where NUMBER is an integer or floating-point.
SUFFIX may be 's','m','h', or 'd', for seconds, minutes, hours, days.
With multiple arguments, pause for the sum of their values.

Marking this as done.

thanks,
Pádraig

bug#70946: [PATCH] doc: improve the man page for sleep

2024-05-14 Thread Pádraig Brady


On 14/05/2024 17:36, Nikolaos Chatzikonstantinou wrote:

See attachment.


Well just above your new mention of floating-point, we have:
"NUMBER need not be an integer".
How about I adjust your patch to adjust that text to say:
"NUMBER can be an integer or floating-point".

cheers,
Pádraig.

bug#70887: In coreutils 9.5, "cp" command does not honor "--interactive" option

2024-05-12 Thread Pádraig Brady


On 12/05/2024 16:06, Paul Eggert wrote:

On 2024-05-12 04:49, Pádraig Brady wrote:


@@ -1151,7 +1151,8 @@ main (int argc, char **argv)
  {
/* Default cp operation.  */
x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
  }
else if (update_opt == UPDATE_NONE)
  {
@@ -1166,7 +1167,8 @@ main (int argc, char **argv)
else if (update_opt == UPDATE_OLDER)
  {
x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;


Thanks for looking into this messy area. Here is a comment from another
pair of eyes.

Could you elaborate a bit more about why these two bits of code change
x.interactive at all? That is, why doesn't update_opt simply affect
x.update? Why does update_opt bother to override a previous setting of
x.interactive to I_ALWAYS_YES, I_ALWAYS_NO, or I_ALWAYS_SKIP?

Another way to put it: shouldn't x.update simply reflect the value of
the --update option, whereas x.interactive reflects reflects whether -f,
-i, -n are used? Although this would require changes to copy.c, it'd
make the code easier to follow.


I agree that some refactoring would be good here.
At least x.update should be renamed to x.update_older.

As interactive selection, and file dates all relate
to selecting which files to update, it's tempting to conflate the settings.
However you're right that this introduces complexities when
trying to avoid all inconsistencies. Currently for example:
  $ cp -v -i --update=none new old  # Won't prompt as expected
  $ cp -v --update=none -i new old  # Unexpectedly ignores update option

So yes we should separate these things.


Another way to put it: why should, for example, --update=all override a
previous -f or (partly) -n but not a previous -i?


Right -f is significant for mv here (for completeness -f for cp is a separate 
thing).
I.e. we need to treat I_ALWAYS_YES specially in mv with the current scheme.

BTW -n is not overridden by any --update option currently,
and this change effectively applies the same logic to -i now.

thanks,
Pádraig

bug#70887: In coreutils 9.5, "cp" command does not honor "--interactive" option

2024-05-12 Thread Pádraig Brady


On 12/05/2024 00:03, Robert Hill wrote:

After upgrading coreutils from 9.0 to 9.5, the following change occurred:

In coreutils 9.0, the command "cp -Tipruvx /src-dir /dst-dir" requested
interactive confirmation before replacing an old destination file with a
newer source file, as expected.

In coreutils 9.5, the command "cp -Tipruvx /src-dir /dst-dir" no longer
requests interactive confirmation, but just goes ahead and replaces old
destination files with newer source files, which is not expected.

Thank you in advance for looking at this, Bob.


Right.

The thinking was for 9.3 that the new long form --update={older,all} options
would override a previous -i, especially as -i was commonly set in root
users' cp and mv aliases on Red Hat flavored distros.
Then in 9.5 we expanded this so -u behaved the same as --update=older.

In retrospect, users can avoid these aliases in various ways,
and the protective -i option should really combine with -u
rather than being overridden by it.

For completeness, -i following -u would always reinstate the protection.

The attached changes the behavior back to that -i is never overridden.

thanks,
PádraigFrom 2ec640a3d2719ac0204d4976baabe6082b85e502 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 12 May 2024 12:21:19 +0100
Subject: [PATCH] cp,mv: ensure -i is not overridden by -u

Have -i combine with -u instead.
In coreutils 9.3 we had --update={all,older} override -i
In coreutils 9.5 this was expanded to -u
(to make it consistent with --update=older).

* NEWS: Mention the bug fix.
* src/cp.c (main): Don't have -u disable prompting.
* src/mv.c (main): Likewise.
* tests/cp/cp-i.sh: Add a test case.
* tests/mv/update.sh: Likewise.
Fixes https://bugs.gnu.org/70887
---
 NEWS   |  3 +++
 src/cp.c   |  6 --
 src/mv.c   |  6 --
 tests/cp/cp-i.sh   | 13 +
 tests/mv/update.sh |  3 +++
 5 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index febb9ac68..430c2ec54 100644
--- a/NEWS
+++ b/NEWS
@@ -8,6 +8,9 @@ GNU coreutils NEWS-*- outline -*-
   rejected as an invalid option.
   [bug introduced in coreutils-9.5]
 
+  cp,mv --update no longer overrides --interactive.
+  [bug introduced in coreutils-9.3]
+
   ls and printf fix shell quoted output in the edge case of escaped
   first and last characters, and single quotes in the string.
   [bug introduced in coreutils-8.26]
diff --git a/src/cp.c b/src/cp.c
index 06dbad155..cae4563cb 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -1151,7 +1151,8 @@ main (int argc, char **argv)
 {
   /* Default cp operation.  */
   x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
   else if (update_opt == UPDATE_NONE)
 {
@@ -1166,7 +1167,8 @@ main (int argc, char **argv)
   else if (update_opt == UPDATE_OLDER)
 {
   x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
 }
   break;
diff --git a/src/mv.c b/src/mv.c
index 692943a70..0162cd728 100644
--- a/src/mv.c
+++ b/src/mv.c
@@ -394,7 +394,8 @@ main (int argc, char **argv)
 {
   /* Default mv operation.  */
   x.update = false;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
   else if (update_opt == UPDATE_NONE)
 {
@@ -409,7 +410,8 @@ main (int argc, char **argv)
   else if (update_opt == UPDATE_OLDER)
 {
   x.update = true;
-  x.interactive = I_UNSPECIFIED;
+  if (x.interactive != I_ASK_USER)
+x.interactive = I_UNSPECIFIED;
 }
 }
   break;
diff --git a/tests/cp/cp-i.sh b/tests/cp/cp-i.sh
index f99f743dc..28969f9c8 100755
--- a/tests/cp/cp-i.sh
+++ b/tests/cp/cp-i.sh
@@ -70,4 +70,17 @@ returns_ 1 cp -bn c d 2>/dev/null || fail=1
 returns_ 1 cp -b --update=none c d 2>/dev/null || fail=1
 returns_ 1 cp -b --update=none-fail c d 2>/dev/null || fail=1
 
+# Verify -i combines with -u,
+echo old > old || framework_failure_
+touch -d yesterday old || framework_failure_
+echo new > new || framework_failure_
+# coreutils 9.3 had --update={all,older} ignore -i
+echo n | returns_ 1 cp -vi --update=older new old 2>/dev/null >out8 || fail=1
+compare /dev/null out8 || fail=1
+echo n | returns_ 1 cp -vi --update=all new old 2>/dev/null >out8 || fail=1
+compare /dev/null out8 || fail=1
+# coreutils 9.5 also had -u ignore -i
+echo n | returns_ 1 cp -vi -

bug#70873: join --return-error-if-any-unmatched-lines

2024-05-11 Thread Pádraig Brady


On 11/05/2024 10:14, Dan Jacobson wrote:

join should have an option to return an error value in the shell's $?
if any lines are not matched.

Currently the man page doesn't even mention a return value. So it is not
set in stone yet.

Currently one must save -v output in a file then use test -s do detect
if there were any non-matched lines. And then exit one script with
non-zero. Big hassle.

join (GNU coreutils) 9.4


This does seem to have some merit.
Perhaps --check-pairable similar to the existing --check-order option.

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-05-09 Thread Pádraig Brady


On 13/04/2024 18:42, Pádraig Brady wrote:

On 13/04/2024 17:39, Pádraig Brady wrote:

install(1) defaults to mode 600 for new files, and uses set_acl() with that
(since 2007 https://github.com/coreutils/coreutils/commit/f634e8844 )
The psuedo code that install(1) uses is:

copy_reg()
 if (x->set_mode) /* install */
   set_acl(dest, x->mode /* 600 */)
 ctx->acl = acl_from_mode ( /* 600 */)
 acl_set_fd (ctx->acl) /* fails EACCES */
 if (! acls_set)
must_chmod = true;
 if (must_chmod)
   saved_errno = EACCES;
   chmod (ctx->mode /* 600 */)
   if (save_errno)
 return -1;

This issue only only seems to be on CIFS.
I'm seeing lot of weird behavior with ACLs there:

 acl_set_fd (acl_from_mode (600)) -> EACCES
 acl_set_fd (acl_from_mode (755)) -> EINVAL
 getxattr ("system.posix_acl_access") -> EOPNOTSUPP

Note we ignore EINVAL and EOPNOTSUPP errors in set_acl(),
and it's just the EACCES that's problematic.
Note this is quite similar to https://debbugs.gnu.org/65599
where Paul also noticed EACCES with fsetxattr() (and others) on CIFS.

The attached is a potential solution which I tested as working
on the same matoro system that Bruno used.

I think I'll apply that after thinking a bit more about it.


Actually that probably isn't appropriate,
as fsetxattr() can validly return EACESS
even though not documented in the man page at least.
The following demonstrates that:
.
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main(void)
{
  int wfd;
  /* Note S_IWUSR is not set.  */
  if ((wfd=open("writable", O_CREAT|O_WRONLY|O_EXCL, S_IRUSR)) == -1)
  fprintf(stderr, "open('writable') error [%m]\n");
  if (write(wfd, "data", 1) == -1)
  fprintf(stderr, "write() error [%m]\n");
  if (fsetxattr(wfd, "user.test", "test", 4, 0) == -1)
  fprintf(stderr, "fsetxattr() error [%m]\n");
}

Another solution might be to file_has_acl() in copy.c
and skip if "not supported" or nothing is returned.
The following would do that, but I'm not sure about this
(as I'm not sure about ACLs in general TBH).
Note listxattr() returns 0 on CIFS here,
while getxattr ("system.posix_acl_access") returns EOPNOTSUPP,
and file_has_acl() uses listxattr() first.

diff --git a/src/copy.c b/src/copy.c
index d584a27eb..2145d89d5 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1673,8 +1673,13 @@ set_dest_mode:
   }
 else if (x->set_mode)
   {
-  if (set_acl (dst_name, dest_desc, x->mode) != 0)
-return_val = false;
+  errno = 0;
+  int n = file_has_acl (dst_name, &sb);
+  if (0 < n || (errno && ! (is_ENOTSUP (errno) || errno == ENOSYS)))
+{
+  if (set_acl (dst_name, dest_desc, x->mode) != 0)
+return_val = false;
+}
   }


BTW I'm surprised this wasn't reported previously for CIFS,
so I due to this bug and https://debbugs.gnu.org/65599
I suspect a recentish change in CIFS wrt EACCES.


Thinking more about this, the solution presented above wouldn't work,
and I think this needs to be addressed in CIFS.
I.e. we may still need to reset the mode even if the file has no ACLs,
as generally only dirs get default ACLs copied, as demonstrated below:

$ mkdir acl$ setfacl -d -m o::rw acl
$ getfacl acl
# file: acl
# owner: padraig
# group: padraig
user::rwx
group::r-x
other::r-x
default:user::rwx
default:group::r-x
default:other::rw-

$ touch acl/file
$ ls -l acl/file
-rw-r--rw-. 1 padraig padraig 0 May  9 13:11 acl/file
$ getfacl acl/file
# file: acl/file
# owner: padraig
# group: padraig
user::rw-
group::r--
other::rw-

cheers,
Pádraig.

bug#70814: pwd(1) man page: Unclear "avoid" wording

2024-05-07 Thread Pádraig Brady


On 07/05/2024 05:19, Bruce Jerrick wrote:

This wording in the pwd(1) man page is unclear:

     -P, --physical
     avoid all symlinks

"resolve all symlinks" would be much better wording.


Pushed that in your name.

Marking this as done.

thanks,
Pádraig

bug#70801: Weird behaviour when standard input is closed

2024-05-06 Thread Pádraig Brady


On 06/05/2024 08:38, Bernard Burette wrote:

Hi,



If I try:

$ cat <&-

cat: -: Bad file descriptor

cat: closing standard input: Bad file descriptor

$



The error on stdin beign closed is displayed twice plus "-" is for a FILE 
argument to replace standard input, It would make more sense to me to have someting like:

$ cat <&-

cat: standard input: Bad file descriptor

$



This is the "git diff" of my proposed change:

diff --git a/src/cat.c b/src/cat.c

index b33faeb35..4be189d85 100644

--- a/src/cat.c

+++ b/src/cat.c

@@ -50,6 +50,14 @@

/* Name of input file.  May be "-".  */

static char const *infile;



+/* Pretty name of input file */

+static char const *

+quotef_infile (void)

+{

+  if (STREQ (infile, "-")) return _("standard input");

+  return quotef (infile);

+}

+

/* Descriptor on which input file is open.  */

static int input_desc;



@@ -164,7 +172,7 @@ simple_cat (char *buf, idx_t bufsize)

    size_t n_read = safe_read (input_desc, buf, bufsize);

    if (n_read == SAFE_READ_ERROR)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    return false;

  }



@@ -313,7 +321,7 @@ cat (char *inbuf, idx_t insize, char *outbuf, idx_t outsize,

    size_t n_read = safe_read (input_desc, inbuf, insize);

    if (n_read == SAFE_READ_ERROR)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    write_pending (outbuf, &bpout);

    newlines2 = newlines;

    return false;

@@ -526,7 +534,7 @@ copy_cat (void)

  || errno == EBADF || errno == EXDEV || errno == ETXTBSY

  || errno == EPERM)

    return 0;

-    error (0, errno, "%s", quotef (infile));

+    error (0, errno, "%s", quotef_infile());

  return -1;

    }

}

@@ -684,7 +692,7 @@ main (int argc, char **argv)

    input_desc = open (infile, file_open_mode);

    if (input_desc < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

    continue;

  }

@@ -692,7 +700,7 @@ main (int argc, char **argv)



    if (fstat (input_desc, &stat_buf) < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

    goto contin;

  }

@@ -719,7 +727,7 @@ main (int argc, char **argv)

  }

    if (exhausting)

  {

-  error (0, 0, _("%s: input file is output file"), quotef 
(infile));

+  error (0, 0, _("%s: input file is output file"), 
quotef_infile());

    ok = false;

    goto contin;

  }

@@ -794,7 +802,7 @@ main (int argc, char **argv)

  contin:

    if (!reading_stdin && close (input_desc) < 0)

  {

-  error (0, errno, "%s", quotef (infile));

+  error (0, errno, "%s", quotef_infile());

    ok = false;

  }

  }

@@ -807,7 +815,8 @@ main (int argc, char **argv)

  }



    if (have_read_stdin && close (STDIN_FILENO) < 0)

-    error (EXIT_FAILURE, errno, _("closing standard input"));

+    if (ok)

+  error (EXIT_FAILURE, errno, _("closing standard input"));



    return ok ? EXIT_SUCCESS : EXIT_FAILURE;

}



There are a lot more of coreutil utilities that (could) behave weird on a 
closed standard input, here is my first list:

b2sum base32 base64 cat cksum comm csplit cut expand factor fmt fold head join 
md5sum nl nohup numfmt od paste pr ptx sha1sum sha224sum sha256sum sha384sum 
sha512sum shred shuf sort split stdbuf stty sum tac tail tee tr tsort unexpand 
uniq wc





I would like to work on those as well but it would help to have:

1) some kind of consensus on "this is a better way of displaying the error", 
for example other tools also go wild like:

  grep: (standard input): Bad file descriptor

  - why parentheses?

  sed: read error on stdin: Bad file descriptor

  - stdin is the C name for standard input, documented?

2) a better way of offering changes than sending a diff patch in a e-mail.


Yes it would be good to fix up at least the duplicated error.
We did the same on the write side of things with:
https://github.com/coreutils/coreutils/commit/0b2ff7637
Also we generally checked that readers diagnosed errors in:
https://github.com/coreutils/coreutils/blob/master/tests/misc/read-errors.sh

So something similar could probably be done here.

cheers,
Pádraig

bug#70727: cp doesn't recognize the new --update=none-fail option

2024-05-03 Thread Pádraig Brady


On 03/05/2024 05:12, Attila Fidan via GNU coreutils Bug Reports wrote:

Hi,

I wanted to use the new cp --update=none-fail option introduced in 9.5,
but it said "invalid argument ‘none-fail’ for ‘--update’". It turns out
that the commit (49912bac286eb3c0ef7d1567ae790193ad5eb2e8) adding it
forgot to add the new operation to update_type[] and
update_type_string[] in cp.c like it did for mv.c. After patching
coreutils locally the functionality works as expected.

It seems like the test suite didn't catch this because there's no
cp/update.sh test like there is for mv. There's a test for if using
--backup and --update=none-fail are mutually exclusive by checking if cp
returns 1, but an invalid argument also makes cp return 1 :)

I didn't include a patch in case a change to the test suite is wanted,
but the proposed change is tiny and rather obvious.


Well that's embarrassing.
I implemented in cp first, tested that manually,
then must have messed up that hunk when rebasing.

The attached should fix this.

Marking this as done.

thanks,
Pádraig.From de49e993ea8b6dcdf6cada3c0f44a6371514f952 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Fri, 3 May 2024 10:18:50 +0100
Subject: [PATCH] cp: actually support --update=none-fail

* src/cp.c: Add the entries for the --update=none-fail option.
* tests/mv/update.sh: Add a test case.
* NEWS: Mention the bug fix.
Fixes https://bugs.gnu.org/70727
---
 NEWS   |  4 
 src/cp.c   |  4 ++--
 tests/mv/update.sh | 11 +++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 389f72516..7e8ccb34f 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,10 @@ GNU coreutils NEWS-*- outline -*-
 
 ** Bug fixes
 
+  cp fixes support for --update=none-fail, which would have been
+  rejected as an invalid option.
+  [bug introduced in coreutils-9.5]
+
   ls and printf fix shell quoted output in the edge case of escaped
   first and last characters, and single quotes in the string.
   [bug introduced in coreutils-8.26]
diff --git a/src/cp.c b/src/cp.c
index 28b0217db..06dbad155 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -104,11 +104,11 @@ ARGMATCH_VERIFY (reflink_type_string, reflink_type);
 
 static char const *const update_type_string[] =
 {
-  "all", "none", "older", nullptr
+  "all", "none", "none-fail", "older", nullptr
 };
 static enum Update_type const update_type[] =
 {
-  UPDATE_ALL, UPDATE_NONE, UPDATE_OLDER,
+  UPDATE_ALL, UPDATE_NONE, UPDATE_NONE_FAIL, UPDATE_OLDER,
 };
 ARGMATCH_VERIFY (update_type_string, update_type);
 
diff --git a/tests/mv/update.sh b/tests/mv/update.sh
index 164357803..39ff677b9 100755
--- a/tests/mv/update.sh
+++ b/tests/mv/update.sh
@@ -38,6 +38,17 @@ for interactive in '' -i; do
   done
 done
 
+# These should accept all options
+for update_option in '--update' '--update=older' '--update=all' \
+ '--update=none' '--update=none-fail'; do
+
+  touch file1 || framework_failure_
+  mv $update_option file1 file2 || fail=1
+  test -f file1 && fail=1
+  cp $update_option file2 file1 || fail=1
+  rm file1 file2 || framework_failure_
+done
+
 # These should perform the rename / copy
 for update_option in '--update' '--update=older' '--update=all' \
  '--update=none --update=all'; do
-- 
2.44.0

bug#70714: realpath no error for unreadable-symlink

2024-05-02 Thread Pádraig Brady


On 02/05/2024 07:16, Nineteendo INC wrote:

coreutils version: stable 9.5 (bottled)
OS version: macOS 13.6.6 (22G630)

`realpath` doesn’t behave correctly for unreadable symlinks:

wannes@Stefans-iMac ~ % ln -s . src
wannes@Stefans-iMac ~ % grealpath -e src/..
/Users
wannes@Stefans-iMac ~ % chmod -h 000 src
wannes@Stefans-iMac ~ % grealpath -e src/..
/Users/wannes

Expected behaviour:

wannes@Stefans-iMac ~ % grealpath -e src/..
grealpath: src/..: Permission denied


Right, looks like we'll have to cater for EACCES on darwin.
I'll have a look

bug#70699: Possible bug in /usr/bin/paste

2024-05-01 Thread Pádraig Brady


On 01/05/2024 15:28, Art Shelest via GNU coreutils Bug Reports wrote:

Good morning,

I am seeing an aberrant behavior from the /usr/bin/paste utility when working 
with Windows-style CR/LF text files.
The repro is for Mint Mate (Virginia).

If I change the line endings in the first file to Unix format (LF), it works as 
expected.
If I change the line endings to Max (CR), it breaks even worse.

$ hexdump -C letters.txt
  61 61 09 41 41 0d 0a 62  62 09 42 42 0d 0a|aa.AA..bb.BB..|

$ cat letters.txt
aa   AA
bb   BB
$ cat numbers.txt
1
2
$ paste letters.txt numbers.txt
aa   1A
bb   2B
$


Expected:
$ paste letters.txt numbers.txt
aa   AA   1
bb   BB   2

Thank you.



paste(1) is treating the CR like a standard character,
and when outputting that back to the terminal it "messes up" the expected 
output.
I suggest you convert any such files to unix format before processing.

thanks,
Pádraig

bug#70532: sort: Mention counting fields from the end

2024-04-23 Thread Pádraig Brady


On 23/04/2024 11:14, Dan Jacobson wrote:

In (info "(coreutils) sort invocation") be sure to add an example of a
way or workaround for counting fields from the end of the line. E.g., we
want to sort on the last field, but don't know for sure how many fields
a line might contain. E.g., sort by surname, when lines consist of First
[Middle...] Surname. perl -a uses $F[-1]. so maybe sort(1) could also
use a negative field number. Same for character number.


All good suggestions. I'll at least add an example along the lines of:

  awk '{print $NF, $0}' | sort -k1,1 | cut -f2- -d' '

cheers,
Pádraig.

bug#70418: ls bug

2024-04-17 Thread Pádraig Brady


On 16/04/2024 23:17, Paul Eggert wrote:

On 4/16/24 14:30, Toby Kelsey wrote:

The man page doesn't explain this format conflict, while the info page
(info '(coreutils) ls invocation' or 'info ls') claims '-f' implies '-1'
which is also incorrect: 'ls -1f' gives different output to to 'ls -f'.


Yes, this area of GNU 'ls' a mess. Option order should not matter here.

Option order didn't matter in 7th Edition Unix, where -f overrode -l
regardless of whether -f came before or after -l. And option order
doesn't matter in FreeBSD, where -f and -l are orthogonal. GNU ls is an
odd hybrid of 7th Edition and FreeBSD and messes this up.

Rather than document the hybrid mess, let's bite the bullet and fix it.
FreeBSD behavior makes more sense, so let's do that. Proposed patch
attached.


+1

Related to this is the recent adjustment of usage() for -f:
https://bugs.gnu.org/67765

Patch looks good, and conforms to POSIX.

thanks!
Pádraig

bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady


On 16/04/2024 15:47, Alejandro Colomar wrote:

Hi Pádraig,

On Tue, Apr 16, 2024 at 03:25:22PM +0100, Pádraig Brady wrote:

What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently


I don't know.  The reporter didn't tell.  I see you also asked on the
Github original report.


Note that check originally came from:
https://github.com/coreutils/coreutils/commit/dea4262fa

I suppose we could relax the check as follows, for files of apparent size 0
which would cater for this, and others that may also have unstable inodes.


H.  Since you couldn't reprodude it in a recent Darwin, maybe it's
just a bug in an old Darwin.  And since noone else seems to have met
this Darwin's bug, maybe we can just ignore it.  (And if it were a
regression in a more recent Darwin, hopefully they should fix their
kernel.)

I'm not happy relaxing a security check, without making sure that there
are no implications at all.

I vote for claiming only limited support to such a Darwin system.  I
already workarounded it in the Linux man-pages, by not piping to
install(1) in a common task; and nobody else seems to be affected.

Unless you feel confident that it's perfectly fine to do it.  But I have
no sympathy for workarounding Darwin bugs here.


I agree if it's older Darwin only, we can ignore.
The version I tested on is 3 years old now though,
so I'm not sure whether the issue is on newer or older.

Note we had similar issue on Solaris,
where we used an fstat() wrapper to adjust things:
https://bugs.gnu.org/35713

A related suggestion was from Marc Chantreux (CC'd)
to support '-' to imply stdin, which would be more portable.
There is some merit to that suggestion too.

cheers,
Pádraig

bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady


On 16/04/2024 12:33, Pádraig Brady wrote:

On 16/04/2024 01:19, Alejandro Colomar wrote:

Hi!

I don't own a Darwin system, so I can't help much reproduce.  However,
I've received a bug report to the Linux man-pages, that our build
system (GNUmakefile-based), which ends up calling

... | install /dev/stdin $@

doesn't work on Darwin.  Here's the original bug report:
<https://github.com/NixOS/nixpkgs/pull/300797>.

Here are the reported error messages:

...
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addseverity.3
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/adjtime.3
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addmntent.3]
 Error 1
make: *** Waiting for unfinished jobs
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/acosh.3]
 Error 1
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
...

I don't see why install(1) should fail to read /dev/stdin under any
POSIX system


What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently


Note that check originally came from:
https://github.com/coreutils/coreutils/commit/dea4262fa

I suppose we could relax the check as follows, for files of apparent size 0
which would cater for this, and others that may also have unstable inodes.

cheers,
Pádraig.

diff --git a/src/copy.c b/src/copy.c
index 2145d89d5..fb5f0a1a0 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1263,8 +1263,11 @@ copy_reg (char const *src_name, char const *dst_name,
 }

   /* Compare the source dev/ino from the open file to the incoming,
- saved ones obtained via a previous call to stat.  */
-  if (! psame_inode (src_sb, &src_open_sb))
+ saved ones obtained via a previous call to stat.  Restrict
+ the check to files with an apparent size, to support "files"
+ with unstable inodes, like /dev/stdin on macOS.  */
+  if (! psame_inode (src_sb, &src_open_sb)
+  && (src_sb->st_size || src_open_sb.st_size))
 {
   error (0, 0,
  _("skipping file %s, as it was replaced while being copied"),

bug#70411: [bug] install(1) fails to read /dev/stdin on Darwin

2024-04-16 Thread Pádraig Brady


On 16/04/2024 01:19, Alejandro Colomar wrote:

Hi!

I don't own a Darwin system, so I can't help much reproduce.  However,
I've received a bug report to the Linux man-pages, that our build
system (GNUmakefile-based), which ends up calling

... | install /dev/stdin $@

doesn't work on Darwin.  Here's the original bug report:
.

Here are the reported error messages:

...
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addseverity.3
INSTALL 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/adjtime.3
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/addmntent.3]
 Error 1
make: *** Waiting for unfinished jobs
install: skipping file '/dev/stdin', as it was replaced while being copied
make: *** [share/mk/install/man.mk:54: 
/nix/store/3s28l9ijlkmsq8256zdxjvl173gkn37c-man-pages-6.7/share/man/man3/acosh.3]
 Error 1
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
install: skipping file '/dev/stdin', as it was replaced while being copied
...

I don't see why install(1) should fail to read /dev/stdin under any
POSIX system


What version of darwin is this? I can't repro on Darwin 21.6.0 (MacOSX 12.6).
The issue seems to be that /dev/stdin returns a varying inode which install(1) 
doesn't like currently

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-15 Thread Pádraig Brady


On 15/04/2024 15:37, Andreas Grünbacher wrote:

Hello,

Am So., 14. Apr. 2024 um 00:43 Uhr schrieb Pádraig Brady :

On 13/04/2024 20:29, Bruno Haible wrote:

Hi Pádraig,

I wrote:

5) The same thing with 'cp -a' succeeds:

$ build-sparc64/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0
$ build-sparc64-no-acl/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0


You wrote:

The psuedo code that install(1) uses is:

copy_reg()
 if (x->set_mode) /* install */
   set_acl(dest, x->mode /* 600 */)
 ctx->acl = acl_from_mode ( /* 600 */)
 acl_set_fd (ctx->acl) /* fails EACCES */
 if (! acls_set)
must_chmod = true;
 if (must_chmod)
   saved_errno = EACCES;
   chmod (ctx->mode /* 600 */)
   if (save_errno)
 return -1;


And, for comparison, what is the pseudo-code that 'cp -a' uses?
I would guess that there must be a relevant difference between both.


The cp pseudo code is:

copy_reg()
if (preserve_xattr)
  copy_attr()
ret = attr_copy_fd()
if (ret == -1 && require_preserve_xattr /*false*/)
  return failure;
if (preserve_mode)
  copy_acl()
qcopy_acl()
  #if USE_XATTR /* true */
fchmod() /* chmod before setting ACLs as doing after may reset */
return attr_copy_fd() /* successful if no ACLs in source */
  #endif

If however you add ACLs in the source, you induce a similar failure:

$ setfacl -m u:nobody:r /var/tmp/foo3942
$ src/cp -a /var/tmp/foo3942 foo3942; echo $?
src/cp: preserving permissions for ‘foo3942’: Permission denied
1

The corresponding strace is:

fchmod(4, 0100640)  = 0
flistxattr(3, NULL, 0)  = 24
flistxattr(3, "system.posix_acl_access\0", 24) = 24
fgetxattr(3, "system.posix_acl_access", NULL, 0) = 44
fgetxattr(3, "system.posix_acl_access", "\2\0...\4", 44) = 44
fsetxattr(4, "system.posix_acl_access", "\2\0...\4", 44, 0) = -1 EACCES 
(Permission denied)


Why does CIFS think EACCES is an appropriate error to return here? The
fchmod() succeeds, so changing the file permissions via fsetxattr()
should really succeed as well.


Right, it seems like a CIFS bug (already CC'd)
Even if it returned ENOTSUP (like getxattr(...posix...) does) it would be ok as 
we handle that.
It would be good to avoid it though.

You confirmed privately to me that the set_acl() is to clear any default ACLs
that may have been added to the newly created file, so the posted solution in
https://bugs.gnu.org/70214#11 that only does the set_acl() iff file_has_acl()
should avoid the CIFS issue.  It would be a bit less efficient if there were
ACLs, but a bit more efficient in the common case where there weren't ACLs.

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady


On 13/04/2024 20:29, Bruno Haible wrote:

Hi Pádraig,

I wrote:

5) The same thing with 'cp -a' succeeds:

$ build-sparc64/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0
$ build-sparc64-no-acl/src/cp -a /var/tmp/foo3941 $HOME/foo3941; echo $?
0


You wrote:

The psuedo code that install(1) uses is:

copy_reg()
if (x->set_mode) /* install */
  set_acl(dest, x->mode /* 600 */)
ctx->acl = acl_from_mode ( /* 600 */)
acl_set_fd (ctx->acl) /* fails EACCES */
if (! acls_set)
   must_chmod = true;
if (must_chmod)
  saved_errno = EACCES;
  chmod (ctx->mode /* 600 */)
  if (save_errno)
return -1;


And, for comparison, what is the pseudo-code that 'cp -a' uses?
I would guess that there must be a relevant difference between both.


The cp pseudo code is:

copy_reg()
  if (preserve_xattr)
copy_attr()
  ret = attr_copy_fd()
  if (ret == -1 && require_preserve_xattr /*false*/)
return failure;
  if (preserve_mode)
copy_acl()
  qcopy_acl()
#if USE_XATTR /* true */
  fchmod() /* chmod before setting ACLs as doing after may reset */
  return attr_copy_fd() /* successful if no ACLs in source */
#endif

If however you add ACLs in the source, you induce a similar failure:

$ setfacl -m u:nobody:r /var/tmp/foo3942
$ src/cp -a /var/tmp/foo3942 foo3942; echo $?
src/cp: preserving permissions for ‘foo3942’: Permission denied
1

The corresponding strace is:

fchmod(4, 0100640)  = 0
flistxattr(3, NULL, 0)  = 24
flistxattr(3, "system.posix_acl_access\0", 24) = 24
fgetxattr(3, "system.posix_acl_access", NULL, 0) = 44
fgetxattr(3, "system.posix_acl_access", "\2\0...\4", 44) = 44
fsetxattr(4, "system.posix_acl_access", "\2\0...\4", 44, 0) = -1 EACCES 
(Permission denied)

BTW I was wondering about the need for install(1) to set_acl() at all,
rather than just using chmod.
The following comment in lib/set-permissions.c may be pertinent:

/* If we can't set an acl which we expect to be able to set, try setting
   the permissions to ctx->mode. Due to possible inherited permissions,
   we cannot simply chmod */

BTW this is all under kernel version:

$ uname -r
6.8.5-gentoo-sparc64

With these cifs options:

$ mount | grep cifs
//syslog.matoro.tk/guest-pixelbeat on /media/guest-homedirs/pixelbeat type cifs
(rw,nosuid,relatime,vers=1.0,cache=strict,username=nobody,uid=30017,forceuid,
gid=30017,forcegid,addr=fd05:::::::0001,
soft,unix,posixpaths,serverino,mapposix,acl,
rsize=1048576,wsize=65536,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady


On 13/04/2024 17:39, Pádraig Brady wrote:

install(1) defaults to mode 600 for new files, and uses set_acl() with that
(since 2007 https://github.com/coreutils/coreutils/commit/f634e8844 )
The psuedo code that install(1) uses is:

copy_reg()
if (x->set_mode) /* install */
  set_acl(dest, x->mode /* 600 */)
ctx->acl = acl_from_mode ( /* 600 */)
acl_set_fd (ctx->acl) /* fails EACCES */
if (! acls_set)
   must_chmod = true;
if (must_chmod)
  saved_errno = EACCES;
  chmod (ctx->mode /* 600 */)
  if (save_errno)
return -1;

This issue only only seems to be on CIFS.
I'm seeing lot of weird behavior with ACLs there:

acl_set_fd (acl_from_mode (600)) -> EACCES
acl_set_fd (acl_from_mode (755)) -> EINVAL
getxattr ("system.posix_acl_access") -> EOPNOTSUPP

Note we ignore EINVAL and EOPNOTSUPP errors in set_acl(),
and it's just the EACCES that's problematic.
Note this is quite similar to https://debbugs.gnu.org/65599
where Paul also noticed EACCES with fsetxattr() (and others) on CIFS.

The attached is a potential solution which I tested as working
on the same matoro system that Bruno used.

I think I'll apply that after thinking a bit more about it.


Actually that probably isn't appropriate,
as fsetxattr() can validly return EACESS
even though not documented in the man page at least.
The following demonstrates that:
.
#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main(void)
{
int wfd;
/* Note S_IWUSR is not set.  */
if ((wfd=open("writable", O_CREAT|O_WRONLY|O_EXCL, S_IRUSR)) == -1)
fprintf(stderr, "open('writable') error [%m]\n");
if (write(wfd, "data", 1) == -1)
fprintf(stderr, "write() error [%m]\n");
if (fsetxattr(wfd, "user.test", "test", 4, 0) == -1)
fprintf(stderr, "fsetxattr() error [%m]\n");
}

Another solution might be to file_has_acl() in copy.c
and skip if "not supported" or nothing is returned.
The following would do that, but I'm not sure about this
(as I'm not sure about ACLs in general TBH).
Note listxattr() returns 0 on CIFS here,
while getxattr ("system.posix_acl_access") returns EOPNOTSUPP,
and file_has_acl() uses listxattr() first.

diff --git a/src/copy.c b/src/copy.c
index d584a27eb..2145d89d5 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1673,8 +1673,13 @@ set_dest_mode:
 }
   else if (x->set_mode)
 {
-  if (set_acl (dst_name, dest_desc, x->mode) != 0)
-return_val = false;
+  errno = 0;
+  int n = file_has_acl (dst_name, &sb);
+  if (0 < n || (errno && ! (is_ENOTSUP (errno) || errno == ENOSYS)))
+{
+  if (set_acl (dst_name, dest_desc, x->mode) != 0)
+return_val = false;
+}
 }


BTW I'm surprised this wasn't reported previously for CIFS,
so I due to this bug and https://debbugs.gnu.org/65599
I suspect a recentish change in CIFS wrt EACCES.

cheers,
Pádraig

bug#70214: 'install' fails to copy regular file to autofs/cifs, due to ACL or xattr handling

2024-04-13 Thread Pádraig Brady

in some cases.
---
 ChangeLog | 7 +++
 lib/set-permissions.c | 8 +++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index c72165e268..fd094d1091 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2024-04-13  Pádraig Brady  
+
+	acl-permissions: avoid erroneous failure on CIFS
+	* lib/set-permissions.c (set_acls): On Linux (and others), also discount
+	EACESS as a valid errno with FD operations, as CIFS was seen to
+	return that erroneously in some cases.
+
 2024-04-13  Bruno Haible  
 
 	gnulib-tool.py: Code tweak.
diff --git a/lib/set-permissions.c b/lib/set-permissions.c
index 83a355faa5..7f8e55f5cd 100644
--- a/lib/set-permissions.c
+++ b/lib/set-permissions.c
@@ -520,7 +520,13 @@ set_acls (struct permission_context *ctx, const char *name, int desc,
 ret = acl_set_file (name, ACL_TYPE_ACCESS, ctx->acl);
   if (ret != 0)
 {
-  if (! acl_errno_valid (errno))
+  if (! acl_errno_valid (errno)
+#   if defined __linux__
+  /* Special case EACCES for fd operations as CIFS
+ was seen to erroneously return that in some cases.  */
+  || (HAVE_ACL_SET_FD && desc != -1 && errno == EACCES)
+#   endif
+ )
 {
   ctx->acls_not_supported = true;
   if (from_mode || acl_access_nontrivial (ctx->acl) == 0)
-- 
2.44.0

bug#35247: [PATCH] Add alacritty to terminal list supporting colors

2024-04-10 Thread Pádraig Brady


Things have changed a little since the original request.
alacritty sets $COLORTERM, and dircolors now auto accepts that since:
https://github.com/coreutils/coreutils/commit/75c9fc674

There are some complications with remote shells, but
they should boil down to setting up ssh to send/accept COLORTERM.

For some background for setting colors based on env vars
from the Fedora side of things at least, see:
https://fedoraproject.org/wiki/Features/256_Color_Terminals
https://fedoraproject.org/wiki/Changes/Drop_256term.sh

cheers,
Pádraig.

bug#70298: uname -i returns unknown since fedora 38

2024-04-09 Thread Pádraig Brady

On 09/04/2024 10:17, Collin Funk wrote:

On 4/9/24 12:57 AM, Paul Eggert wrote:

Indeed there is, and I merged your bug report into that old one. It'd be nice
if someone could get to the bottom of that bug.

I decided to look into this a bit, since I also have an unknown
'uname -i' and 'uname -p'.

It seems that this option is a Solaris thing. I found the commit that
it was introduced [1]. It also adds some other Solaris compatibility
stuff, so everything seems to add up so far.

The first function it tries is sysinfo (SI_PLATFORM, ...) which seems
to be a Solaris thing [2]. I think Linux has sysinfo but not
SI_PLATFORM.

Then it tries sysctl (...) with a UNAME_HARDWARE_PLATFORM macro to deal
with the BSDs. I think OpenBSD might be missing that definition in the
#ifdef, but I have no way of testing it at the moment [3].

I'm not sure if 'uname -i', 'uname -p', or 'uname -m' are supposed to
be any different from each other. Maybe someone who knows Solaris can
help more?

Illumios defines 'sysinfo' as an alias to 'systeminfo' [4]. There it
returns an 'extern char *platform' in the given buffer when passed
SI_PLATFORM [5]. I've found the definition of the variable, but I
don't know where it is actually set to something useful [6]. Hopefully
that information helps someone...

[1]
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=aaeb7a61c4662fd28cf2bc161b740352711538d2
[2]
https://github.com/illumos/illumos-gate/blob/cf618897f43ea305e3a426f93bbcef4e8106829c/usr/src/uts/common/sys/systeminfo.h#L86
[3] https://git.savannah.gnu.org/cgit/coreutils.git/tree/src/uname.c#n36
[4]
https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/common/sys/sysinfo.S
[5]
https://github.com/illumos/illumos-gate/blob/cf618897f43ea305e3a426f93bbcef4e8106829c/usr/src/uts/common/syscall/systeminfo.c#L124
[6]
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/conf/param.c#L533

Thanks for looking at this.
From the Fedora side, they dropped a Fedora specific patch for Fedora 38.

https://bugzilla.redhat.com/show_bug.cgi?id=548834
https://bugzilla.redhat.com/show_bug.cgi?id=2208048

cheers,
Pádraig.

bug#70231: Performance issue on sort with zero-sized pseudo files

2024-04-07 Thread Pádraig Brady


On 06/04/2024 23:22, Paul Eggert wrote:

On 2024-04-06 03:09, Pádraig Brady wrote:

I'll apply this.


Heh, I beat you to it by looking for similar errors elsewhere and
applying the attached patches to fix the issues I found. None of them
look like serious bugs.


Cool. I thought the sort(1) change worthy of a NEWS entry so pushed one.
Marking this as done.


BTW we should improve sort buffer handling in general


Oh yes.

PS. My current little task is to get i18n to work better with 'sort'.
Among other things I want Unicode-style full case folding.


Excellent, that will help keep the related uniq(1) and join(1)
commands more aligned in their ordering.

cheers,
Pádraig

bug#70219: Bug/Issue with timeout and signals

2024-04-06 Thread Pádraig Brady


tag 70219 notabug
close 70219
stop

On 06/04/2024 16:50, Branden R. Williams via GNU coreutils Bug Reports wrote:

   -k, --kill-after=DURATION
  also send a KILL signal if COMMAND is still running
this long after the initial signal was sent


If you read the above carefully, please note the words _also_ and _initial_.
I.e. -k will cause _another_ signal to be sent, iff the first doesn't terminate 
the command.

Hopefully this example with all pertinent options explains things.
I.e. the first signal sent after 1s is ignored,
and the second kill signal sent after another 2s forcefully kills the command.

  $ date +%s; timeout -k 2s -s0 1s sleep inf; date +%s
  1712427916
  Killed
  1712427919

cheers,
Pádraig

bug#70231: Performance issue on sort with zero-sized pseudo files

2024-04-06 Thread Pádraig Brady


On 06/04/2024 03:52, Takashi Kusumi wrote:

Hi,

I have found a performance issue with the sort command when used on
pseudo files with zero size. For instance, sorting `/proc/kallsyms`, as
demonstrated below, takes significantly longer than executing with
`cat`, generating numerous temporary files. I confirmed this issue on
v8.32 as well as on commit 8f3989d in the master branch.

$ time cat /proc/kallsyms | sort > /dev/null
real0m0.954s
user0m0.873s
sys 0m0.096s

$ time sort /proc/kallsyms > /dev/null
real0m8.555s
user0m3.367s
sys 0m5.064s

$ strace -e trace=openat sort /proc/kallsyms 2>&1 > /dev/null \
  | grep /tmp/sort | head -100
...
openat(AT_FDCWD, "/tmp/sortM6Y6Y1", ...
openat(AT_FDCWD, "/tmp/sortPrHKMG", ...

$ strace -e trace=openat -c sort /proc/kallsyms > /dev/null
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
100.006.419777  19333258 8 openat
-- --- --- - - 
100.006.419777  19333258 8 total

It appears that the buffer size allocated for pseudo files with zero
size is insufficient, likely because it is based on their file size,
which is zero. As seen in the attached patch, I think using
`INPUT_FILE_SIZE_GUESS` to calculate the buffer size when the file size
is zero would resolve this issue.


I'll apply this.

BTW we should improve sort buffer handling in general. From my TODO...

0. Have sort --debug output memory buffer sizes and space avail at $TMPDIR(s)
1. auto increase buffer when reading from pipe or zero sized files.
This will be more efficient and more importantly enable parallel operation.
See http://superuser.com/questions/938558/sort-parallel-isnt-parallelizing/
At least your more appropriate default buffer sizes in this case.
I.e. bigger mins and probably smaller maxs as half avail mem is too aggressive.
2. check() should not need full buffer size?
only merge buffer size or something small at least.
3. Look at minimizing the amount of mem used by default.
Hmm, sort auto adjusts down to avail mem in initbuf() (Test with ulimit -v)
4. Careful with too small buffers as that may initiate
an extra merge step (see section above).

If anyone wants to look at the above give me a heads up,
or I'll get to it sometime in the next release cycle.

thanks!
Pádraig.

bug#70126: Missing a full stop in a coreutils-9.5-pre2 message

2024-04-01 Thread Pádraig Brady


On 01/04/2024 16:30, Petr Pisar wrote:

Hello,

while translating coreutils-9.5-pre2 I noticed this message:

#: src/chown.c:123
msgid ""
"  --reference=RFILE  use RFILE's ownership rather than specifying values\n"
" RFILE is always dereferenced if a symbolic link.\n"

I believe that there is missing a full stop after "values" on the first line.


Fixed.
Marking this as done.

thanks!
Pádraig

bug#70104: "FAIL: test-canonicalize" coreutils v9.5 on musl libc

2024-03-31 Thread Pádraig Brady


On 31/03/2024 10:02, Adept's Lab wrote:

test-canonicalize.c:411: assertion 'strcmp (result1, "//") == 0' failed

^ the only error log message I get. Fail was not presented with previous
stable versions.


This is on musl 1.1.24 as detailed at:
https://github.com/coreutils/coreutils/issues/83

CC'ing bug-gnulib

cheers,
Pádraig

bug#70070: FAIL: test-localtime_r

2024-03-29 Thread Pádraig Brady


On 29/03/2024 12:40, Andreas Schwab wrote:

FAIL: test-localtime_r
==

test-localtime_r.c:58: assertion 'result->tm_hour == 18' failed
FAIL test-localtime_r (exit status: 134)

FAIL: test-localtime_r-mt
=

thread2 disturbed by thread1!
thread1 disturbed by thread2!
FAIL test-localtime_r-mt (exit status: 134)



CC'ing gnulib as those tests are from gnulib.
It may help to detail your platform.

thanks,
Pádraig.

bug#69995: Untranslatable string

2024-03-25 Thread Pádraig Brady


On 25/03/2024 14:02, Göran Uddeborg wrote:

While translating the new version of coreutils to Swedish, I came
across this code from the end of chown-core.h:

   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the %sgroup of each file only if\n\
  its current owner and/or group match those 
specified\n\
  here.  Either may be omitted, in which case a match\n\
  is not required for the omitted attribute\n\
"), user ? "owner and/or " : "");

As this stands, the part "owner and/or " will not be translated.

Simply marking that part too for translation is not a good solution, even
if it would happen to work for Swedish. See
https://www.gnu.org/savannah-checkouts/gnu/gettext/manual/gettext.html#Entire-sentences


Fixed with:
https://github.com/coreutils/coreutils/commit/26fd96a96

thanks,
Pádraig

bug#69985: Untranslatable message in src/chown-core.h:95

2024-03-24 Thread Pádraig Brady


On 24/03/2024 16:57, Frédéric Marchal wrote:

Hi,

In src/chown-core.h:95 (coreutils-9.5-pre1), the following message is
impossible to translate as it is created by concatenating string fragments:

static inline void
emit_from_option_description (bool user)
{
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the %sgroup of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"), user ? "owner and/or " : "");
}

A translatable message must *never ever* be produced by concatenating
substrings. Even if the substrings are themselves translated, there is no
guaranties that the order of the words is the same in every language.

There is, unfortunately, no easy solution that save keystrokes. It must be
written like this:

if (user)
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the owner and/or group of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"));
else
   printf (_("\
   --from=CURRENT_OWNER:CURRENT_GROUP\n\
  change the group of each file only if\n\
  its current owner and/or group match those
specified\n\
  here.  Either may be omitted, in which case a
match\n\
  is not required for the omitted attribute\n\
"));

That make sense as both are distinct messages with different purposes.

I also suspect that the two messages should be even more different than they
currently are. The message, when user is false, suspiciously contains one more
"owner and/or" that is not removed and it says "Either may be omitted" when
only the group is mentioned at the beginning of the message.


The composed strings are accurate, but yes not translatable.
The attached simplifies the description a little,
so there is a single string to cater for both chown and chgrp.

Marking this as done.

thanks,
Pádraig
From baea9881f79690552b0aa3ca6844bbf64fa6e55e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sun, 24 Mar 2024 20:12:53 +
Subject: [PATCH] doc: fix translation issue in chown/chgrp amalgamation

* src/chown-core.h (emit_from_option_description): The conditional
string composition here caused issues for translators.
Instead move to a more general description ...
(src/chown.c (usage): ... here.
Fixes https://bugs.gnu.org/69985
---
 src/chown-core.h | 12 
 src/chown.c  |  8 +++-
 2 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/src/chown-core.h b/src/chown-core.h
index e1396e3ea..4bd68fed4 100644
--- a/src/chown-core.h
+++ b/src/chown-core.h
@@ -89,16 +89,4 @@ chown_files (char **files, int bit_flags,
  struct Chown_option const *chopt)
   _GL_ATTRIBUTE_NONNULL ();
 
-static inline void
-emit_from_option_description (bool user)
-{
-  printf (_("\
-  --from=CURRENT_OWNER:CURRENT_GROUP\n\
- change the %sgroup of each file only if\n\
- its current owner and/or group match those specified\n\
- here.  Either may be omitted, in which case a match\n\
- is not required for the omitted attribute\n\
-"), user ? "owner and/or " : "");
-}
-
 #endif /* CHOWN_CORE_H */
diff --git a/src/chown.c b/src/chown.c
index 90ce84d67..3194dc75f 100644
--- a/src/chown.c
+++ b/src/chown.c
@@ -109,7 +109,13 @@ With --reference, change the group of each FILE to that of RFILE.\n\
  (useful only on systems that can change the\n\
  ownership of a symlink)\n\
 "), stdout);
-  emit_from_option_description (chown_mode == CHOWN_CHOWN);
+  fputs (_("\
+  --from=CURRENT_OWNER:CURRENT_GROUP\n\
+ change the ownership of each file only if its\n\
+ current owner and/or group match those specified here.\n\
+ Either may be omitted, in which case a match\n\
+ is not required for the omitted attribute\n\
+"), stdout);
   fputs (_("\
   --no-preserve-root  do not treat '/' specially (the default)\n\
   --preserve-rootfail to operate recursively on '/'\n\
-- 
2.44.0

bug#69979: [PATCH-v2][doc] ls -l doc and size -> maj, min for device files

2024-03-24 Thread Pádraig Brady


On 24/03/2024 12:40, Stephane Chazelas wrote:

Tags: patch

My bad, the patch was incorrect, it should have said
"replaced by the corresponding device major and minor numbers as two decimal
numbers separated by a comma and at least one space.", as
there's not always only once space between the major and minor.

See also
https://unix.stackexchange.com/questions/773014/where-do-i-find-documentation-for-the-output-of-ls-l
where the issue was reported and where the variations of output
formats between ls implementations is being mentioned.

New patch attached.


Pushed.
Marking this as done.

thanks,
Pádraig

bug#69951: coreutils: printf formatting bug for nb_NO and nn_NO locales

2024-03-23 Thread Pádraig Brady


tag 69951 notabug
close 69951
stop

On 22/03/2024 20:22, Thomas Dreibholz wrote:

Hi,

I just discovered a printf bug for at least the nb_NO and nn_NO locales
when printing numbers with thousands separator. To reproduce:

#!/bin/bash
for l in de_DE nb_NO ; do
     echo "LC_NUMERIC=$l.UTF-8"
     for n in 1 100 1000 1 10 100 1000 ; do
    LC_NUMERIC=$l.UTF-8 /usr/bin/printf "<%'10d>\n" $n
     done
done

The expected output of "%'10d" is a right-formatted number string with
10 characters.

The output of the test script is fine for e.g. LC_NUMERIC=de_DE.UTF-8
and LC_NUMERIC=en_US.UTF-8:

LC_NUMERIC=de_DE.UTF-8
< 1>
<   100>
< 1.000>
<    10.000>
<   100.000>
< 1.000.000>
<10.000.000>



However, for LC_NUMERIC=nb_NO.UTF-8 and LC_NUMERIC=nn_NO.UTF-8, the
formatting is wrong:

LC_NUMERIC=nb_NO.UTF-8
< 1>
<   100>
<   1 000>
<  10 000>
< 100 000>
<1 000 000>
<10 000 000>



I reproduced the issue with coreutils-8.32-4.1ubuntu1.1 (Ubuntu 22.04)
as well as coreutils-9.3-5.fc39.x86_64 (Fedora 39).

Under FreeBSD 14.0-RELEASE (coreutils-9.4_1), the output looks slightly
better but is still wrong:

LC_NUMERIC=nb_NO.UTF-8
< 1>
<   100>
<    1 000>
<   10 000>
<  100 000>
<1 000 000>
<10 000 000>
LC_NUMERIC=nn_NO.UTF-8
< 1>
<   100>
<    1 000>
<   10 000>
<  100 000>
<1 000 000>
<10 000 000>

May be the issue is that the thousands separator for the Norwegian
locales is a space " ", while it is "."/"," for German/US English locales.


The issue looks to be that the thousands separator for Norwegian locales
is “NARROW NO-BREAK SPACE", or more problematically the _three_ byte
UTF8 sequence E2 80 AF. So it looks like an issue with libc routines
counting bytes rather than characters in this case.

One suggestion is to do the alignment after. For example:

$ export LC_NUMERIC=nb_NO.UTF-8
$ printf "%'.f\n" $(seq -f '1E%.f' 7) | column --table-right=1 -t
10
   100
 1 000
10 000
   100 000
 1 000 000
10 000 000

Actually I've just noticed that specifying the %'10.f format
does count characters and not bytes! So another solution is:

$ export LC_NUMERIC=nb_NO.UTF-8
$ printf "%'10.f\n" $(seq -f '1E%.f' 7)
10
   100
 1 000
10 000
   100 000
 1 000 000
10 000 000

The issue if there is one is in libc at least.
It would be worth checking existing glibc reports about this
and reporting if not mentioned.

cheers,
Pádraig.

bug#69939: test utility bug: "test -a -a -a" and "test -o -o -o" fail

2024-03-22 Thread Pádraig Brady


On 22/03/2024 11:20, Vincent Lefevre wrote:

With GNU Coreutils 9.4, both "test -a -a -a" and "test -o -o -o" fail:

$ export POSIXLY_CORRECT=1
$ /usr/bin/test -a -a -a ; echo $?
/usr/bin/test: ‘-a’: unary operator expected
2
$ /usr/bin/test -o -o -o ; echo $?
/usr/bin/test: ‘-o’: unary operator expected
2

According to POSIX, they should return 0.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
says for 3 arguments:

   If $2 is a binary primary, perform the binary test of $1 and $3.

Here, $2 is -a and -o respectively, which are binary primaries.
And both $1 and $3 are non-null strings.


Agreed. Any leading '-' triggers this:

  $ env test -. -a -.
  test: ‘-.’: unary operator expected

Note we already advise (as does POSIX) to avoid '-a' and '-o':
https://debbugs.gnu.org/22909

BTW POSIX advises to use parenthesis to avoid some parsing ambiguities,
and it helps in this case too:

  $ env test \( -a \) -a \( -a \); echo $?
  0

Though I see bash 5.2.26 has issue with it :/

  $ test \( -a \) -a \( -a \)
  bash: test: `)' expected

bash doesn't like the -a in particular, and is ok with other strings:

  $ test \( -. \) -a \( -. \); echo $?
  0

cheers,
Pádraig

bug#69807: questioning automatic -i in multicolumn pr

2024-03-21 Thread Pádraig Brady


On 14/03/2024 20:31, Douglas McIlroy wrote:

Multicolumn options in pr imply option -i (tabification). The introduction
of tabs with physical rather than logical meaning makes output that is OK
for viewing only if you have correct tab stops, and is complicated for
further processing.  It caters for obsolete equipment--typewriters, on
which tabbing was appreciably faster than spacing.

As a wish-list item I propose abolishing implicit tabification. A second
choice (that doesn't buck Posix) is to provide an option to suppress
implicit tabification. At a bare minimum, document a workaround for the
inconvenient tabs, e.g. post-processing with pr -t -e.


Good call on the documentation.  I'll add this now:

commit 91e69cd2d02f015fc296e02388e0b18a293faa56 (HEAD -> master)
Author: Pádraig Brady 
Date:   Thu Mar 21 15:26:48 2024 +

doc: pr: give solution to expanding TABs in multicolumn output

* doc/coreutils.texi (pr invocation): Explicitly state that
multicolumn output will convert spaces to TABs, and show that
this can be undone with the `pr -t -e` or `expand` commands.
Suggested by Douglas McIlroy in https://bugs.gnu.org/69807

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index e36269588..37d729089 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -2636,9 +2636,11 @@ This option might well cause some lines to be truncated.>
 lines in the columns on each page are balanced.  The options @option{-e}
 and @option{-i} are on for multiple text-column output.  Together with
 @option{-J} option column alignment and line truncation is turned off.
+Since spaces are converted to TABs in multicolumn output, they can be converted
+back by further processing through @command{pr -t -e} or @command{expand}.


thanks,
Pádraig

bug#10311: RFE: Give chmod a "-h" option as well

2024-03-20 Thread Pádraig Brady


On 16/12/2011 16:29, Jan Engelhardt wrote:

Hi,

chown(1) has a -h option by which it affects symlinks directly rather
than the pointed-to file. The bonus side effect is that the
pointed-to files don't get changed in any way, which is kinda welcome
if you attempt to "fix" permissions/ownership in a directory where an
evil user could create a symlink to e.g. /etc/shadow.

Attempting chmod -R g+w /home/groups/evilgroup is still a risk, and
would necessity a more long-winded command involving find(1). It
would therefore be welcome that chmod receive an -h option that just
skips over them (besides perhaps attempting to change their
permissions as well).


Pushed at
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v9.4-162-g07a69fc3b

Marking as done.

cheers,
Pádraig.

bug#11108: [PATCH] chmod: fix symlink race condition

2024-03-20 Thread Pádraig Brady


On 28/03/2012 21:28, Paul Eggert wrote:

On 03/28/2012 01:13 PM, Jim Meyering wrote:

 $ ./chmod u+w f
 ./chmod: changing permissions of 'f': Operation not supported


Yeouch.  I undid the change for now.
Hmm, why did "make check" work for me?
I'll have to investigate later, alas.


Patch for this pushed at:
https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=v9.4-163-g425b8a2f5

Marking this as done.

cheers,
Pádraig.

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-17 Thread Pádraig Brady

On 17/03/2024 11:32, Pádraig Brady wrote:

On 17/03/2024 06:10, Paul Eggert wrote:

On 2024-03-05 06:16, Pádraig Brady wrote:

I think I'll remove the as yet unreleased mv --swap from coreutils,
given that
util-linux is as widely available as coreutils on GNU/Linux platforms.

Although removing that "mv --swap" implementation was a win, I don't
think we can simply delegate this to util-linux's exch command.
Exchanging files via a renameat-like call is not limited to the Linux
kernel; it's also possible on macOS via renameatx_np with RENAME_SWAP,
and there have been noises about adding similar things to other
operating systems.

I just now added support for macOS renameatx_np to Gnulib, so coreutils
does not need to worry about the macOS details; it can simply use
renameatu with the Linux flags. See:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=af32ee824ee18255839f9812b8ed61aa5257a82b

Even with Linux it's dicey. People may have older util-linux installed
and so lack the 'exch' utility; this is true for both Fedora 39 and
Ubuntu 23.10, the current releases. Ubuntu is also odd in that it
doesn't install all the util-linux utilities as part of the util-linux
package, so it's not clear what they will do with 'exch'.

So I propose that we implement the idea in coreutils in a better way,
that interacts more nicely with -t, -T, etc. Also, I suggest using the
Linuxish name "--exchange" instead of the macOSish name "--swap", and
(for now at least) not giving the option a single-letter equivalent as I
expect it to be useful from scripts, not interactively.

After looking at various ways to do it I came up with the attached
proposed patch. This should work on both GNU/Linux and macOS, if your OS
is recent enough and the file system supports atomic exchange.

The implementation looks good.

Re exch(1) on macos, I see util-linux is on homebrew,
so it would be relatively easy to ifdef renameatx_np in util-linux also.

I think the --no-copy situation is brittle, as scripts not using it now
would be atomic, but then if we ever supported cross fs swaps
it may become non atomic. I'd at least doc with a line in the --exchange
description in usage() to say something like:
"Use --no-copy to enforce atomic operation"

While the most flexible, it's also quite awkward to need
`mv --exchange --no-copy --no-target-directory` for most uses.
I.e. it's tempting to imply the --no-... options with --exchange,
but I suppose since scripting is the primary use case for this
flexibility trumps conciseness, so I'm ok with the verbosity I think.

Oh also in the texinfo I think it's important to mention that the swap
will "exchange all data and metadata". That's not obvious otherwise.
For example users may be wondering if only data was being exchanged
with the macos exchangedata(2) or equivalent.

cheers,
Pádraig

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-17 Thread Pádraig Brady

On 17/03/2024 06:10, Paul Eggert wrote:

On 2024-03-05 06:16, Pádraig Brady wrote:

I think I'll remove the as yet unreleased mv --swap from coreutils,
given that
util-linux is as widely available as coreutils on GNU/Linux platforms.

I just now added support for macOS renameatx_np to Gnulib, so coreutils
does not need to worry about the macOS details; it can simply use
renameatu with the Linux flags. See:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=af32ee824ee18255839f9812b8ed61aa5257a82b

The implementation looks good.

Re exch(1) on macos, I see util-linux is on homebrew,
so it would be relatively easy to ifdef renameatx_np in util-linux also.

thanks,
Pádraig

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-15 Thread Pádraig Brady


On 15/03/2024 05:21, Paul Eggert wrote:

On 2024-03-14 06:03, Pádraig Brady wrote:



How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
is used when the compiler supports -O1? That way we don't have to worry
about running the program, because (with the "volatile") clang will
error out.

Alternatively perhaps there's some way to check for the bug using
preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
__clang_major__, and __aarch64__. (This could be more fragile, though,
as clang presumably will fix the bug eventually.)


That would probably work for this edge case, but it's brittle
and I'm worried about other combinations of
compiler versions, complier options, and architectures.


Sure, but one cannot resolve such worries completely, as compiler errors
are inherently brittle: even a runtime test cannot prove their absence.

As I understand it, __bf16 is a real zoo: sometimes hardware supports
it, sometimes it doesn't, sometimes the software emulation library is
there, sometimes it's not, and so forth. Running a program on a
development host is likely to not match the runtime behavior on a
production host, so AC_RUN_IFELSE is reasonably likely to produce a
false positive, which is worse than a false negative.

Another issue, which is somewhat related, is that coreutils's current
macro BF16_SUPPORTED is too broad. Because __bf16 has so many bugs,
currently it's probably a mistake to say __bf16 is fully supported
anywhere. Even GCC is buggy; see for example:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114347

which I just now discovered when writing this email.


Nice one! (though that should not impact od)


The question isn't
so much whether __bf16 is supported, it's whether it's supported well
enough to compile and run GNU od. >

Come to think of it, for 'od' we might be better off avoiding __bf16
entirely, and simply converting the bits by hand into a 'float'
variable. This would likely be more portable, as it would work even with
compilers that lack __bf16, or that have __bf16 bugs relevant to 'od'.
And that way we could remove the configure-time test entirely. (There
would be endianness issues but they're easily addressible.)


Right.

My thinking though was that it was nice code wise to treat __bf16 like
_Float16 (and like float etc.).
More importantly I only see __bf16 support improving going forward,
so preferred to keep the code simpler (and the configure check robust).
Currently both the main compilers, and main architectures are supported
(when not cross-compiling), and tested quite robustly in the configure check.
Another thing that dissuaded me from implementing the conversions ourselves
is the plethora of floating point config options for compilers,
which made me think in general that floating point conversion
was best left to the compiler.

I've just now realized that AC_RUN_IFELSE needs an explicit default
for cross-compiling, and I've just pushed a conservative default
(which may default the other way in future when support broadens).

thanks,
Pádraig.

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-14 Thread Pádraig Brady


On 14/03/2024 13:35, Collin Funk wrote:

On 3/14/24 6:03 AM, Pádraig Brady wrote:

It would disable this feature for cross-compilation yes,
but this isn't the first instance of AC_RUN_IFELSE we use.


For completeness I should add that the above check can be
overridden if cross-compiling or whatever like:

  ./configure utils_cv_ieee_16_bit_supported=yes 
utils_cv_brain_16_bit_supported=yes


Sorry if this is not the proper place to ask, but would it be possible
to make Autoconf use an emulator when cross-compiling? This issue
would be *less* of a problem in that case.

I am more familiar with CMake where check_c_source_runs is used
performs a similar task [1]. It is essentially a wrapper function
around try_run [2]. CMake allows you to set
CMAKE_CROSSCOMPILING_EMULATOR which will make try_run use an emulator
to run programs instead [3].

I've used this with Wine and QEMU User a few times and it makes
CMake's configure tests work fine (though much slower). If this
functionality seems beneficial then I can look into adding it to
Autoconf.

[1] https://cmake.org/cmake/help/latest/module/CheckCSourceRuns.html
[2] 
https://github.com/Kitware/CMake/blob/4285dec5f012260337193f29f18899d6cf2536a4/Modules/Internal/CheckSourceRuns.cmake#L93
[3] 
https://cmake.org/cmake/help/latest/variable/CMAKE_CROSSCOMPILING_EMULATOR.html


Interesting.
Perhaps this is something that could be configured separately
on the build system, through something like binfmt_misc ?

cheers,
Pádraig

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-14 Thread Pádraig Brady


On 14/03/2024 05:59, Paul Eggert wrote:

On 2024-03-12 19:24, Grisha Levit wrote:

- AC_COMPILE_IFELSE(
+ AC_RUN_IFELSE(


This sort of change would break cross-compilation, no?


It would disable this feature for cross-compilation yes,
but this isn't the first instance of AC_RUN_IFELSE we use.


How about leaving it AC_COMPILE_IFELSE, but ensuring that -O1 or better
is used when the compiler supports -O1? That way we don't have to worry
about running the program, because (with the "volatile") clang will
error out.

Alternatively perhaps there's some way to check for the bug using
preprocessor macros like __FLT16_MANT_DIG__, __FLT16_MAX_EXP__,
__clang_major__, and __aarch64__. (This could be more fragile, though,
as clang presumably will fix the bug eventually.)


That would probably work for this edge case, but it's brittle
and I'm worried about other combinations of
compiler versions, complier options, and architectures.

For example sse availability is in the mix here,
so this will now correctly avoid this feature with:
  CFLAGS=-mno-sse ./configure --quiet
Whereas previously it would have built but produced wrong values.

cheers,
Pádraig.

bug#68708: [PATCH] env,kill: Handle unnamed signals

2024-03-13 Thread Pádraig Brady


On 25/01/2024 19:52, Grisha Levit wrote:

On Thu, Jan 25, 2024, 09:50 Pádraig Brady  wrote:
This mostly looks good, except:

- No need to clear the errno before kill(3).
- Better to use SIG%d rather than the bare %d for signal _names_, as we already 
parse this format


Makes sense, done below.

* src/operand2sig.c (operand2sig): Drop signame argument, accept all
signal numbers <= SIGNUM_BOUND.  All callers updated.
* src/env.c (parse_signal_action_params, reset_signal_handlers)
(parse_block_signal_params, set_signal_proc_mask)
(list_signal_handling): Accept all signal numbers <= SIGNUM_BOUND,
use SIG%d for printing if necessary.
* src/kill.c (list_signals, main): Likewise.
(send_signals): Check errno from kill(3) for bad signo.
* src/timeout.c (main): Update operand2sig call.
* tests/misc/kill.sh: Test listing all signal numbers.


I've made a few adjustments:

- Clarified the commit message.

- I've gone back to have kill -l produce a bare number for unnamed signals, 
because:
this is consistent with util-linux,
SIG%d is only parseable by coreutils (not util-linux or bash),
easier to programatically determine if a name is defined.
I've left as SIG%d for -t output as that's a coreutils specific option,
and not really programatically consumable anyway.

- I've added a validation check for `env --block=32` so that it fails
like `env --default=32` and `env --ignore=32`.
I.e. exits with EXIT_CANCELED.

- Added a NEWS entry.

I'll apply this later.

thanks!
PádraigFrom b8d1b00e21a86270fbbe5903a41319894db266df Mon Sep 17 00:00:00 2001
From: Grisha Levit 
Date: Thu, 25 Jan 2024 14:52:50 -0500
Subject: [PATCH] env,kill,timeout: support unnamed signals

Some signals with values less that the max signal number for the system
do not have defined names.  For example, currently on amd64 Linux,
signals 32 and 33 do not have defined names, and Android has a wider
gap of undefined names where it reserves some realtime signals.

Previously the signal listing in env ended up reusing the name
of the last printed valid signal (the repeated HUP below):

$ env --list-signal-handling true
HUP( 1): IGNORE
HUP(32): BLOCK
HUP(38): IGNORE

..and the corresponding signal numbers were rejected as operands for the
env, kill, and timeout commands.

This patch removes the requirement that sig2str returns 0 for a signal
number associated with an operand.  This allows unnamed signals to be in
the sets `env' attempts to manipulate when a --*-signal option is used
with no argument, and kill(1) and timeout(1) to send such unnamed
signals.

* src/operand2sig.c (operand2sig): Drop signame argument, accept all
signal numbers <= SIGNUM_BOUND.  All callers updated.
* src/env.c (parse_signal_action_params, reset_signal_handlers)
(parse_block_signal_params, set_signal_proc_mask)
(list_signal_handling): Accept all signal numbers <= SIGNUM_BOUND,
use SIG%d for printing if necessary.
* src/kill.c (list_signals, main): Likewise.
(send_signals): Check errno from kill(3) for bad signo.
* src/timeout.c (main): Update operand2sig call.
* tests/misc/kill.sh: Test listing all signal numbers.
* NEWS: Mention the improvement.
---
 NEWS   |  3 +++
 src/env.c  | 29 ++---
 src/kill.c | 24 
 src/operand2sig.c  |  8 
 src/operand2sig.h  |  2 +-
 src/timeout.c  |  3 +--
 tests/misc/kill.sh | 10 +-
 7 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/NEWS b/NEWS
index 20cadf183..f21efc7c0 100644
--- a/NEWS
+++ b/NEWS
@@ -92,6 +92,9 @@ GNU coreutils NEWS-*- outline -*-
   This was previously 128KiB and increasing to 256KiB was seen to increase
   throughput by 10-20% when reading cached files on modern systems.
 
+  env,kill,timeout now support unnamed signals. kill(1) for example now
+  supports sending such signals, and env(1) will list them appropriately.
+
   SELinux operations in file copy operations are now more efficient,
   avoiding unneeded MCS/MLS label translation.
 
diff --git a/src/env.c b/src/env.c
index 73d9847f4..ed6628f8f 100644
--- a/src/env.c
+++ b/src/env.c
@@ -538,7 +538,6 @@ parse_split_string (char const *str, int *orig_optind,
 static void
 parse_signal_action_params (char const *arg, bool set_default)
 {
-  char signame[SIG2STR_MAX];
   char *opt_sig;
   char *optarg_writable;
 
@@ -548,8 +547,7 @@ parse_signal_action_params (char const *arg, bool set_default)
  Some signals cannot be set to ignore or default (e.g., SIGKILL,
  SIGSTOP on most OSes, and SIGCONT on AIX.) - so ignore errors.  */
   for (int i = 1 ; i <= SIGNUM_BOUND; i++)
-if (sig2str (i, signame) == 0)
-  signals[i] = set_default ? DEFAULT_NOERR : IGNORE_NOERR;
+signals[i] = set_default ? DEFAULT_NOERR : IGNORE_NOERR;
   return;
 }
 
@@ -558,7 +556,7 @@ parse_signal_action_params (char

bug#69770: [PATCH] build: strengthen 16 bit float support checks

2024-03-13 Thread Pádraig Brady


On 13/03/2024 02:24, Grisha Levit wrote:

Recent clang provides __bf16 on aarch64 but it is broken.

If built with -O0, the conversion is wrong:

 $ printf '\x3F\x80' | od --end=big -An -tfB | tr -d ' '
 1.875

If built with -O1 or higher, compilation fails:

 fatal error: error in backend: Cannot select: 0xb47a58d29780: f32 = 
fp_extend 0xb47a58d31720
   0xb47a58d31720: bf16,ch = CopyFromReg 0xb47b78c53720, 
Register:bf16 %13
 0xb47a58d29470: bf16 = Register %13
 In function: print_bfloat

The latter issue does not cause the existing configure test to fail
because the promotion is optimized out.

* configure.ac: Ensure 16 bit float promotion code does not get
optimized out, and produces an expected result.


Looks good!

I'll follow up with another patch to all these involved checks to be cached / 
bypassed,
by wrapping them in AC_CACHE_VAL().

Marking this as done.

thanks!
Pádraig

bug#69724: [PATCH] dircolors: add more archive extensions

2024-03-11 Thread Pádraig Brady


I'm erring on the side of applying this,
though I'm a bit wary of an endlessly expanding list,
as there is no end to what can be compressed for example.

cheers,
Pádraig

bug#69636: Re: [PATCH] Improve quality of format-checking code for seq.

2024-03-08 Thread Pádraig Brady


tag 69636 notabug
close 69636
stop

On 08/03/2024 12:29, User wrote:


Jim Meyering  wrote:


Paul Eggert  wrote:

* src/seq.c (validate_format): Remove. Migrate its checks into...
(long_double_format): Report an error and exit if an error is found,
instead of returning NULL.  All callers changed.
Use a more-consistent format for diagnostics.
* tests/misc/seq: Adjust to the more-consistent format for diagnostics.
---
   src/seq.c  |   71

+++

   tests/misc/seq |   10 
   2 files changed, 25 insertions(+), 56 deletions(-)


Thanks.
As far as I can see this is a "no semantic change"
patch (modulo diagnostics), and it looks fine, so I pushed it.


Dear Paul Eggert and Jim Meyering,


Thank you very much for the contribution and care! I hope you have
awesome and cozy day!

It is a nice addition to check the format in the most and explicitly
safe cases, I believe, however it disallows to use it in cases when you
need to just repeat a character without ANY processing directive! For
example, `a=10; seq -f 'x' -s '' -- "$a";` would print 10 characters "x"
without the check added prior and significantly (or little but still)
support development in long codes which would require explicit loops or
additional dependencies like awesome Perl.

Is it possible to reconsider the check or add an option to ignore the
format which does not have a required directive? If you know a better
alternative to the standard utility which would result in the same small
code but great result, please suggest if possible!

Again, appreciated, and please stay safe!


This suggestion has some merit,
but there are a few ways to generate repetitive data already:

  a=10

  seq $a | xargs printf -- 'x%.0s'

  printf "%${a}s" '' | tr ' ' x

  yes x | head -n$a | tr -d '\n'

Also if only outputting to a terminal, then you can get the
repetition to happen within the terminal code like:

  printf -- "x\e[${a}b"

Therefore I'm not sure it's worth relaxing the validation in seq.

thanks,
Pádraig

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-05 Thread Pádraig Brady


On 05/03/2024 04:10, Paul Eggert wrote:

On 3/4/24 16:43, Dominique Martinet wrote:

Adding Rob to the loop because this impacts compatibility with
toybox/maybe busybox implementations


Busybox does not use RENAME_EXCHANGE, so this isn't a Busybox issue.

Toybox mv added -x to its development version yesterday:

https://github.com/landley/toybox/commit/a2419ad52d489bf1a84a9f3aa73afb351642c765

so there's little prior art there, and there's still plenty of time to
fix its problems before exposing it to the world.



I also see --swap mostly used by scripts and this actually feels a bit
dangerous to me -- I'd *always* use this with -T.


Yes, it's a problem.

By "see --swap mostly used by scripts" I assume you mean scripts that
haven't been written yet, assuming that nobody had -x until yesterday



(by the way, what's this "rename" command you speak of?


https://mirrors.edge.kernel.org/pub/linux/utils/util-linux/

Now that I've looked into it further, util-linux already has an "exch"
command that does exactly what you want. This is the command that toybox
should implement rather than try to simulate it with "mv -x" (which
causes all sorts of problems).

That is, toybox should revert yesterday's change to "mv", and should
implement "exch" instead.


I think having the functionality in mv(1) is better than in rename(1),
but since exch(1) is already released that's probably
the best place for this functionality now.

A separate exch command may be overkill for just this,
but perhaps related functionality might be added to that command in future.
For e.g. some of the discussed functionality for a "replace" command
might reside there.

So I think I'll remove the as yet unreleased mv --swap from coreutils, given 
that
util-linux is as widely available as coreutils on GNU/Linux platforms.

cheers,
Pádraig

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-04 Thread Pádraig Brady


On 04/03/2024 15:47, Pádraig Brady wrote:

On 04/03/2024 00:44, Paul Eggert wrote:

Although I like the idea of exposing file swaps to the user, the first
cut of 'mv -x' has significant problems.

I expect 'mv -x A B' to act like 'mv A B' except the destination must
exist and is renamed back to A. However, this is not true for 'mv -x A
B' when B is a directory; it renames B to A rather than renaming B/A to
A as I expect. That is, 'mv -x' acts as if -T (--no-target-directory) is
also specified. There is no way to get mv's traditional behavior, or to
get mv -t behavior.

To fix this, 'mv -x' should respect the usual mv behavior with respect
to directories. For example, when D is a directory 'mv -x A B C D'
should act like 'mv A B C D' except that D's old entries should be
renamed back to A B and C. And the -t and -T options should work with -x
the same way they work when -x is not specified.

This needs to happen before the next coreutils release, to avoid
confusion about 'mv -x'.


So you're saying the current mv -x behavior should only be with -T explicitly 
specified.
With the current implementation, we should at least document that -x implies -T.

By changing as you suggest, we'd not be adding any significant functionality,
but potentially reducing confusion by following mv traditional arg handling.
I agree with you I think, but not strongly.

It's worth stating though that if users want to swap 2 directories
they'd have to `mv -Tx d1 d2`, which may also be a little confusing.

The use case we don't currently support directly is:
cd source_dir && mv -x * "$dest_dir"
Instead that currently requires:
for f in *; do mv -x "$f" "$dest_dir"; done

For completeness, stating disadvantages of pulling this use case within mv,
is that by requiring the for loop for this would allow users
to programatically determine which swap failed, and I see --swap
as primarily a programatic interface used by scripts.
Also if we made this change, We'd have to document that `mv -x 1 2 ... d`
was not atomic over the whole set.

I will look at making the change before the upcoming release.


Another point worth mentioning before changing this,
is that changing would make the --swap operation non symmetric.
I.e. `mv -x a d` would be different to `mv -x d a` where d in a directory.

cheers,
Pádraig

p.s. Re dir swapping I noticed that specifying '/' or '.' dirs always gives 
EBUSY:
  $ mv -x . .
  mv: swap of '.' and '.' failed: Device or resource busy

bug#69546: cksum: inconsistent handling of invalid length values

2024-03-04 Thread Pádraig Brady


On 04/03/2024 15:44, Daniel Hofstetter wrote:

Hi,

When specifying an invalid length value followed by a valid length
value I get the following error:

$ printf "hello" | cksum --algo=blake2b --length=12 --length=8
cksum: invalid length: ‘12’
cksum: length is not a multiple of 8

However, if the invalid length value is a multiple of 8 and greater
than 512 (the maximum digest length for blake2b), there is no error:

$ printf "hello" | cksum --algo=blake2b --length=123456 --length=8
BLAKE2b-8 (-) = 29

I think the behavior should be the same in the two scenarios, whether
it's showing an error or ignoring the invalid value.

I'm using coreutils 9.4.


I pushed a fix at:
https://github.com/coreutils/coreutils/commit/fea833591

Now only the last used --length is validated.

Marking this as done.

cheers,
Pádraig

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-04 Thread Pádraig Brady


On 04/03/2024 00:44, Paul Eggert wrote:

Although I like the idea of exposing file swaps to the user, the first
cut of 'mv -x' has significant problems.

I expect 'mv -x A B' to act like 'mv A B' except the destination must
exist and is renamed back to A. However, this is not true for 'mv -x A
B' when B is a directory; it renames B to A rather than renaming B/A to
A as I expect. That is, 'mv -x' acts as if -T (--no-target-directory) is
also specified. There is no way to get mv's traditional behavior, or to
get mv -t behavior.

To fix this, 'mv -x' should respect the usual mv behavior with respect
to directories. For example, when D is a directory 'mv -x A B C D'
should act like 'mv A B C D' except that D's old entries should be
renamed back to A B and C. And the -t and -T options should work with -x
the same way they work when -x is not specified.

This needs to happen before the next coreutils release, to avoid
confusion about 'mv -x'.


So you're saying the current mv -x behavior should only be with -T explicitly 
specified.
With the current implementation, we should at least document that -x implies -T.

By changing as you suggest, we'd not be adding any significant functionality,
but potentially reducing confusion by following mv traditional arg handling.
I agree with you I think, but not strongly.

It's worth stating though that if users want to swap 2 directories
they'd have to `mv -Tx d1 d2`, which may also be a little confusing.

The use case we don't currently support directly is:
cd source_dir && mv -x * "$dest_dir"
Instead that currently requires:
for f in *; do mv -x "$f" "$dest_dir"; done

For completeness, stating disadvantages of pulling this use case within mv,
is that by requiring the for loop for this would allow users
to programatically determine which swap failed, and I see --swap
as primarily a programatic interface used by scripts.
Also if we made this change, We'd have to document that `mv -x 1 2 ... d`
was not atomic over the whole set.

I will look at making the change before the upcoming release.

cheers,
Pádraig

bug#69488: tr (question)

2024-03-01 Thread Pádraig Brady


On 01/03/2024 15:33, lacsaP Patatetom wrote:

hi,

I did a few tests with tr and I'm surprised by the results...

$ echo éèçà
éèçà

these characters are encoded in utf-8 on 2 bytes :

$ echo éèçà | xxd
: c3a9 c3a8 c3a7 c3a0 0a   .

now I use tr to remove non-printable characters :

$ echo éèçà | tr -cd '[:print:]'
$ echo éèçà | tr -cd '[:print:]' | wc
   0   0   0

all characters are deleted by tr
now I want to keep the "é" character :

$ echo éèçà | tr -cd '[:print:]é'
é���

why do the "�" characters appear ?

regards, lacsaP.



It's a known issue that tr is currently non multi-byte aware.

thanks,
Pádraig

bug#69418: test failure when no french locale is installed

2024-02-27 Thread Pádraig Brady


On 26/02/2024 23:05, Bruno Haible wrote:

Testing the current coreutils git master:

On a Debian 12 system, in which I have not installed a French UTF-8 locale,
I see a test failure of tests/misc/join-utf8.

The essential lines from test-suite.log:

+ test set = set
+ LC_ALL=none
../tests/misc/join-utf8.sh: line 24: warning: setlocale: LC_ALL: cannot change 
locale (none): No such file or directory

The cause is that on such a system, LOCALE_FR_UTF8, as determined by
gnulib/m4/locale-fr.m4, is 'none', not empty or absent.

The attached patch fixes the failure.



Applied.

This also indicates that gnulib ensures that LOCALE_FR_UTF8
is set to "none" when not present or usable,
so I've made that simplification in other tests now.

Marking this as done.

thanks!
Pádraig.

bug#69369: wc -w ignores breaking space over UCHAR_MAX

2024-02-25 Thread Pádraig Brady


On 24/02/2024 20:44, Aearil via GNU coreutils Bug Reports wrote:

Hi,

wc -w doesn't seem to recognize whitespace characters with a codepoint
over UCHAR_MAX (255) as word separators. For example, using the
character EM SPACE U+2003:

$ printf "foo\u2003bar" | ./wc -w
1

I should get a word count of 2, but instead the space is ignored while
counting words. Meanwhile, wc v9.4 gives the correct answer:

$ printf "foo\u2003bar" | wc -w
2

It looks like the regression has been introduced by [f40c6b5] and
would be fixed by something like the following change:

diff --git a/src/wc.c b/src/wc.c
index f5a921534..9d456f8c0 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -528,7 +528,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus, 
off_t current_pos)
if (width > 0)
  linepos += width;
  }
-  in_word2 = !iswnbspace (wide_char);
+  in_word2 = !iswspace (wide_char) && !iswnbspace 
(wide_char);
  }

/* Count words by counting word starts, i.e., each


Nice one.
Great to catch this before release.
I've augmented your patch with a test,
and will push the attached later.

Marking this as done.

thanks!
Pádraig
From ced8c64c986b79c0bfa74028a9581e07d5df1974 Mon Sep 17 00:00:00 2001
From: Aearil 
Date: Sat, 24 Feb 2024 21:44:24 +0100
Subject: [PATCH] wc: fix -w with breaking space over UCHAR_MAX

* src/wc.c (wc): Fix regression introduced in commit v9.4-48-gf40c6b5cf.
* tests/wc/wc-nbsh.sh: Add test cases for "standard" spaces.
Fixes https://bugs.gnu.org/69369
---
 src/wc.c| 2 +-
 tests/wc/wc-nbsp.sh | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/wc.c b/src/wc.c
index f5a921534..9d456f8c0 100644
--- a/src/wc.c
+++ b/src/wc.c
@@ -528,7 +528,7 @@ wc (int fd, char const *file_x, struct fstatus *fstatus, off_t current_pos)
   if (width > 0)
 linepos += width;
 }
-  in_word2 = !iswnbspace (wide_char);
+  in_word2 = !iswspace (wide_char) && !iswnbspace (wide_char);
 }
 
   /* Count words by counting word starts, i.e., each
diff --git a/tests/wc/wc-nbsp.sh b/tests/wc/wc-nbsp.sh
index 371cc8b5b..39a8baccc 100755
--- a/tests/wc/wc-nbsp.sh
+++ b/tests/wc/wc-nbsp.sh
@@ -38,10 +38,15 @@ fi
 
 export LC_ALL=en_US.UTF-8
 if test "$(locale charmap 2>/dev/null)" = UTF-8; then
+  #non breaking space class
   check_word_sep '\u00A0'
   check_word_sep '\u2007'
   check_word_sep '\u202F'
   check_word_sep '\u2060'
+
+  #sampling of "standard" space class
+  check_word_sep '\u0020'
+  check_word_sep '\u2003'
 fi
 
 export LC_ALL=ru_RU.KOI8-R
-- 
2.43.0

bug#69368: [PATCH] Allow --zero with --check

2024-02-25 Thread Pádraig Brady


On 24/02/2024 19:07, Ricardo Branco wrote:

The --zero option is useless if not supported by --check.

Attached patch fixes it.

https://github.com/coreutils/coreutils/pull/81


I'm not sure about this.

By adding this support we diverge the checksum file formats supported by check.
I.e. users may inadvertently create such files that are not usable
by any previous version of the checksum utilities.

Also having --check support NUL delimited files
doesn't add any additional functionality, as the checksum files
are already automatically escaped appropriately.

Also NUL delimited checksum files can be harder to manage with other text tools.

--zero was added for the case of piping to other tools
that would find the filename escaping problematic.

cheers,
Pádraig

bug#62572: Bug#1058752: bug#62572: cp --no-clobber behavior has changed

2024-02-24 Thread Pádraig Brady


On 01/02/2024 00:36, Paul Eggert wrote:

On 1/31/24 06:06, Pádraig Brady wrote:

To my mind the most protective option takes precedence.


That's not how POSIX works with mv -i and mv -f. The last flag wins. I
assume this is so that people can have aliases or shell scripts that
make -i the default, but you can override by specifying -f on the
command line. E.g., in mymv:

 #!/bin/sh
 mv -i "$@"

then "mymv -f a b" works as expected.

Wouldn't a similar argument apply to cp's --update options?

Or perhaps we should play it safe, and reject any combination of
--update etc. options that are incompatible. We can always change our
mind later and say that later options override earlier ones, or do
something else that's less conservative.


OK I err'd on the side of changing as little as possible wrt precedence.
-n still has precedence over any -u,--update option.
That's simplest to understand (while not changing existing precedence),
and shouldn't cause any practical issues.

I plan to push the 2 attached patches for this tomorrow.

thanks,
PádraigFrom 51cf6f3ff272467bc9cde75c48d0250446be9b9c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Sat, 24 Feb 2024 19:51:56 +
Subject: [PATCH 1/2] cp,mv: reinstate that -n exits with success if files
 skipped

* src/cp.c (main): Adjust so that -n will exit success if skipped files.
* src/mv.c (main): Likewise.
* doc/coreutils.texi (cp invocation): Adjust the description of -n.
* src/system.h (emit_update_parameters_note): Adjust --update=none
comparison.
* tests/cp/cp-i.sh: Adjust -n exit status checks.
* tests/mv/mv-n.sh: Likewise.
* NEWS: Mention the change in behavior.
Fixes https://bugs.gnu.org/62572
---
 NEWS   |  4 
 doc/coreutils.texi | 17 +
 src/cp.c   | 14 --
 src/mv.c   | 10 ++
 src/system.h   |  4 ++--
 tests/cp/cp-i.sh   | 11 +--
 tests/mv/mv-n.sh   | 11 +--
 7 files changed, 39 insertions(+), 32 deletions(-)

diff --git a/NEWS b/NEWS
index 36b7fd1fe..a52b4cf66 100644
--- a/NEWS
+++ b/NEWS
@@ -43,6 +43,10 @@ GNU coreutils NEWS-*- outline -*-
   basenc --base16 -d now supports lower case hexadecimal characters.
   Previously an error was given for lower case hex digits.
 
+  cp --no-clobber, and mv -n no longer exit with failure status if
+  existing files are encountered in the destination.  Instead they revert
+  to the behavior from before v9.2, silently skipping existing files.
+
   ls --dired now implies long format output without hyperlinks enabled,
   and will take precedence over previously specified formats or hyperlink mode.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index c42126955..911e15b46 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9059,12 +9059,13 @@ a regular file in the destination tree.
 @itemx --no-clobber
 @opindex -n
 @opindex --no-clobber
-Do not overwrite an existing file; silently fail instead.
-This option overrides a previous
-@option{-i} option.  This option is mutually exclusive with @option{-b} or
-@option{--backup} option.
-See also the @option{--update=none} option which will
-skip existing files but not fail.
+Do not overwrite an existing file; silently skip instead.
+This option overrides a previous @option{-i} option.
+This option is mutually exclusive with @option{-b} or @option{--backup} option.
+This option is deprecated due to having a different exit status from
+other platforms.  See also the @option{--update} option which will
+give more control over how to deal with existing files in the destination,
+and over the exit status in particular.
 
 @item -P
 @itemx --no-dereference
@@ -9335,8 +9336,8 @@ This is the default operation when an @option{--update} option is not specified,
 and results in all existing files in the destination being replaced.
 
 @item none
-This is similar to the @option{--no-clobber} option, in that no files in the
-destination are replaced, but also skipping a file does not induce a failure.
+This is like the deprecated @option{--no-clobber} option, where no files in the
+destination are replaced, and also skipping a file does not induce a failure.
 
 @item older
 This is the default operation when @option{--update} is specified, and results
diff --git a/src/cp.c b/src/cp.c
index 0355ed97f..36ae4fb66 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -195,8 +195,8 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
   -L, --dereferencealways follow symbolic links in SOURCE\n\
 "), stdout);
   fputs (_("\
-  -n, --no-clobber ensure no existing files overwritten, and fail\n\
- silently instead. See also --update\n\
+  -n, --no-clobber silently skip existing files.\n\
+ See also --update\n\
 "), stdout);
   fputs (_("\
   -P, --no-de

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 3393 matches

Mail list logo