bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-20 Thread Paul Eggert

On 3/20/24 15:53, Bernhard Voelker wrote:


   $ echo 1 > a
   $ mkdir d
   $ echo 2 > d/a
   $ src/mv -v --exchange a a a d
   renamed 'a' -> 'd/a'
   renamed 'a' -> 'd/a'
   renamed 'a' -> 'd/a'
   $ cat a
   2
   $ src/mv -v --exchange a a a d
   renamed 'a' -> 'd/a'
   renamed 'a' -> 'd/a'
   renamed 'a' -> 'd/a'
   $ cat a
   1
   $ src/mv -v --exchange a a a a d
   renamed 'a' -> 'd/a'
   renamed 'a' -> 'd/a'
   renamed 'a' -> 'd/a'
   renamed 'a' -> 'd/a'
   $ cat a
   1


Yes, that's the expected behavior for this contrived case. Just as one 
would get odd behavior if one did the same thing without --exchange.




I remember some implementation where mv(1) really was just a rename(2),
which failed when crossing file systems.  Was it some HP-UX or Solaris mv(1)?


I doubt it. Even 7th Edition 'mv' (1979) fell back on 'cp' when the link 
syscall failed (this was before 'rename' existed).




My point is that "exchange" is a different functionality.


Yes, but it's closely related. Arguably --backup is also a different 
functionality too (and arguably --exchange is simply an alternative 
backup scheme!) but 'mv' has --backup.




- How large is the useful overlap with the existing code of mv(1)?
   Not much: no traditional rename nor copy.


I don't follow this point. The code change was fairly small, which 
indicates there was a lot of overlap with existing functionality.




- How large is the useful overlap with the existing options/modes of mv(1)?
   - exchange contradicts --backup,


That could be fixed for regular files, if there's a need, by backing up 
the destination via 'link' before exchanging. For directories it's 
admittedly a problem, but that's also the case for plain 'mv' (or for 
'cp' or 'ln', for that matter) so there's not much new here.




   - exchange is not useful together with options working with a regular
     rename of copy, at least: --update, -Z, -n.


It should work with --update and -Z. -n of course is logically 
incompatible, but this not the only set of logically incompatible 
options (e.g., -t vs -T).




   - not sure if exchange works well together with -f.


What problems do you see there?



why does exchange not work to exchange a regular with a 
directory file?


It works. I don't see a problem there.

  $ touch a
  $ mkdir d
  $ ./mv -T --exchange a d
  $ ls -ld a d
  drwxr-xr-x. 2 eggert eggert 4096 Mar 20 16:52 a
  -rw-r--r--. 1 eggert eggert0 Mar 20 16:52 d



Finally, the test cases are very sparse:


Feel free to add some. :-)





bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-20 Thread Rob Landley
On 3/20/24 14:43, Bernhard Voelker wrote:
> On 3/17/24 07:10, Paul Eggert wrote:
> Now, extending "exchange" to more arguments is confusing and the
> use is not intuitive:
>mv -v --exchange  a b c d

It's also pointless. An atomic exchange on more than 2 files ISN'T ATOMIC.
That's why I didn't do it.

You already had "mv -T" requiring exactly two arguments, so thinking mv -x has
cooties because it works the same way is just weird.

> I have the gut feeling that we didn't think through all cases,

Sounds like. Having mv modify its source directory during recursive descent is
creepy.

Toybox implemented:

-x  Atomically exchange source/dest (--swap)

Which behaves like:

  $ ./mv -x one two
  $ ./mv -x one two three
  mv: -x needs 2 args

My change from this discussion was adding the "--swap" synonym you wanted.

Rob





Re: env: follow up on argv0 setting feature

2024-03-20 Thread Pádraig Brady

On 13/03/2024 10:19, Matheus Afonso Martins Moreira wrote:

About a year ago, I posted an env feature request on this list:
the ability to set the value of argv[0].

https://lists.gnu.org/archive/html/coreutils/2023-03/msg2.html

I also sent a patch:

https://lists.gnu.org/archive/html/coreutils/2023-03/msg3.html

After some discussion, I was informed it was under consideration:

https://lists.gnu.org/archive/html/coreutils/2023-03/msg00012.html


We're still considering and will adjust as needed.


I waited a while and eventually sent an email about it:

https://lists.gnu.org/archive/html/coreutils/2023-08/msg00059.html

The maintainer noted that they expected this feature to be included
in the next release which would be focused on features:

https://lists.gnu.org/archive/html/coreutils/2023-08/msg00060.html


The next release will focus on new features, and this
will be considered. I expect this feature will be included.


Since then I've been tracking commits to the coreutils master branch
but it appears the feature has not landed yet.

What happened? Was it rejected?



Thanks for your patience.
I've attached an implementation for --argv0
which I intend to apply for the impending release.
Note this can accept empty and NULL values,
and so now gives env full control of the args it passes on.

thanks,
Pádraig.From a2f4ef43c67450d13a12bc63c72f16c690670dfa Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= 
Date: Thu, 2 Mar 2023 11:56:18 -0300
Subject: [PATCH] env: add --argv0 to set the first argument passed to exec

Using the shell's exec -a feature can be awkward
so add support for setting argv0 to arbitrary values.
This gives env full control over the arguments it passes.

* src/env.c: Accept --argv0 and set argv[0] appropriately.
* tests/env/env.sh: Add test cases.
* doc/coreutils.texi (env invocation): Describe --argv0.
* NEWS: Mention the new feature.
---
 NEWS   |  3 +++
 doc/coreutils.texi |  8 
 src/env.c  | 31 +++
 tests/env/env.sh   | 26 ++
 4 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index b3004273b..839582c16 100644
--- a/NEWS
+++ b/NEWS
@@ -92,6 +92,9 @@ GNU coreutils NEWS-*- outline -*-
   and the command exits with failure status if existing files.
   The -n,--no-clobber option is best avoided due to platform differences.
 
+  env now accepts the --argv0 option to override the zeroth argument
+  of the command being executed.
+
   od now supports printing IEEE half precision floating point with -t fH,
   or brain 16 bit floating point with -t fB, where supported by the compiler.
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index ec15a467f..766925623 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -17927,6 +17927,14 @@ Options must precede operands.
 
 @optNull
 
+@item --argv0[=@var{arg}]
+@opindex --argv0
+By default @var{command} is used as the zeroth argument passed to
+the command being executed, which can be overridden with this option.
+@var{arg} is optional, and set to the NULL is not specified.
+For example, @samp{--argv0=} would pass the empty string,
+while @samp{--argv0} would pass a NULL pointer.
+
 @item -u @var{name}
 @itemx --unset=@var{name}
 @opindex -u
diff --git a/src/env.c b/src/env.c
index ed6628f8f..dc08b2dba 100644
--- a/src/env.c
+++ b/src/env.c
@@ -79,7 +79,8 @@ static char const shortopts[] = "+C:iS:u:v0" C_ISSPACE_CHARS;
non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
 enum
 {
-  DEFAULT_SIGNAL_OPTION = CHAR_MAX + 1,
+  ARGV0_OPTION = CHAR_MAX + 1,
+  DEFAULT_SIGNAL_OPTION,
   IGNORE_SIGNAL_OPTION,
   BLOCK_SIGNAL_OPTION,
   LIST_SIGNAL_HANDLING_OPTION,
@@ -87,6 +88,7 @@ enum
 
 static struct option const longopts[] =
 {
+  {"argv0", optional_argument, nullptr, ARGV0_OPTION},
   {"ignore-environment", no_argument, nullptr, 'i'},
   {"null", no_argument, nullptr, '0'},
   {"unset", required_argument, nullptr, 'u'},
@@ -119,6 +121,9 @@ Set each NAME to VALUE in the environment and run COMMAND.\n\
   emit_mandatory_arg_note ();
 
   fputs (_("\
+  --argv0[=ARG]pass ARG as the zeroth argument of COMMAND\n\
+"), stdout);
+  fputs (_("\
   -i, --ignore-environment  start with an empty environment\n\
   -0, --null   end each output line with NUL, not newline\n\
   -u, --unset=NAME remove variable from the environment\n\
@@ -759,6 +764,7 @@ main (int argc, char **argv)
   bool ignore_environment = false;
   bool opt_nul_terminate_output = false;
   char const *newdir = nullptr;
+  char *argv0 = *argv;  /* Initialize to a distinct but unused value.  */
 
   initialize_main (, );
   set_program_name (argv[0]);
@@ -775,6 +781,9 @@ main (int argc, char **argv)
 {
   switch (optc)
 {
+case ARGV0_OPTION:
+  argv0 = optarg;
+  break;
 case 'i':
   ignore_environment = true;

bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-20 Thread Bernhard Voelker

On 3/20/24 21:56, Paul Eggert wrote:

On 3/20/24 12:43, Bernhard Voelker wrote:


This stems from the fact that although mv(1) is a userland frontend
for renameat(2), the user interface is different:
while renameat(2) deals exactly with 2 operands, mv(1) has always
been able to work on more arguments.


Yes, that's mv's original sin, which we cannot realistically change now.


I wouldn't go that far that it was a sin.  It's useful and people got
used to it without having to think about it.


I have the gut feeling that we didn't think through all cases,
and that some might be surprising, e.g.:

    $ mkdir d; echo 1 > a; echo 2 > d/a
    $ src/mv --exchange a a a a d/a

versus

    $ src/mv --exchange a a a a d/a


I don't understand the word "versus" here, as the two examples look the
same to me.


sorry, I messed the example up.

  $ echo 1 > a
  $ mkdir d
  $ echo 2 > d/a
  $ src/mv -v --exchange a a a d
  renamed 'a' -> 'd/a'
  renamed 'a' -> 'd/a'
  renamed 'a' -> 'd/a'
  $ cat a
  2
  $ src/mv -v --exchange a a a d
  renamed 'a' -> 'd/a'
  renamed 'a' -> 'd/a'
  renamed 'a' -> 'd/a'
  $ cat a
  1
  $ src/mv -v --exchange a a a a d
  renamed 'a' -> 'd/a'
  renamed 'a' -> 'd/a'
  renamed 'a' -> 'd/a'
  renamed 'a' -> 'd/a'
  $ cat a
  1

I remember some implementation where mv(1) really was just a rename(2),
which failed when crossing file systems.  Was it some HP-UX or Solaris mv(1)?
mv(1) has learned to copy+delete over time, which is what people would
expect from a "move".

My point is that "exchange" is a different functionality.
It's well somehow belonging and related to what renameat(2) is doing in the 
kernel,
and therefore it comes in handy that we can simply call it with an additional 
flag.
Yet it's IMO a different operation.  I bet there had been discussions whether
to create a new syscall, but apparently it was easier to put it with a flag
into an existing one.  Fine for the kernel.

On userland OTOH, we have broader choice.
Karel did his choice in util-linux for exch(1), and coreutils could expose
the same functionality.

For other feature requests, we were much more reluctant in coreutils ... for
good reasons: feature bloat, maintainability, etc.

So I'm asking myself what is different this time?
- The feature already exists -> util-linux.
- Okay, we're using the same syscall, renameat(2) -> it's tempting.
- How large is the useful overlap with the existing code of mv(1)?
  Not much: no traditional rename nor copy.
- How large is the useful overlap with the existing options/modes of mv(1)?
  - exchange contradicts --backup,
  - exchange is not useful together with options working with a regular
rename of copy, at least: --update, -Z, -n.
  - not sure if exchange works well together with -f.

I'm currently only 20:80 for adding it to mv(1).
The functionality is cool, but do we need to press it into mv(1) with so many
incompatibilities just because it's requiring renameat(2) we already use?
Maybe to consider: One tool for one thing ... means another tool for another 
thing.

Again, I have the gut feeling that we've missed some cases to think about.
And once the feature would be in ...

Furthermore, why does exchange not work to exchange a regular with a directory 
file?
We've all learned that everything's a file, so it cannot be explained to users 
that
exchanging a regular file with a directory doesn't work.

Finally, the test cases are very sparse: no cases with different owners, 
different
directory permissions, different file types (if we know already f<->d doesn't 
work),
triggering races, etc.

I don't really want to object to add it, but I find it quite odd as of today.

Have a nice day,
Berny





bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-20 Thread Paul Eggert

On 3/17/24 04:32, Pádraig Brady wrote:

I think the --no-copy situation is brittle, as scripts not using it now
would be atomic, but then if we ever supported cross fs swaps
it may become non atomic. I'd at least doc with a line in the --exchange
description in usage() to say something like:
   "Use --no-copy to enforce atomic operation"


But --no-copy doesn't mean atomic operation; it simply means don't copy.

On systems that don't have an atomic exchange, we can emulate "mv 
--exchange --no-copy a b" with three calls link("b", "b.tmp"); 
rename("a","b"); rename("b.tmp","a"). This wouldn't copy, but it 
wouldn't be atomic.


Although atomicity is important, currently the coreutils documentation 
doesn't cover it and the code doesn't always handle it well. For 
example, if A and B are regular files "mv -b A B" briefly has a moment 
when B stops existing. To my mind this is a bug: an existing destination 
shouldn't stop existing merely because you're replacing it. At some 
point this stuff should be documented better and this (and probably some 
other) atomicity bugs fixed.


One thought I had while looking into this was that we could add an 
--atomic option to mv and ln, such that the command fails if the 
destination cannot be updated atomically. That would be a stronger 
option than --no-copy. (In some cases we could support --atomic even for 
'cp', no?)


Anyway, for now I installed the patch with some minor changes to the 
documentation's --exchange section to try to document this tricky area 
more clearly. Here's the revised doc section. It also incorporates your 
later suggestion to mention both data and metadata.




@item --exchange
@opindex --exchange
Exchange source and destination instead of renaming source to destination.
Both files must exist; they need not be the same type.
This exchanges all data and metadata.

This option can be used to replace one directory with another.
When used this way, it should be combined with
@code{--no-target-directory} (@option{-T})
to avoid confusion about the destination location.
For example, you might use @samp{mv -T --exchange @var{d1} @var{d2}}
to exchange two directories @var{d1} and @var{d2}.

Exchanges are atomic if the source and destination are both in a
single file system that supports atomic exchange.
Non-atomic exchanges are not yet supported.

If the source and destination might not be on the same file system,
using @code{--no-copy} will prevent future versions of @command{mv}
from implementing the exchange by copying.






Re: Symlink flag for chmod

2024-03-20 Thread Kaz Kylheku
On 2024-03-19 16:53, Pádraig Brady wrote:
> but would appreciate a quick review of the fts_level usage in the second 
> patch.
> 
> thanks,
> Pádraig

In utilities that perform tree walks, there may be reasons why
not following links is a useful option, but it can't be relied on
for security.

Suppose Alice wants to traverse path /a/b/c where malicious Mallory is in
control of component b.

If Alice checks whether b is a symlink and refuses to follow it, she
suffers from a TOCtoTOU race. The b component can be a directory at the time
Alice checks, but just before Alice subsequently traverses it, Mallory
swaps it for a symlink.

(Sure, not following symlinks will defend against a passive attack,
whereby a user just plants a malicious symlink in a directory they
control and then waits.)

To do this correctly, we need a path-resolution function that carefully
avoids following any insecure/untrusted path component. (This could be
built into a secure version of nftw, as an optional behavior; nftw
itself should have a flag to not follow unsafe links, implemented with
the utmost security.) For this, open openat() with O_NOFOLLOW | O_PATH
can be used to go from component to component.

Nearly two years ago, I experimented with a solution that doesn't use
openat, but only classic POSIX calls, which I call safepath:

https://www.kylheku.com/cgit/safepath/

The basic idea is that we can traverse the path left to right, and
perform a security check at each step: "does a user other than me
control this directory, such that the next path component could go
anywhere at any time?" If the answer is no, the resolution can proceed
to the next component. Under this view, symlinks can be safe. E.g.
root can follow a symlink that only root can modify.




bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-20 Thread Paul Eggert

On 3/20/24 12:43, Bernhard Voelker wrote:


This stems from the fact that although mv(1) is a userland frontend
for renameat(2), the user interface is different:
while renameat(2) deals exactly with 2 operands, mv(1) has always
been able to work on more arguments.


Yes, that's mv's original sin, which we cannot realistically change now.



Now, extending "exchange" to more arguments is confusing and the
use is not intuitive:
   mv -v --exchange  a b c d

An "exchange" can literally only be applied to 2 files,


Sure, but that's true for "rename" too: a "rename" can be applied only 
to 2 files.


When d is a directory, "mv a b c d" does three renames so it is like "mv 
a d/a; mv b d/b; mv c d/c". This remains true if you uniformly replace 
"mv" with "mv --exchange", which does three exchanges.




I have the gut feeling that we didn't think through all cases,
and that some might be surprising, e.g.:

   $ mkdir d; echo 1 > a; echo 2 > d/a
   $ src/mv --exchange a a a a d/a

versus

   $ src/mv --exchange a a a a d/a


I don't understand the word "versus" here, as the two examples look the 
same to me.


If d/a is not a directory, the example is an error, just as it would be 
without --exchange.


If d/a is a directory and you have permissions etc., "mv a a a a d/a" is 
like attempting "mv -T a d/a/a; mv -T a d/a/a; mv -T a d/a/a; mv -T a 
d/a/a". If you use plain "mv" only the first "mv -T a d/a/a" succeeds 
because "a" goes away, so you get three diagnostics for the remaining 
three "a"s. If you use "mv --exchange" all four "mv --exchange -T a 
d/a/a" attempts succeed, and since there are an even number of exchanges 
the end result is a no-op except for updated directory timestamps. So I 
don't see any ambiguity about what mv should do with this contrived example.







bug#69532: mv's new -x option should be made orthogonal to -t/-T/default

2024-03-20 Thread Bernhard Voelker

On 3/17/24 07:10, Paul Eggert wrote:

Although removing that "mv --swap" implementation was a win, I don't
think we can simply delegate this to util-linux's exch command.


I still have some headache adding this.

This stems from the fact that although mv(1) is a userland frontend
for renameat(2), the user interface is different:
while renameat(2) deals exactly with 2 operands, mv(1) has always
been able to work on more arguments.

Now, extending "exchange" to more arguments is confusing and the
use is not intuitive:
  mv -v --exchange  a b c d

An "exchange" can literally only be applied to 2 files,
and 'exch' is IMO fine.

I have the gut feeling that we didn't think through all cases,
and that some might be surprising, e.g.:

  $ mkdir d; echo 1 > a; echo 2 > d/a
  $ src/mv --exchange a a a a d/a

versus

  $ src/mv --exchange a a a a d/a

Have a nice day,
Berny






bug#10311: RFE: Give chmod a "-h" option as well

2024-03-20 Thread Pádraig Brady

On 16/12/2011 16:29, Jan Engelhardt wrote:

Hi,

chown(1) has a -h option by which it affects symlinks directly rather
than the pointed-to file. The bonus side effect is that the
pointed-to files don't get changed in any way, which is kinda welcome
if you attempt to "fix" permissions/ownership in a directory where an
evil user could create a symlink to e.g. /etc/shadow.

Attempting chmod -R g+w /home/groups/evilgroup is still a risk, and
would necessity a more long-winded command involving find(1). It
would therefore be welcome that chmod receive an -h option that just
skips over them (besides perhaps attempting to change their
permissions as well).


Pushed at
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v9.4-162-g07a69fc3b

Marking as done.

cheers,
Pádraig.





bug#11108: [PATCH] chmod: fix symlink race condition

2024-03-20 Thread Pádraig Brady

On 28/03/2012 21:28, Paul Eggert wrote:

On 03/28/2012 01:13 PM, Jim Meyering wrote:

 $ ./chmod u+w f
 ./chmod: changing permissions of 'f': Operation not supported


Yeouch.  I undid the change for now.
Hmm, why did "make check" work for me?
I'll have to investigate later, alas.


Patch for this pushed at:
https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=v9.4-163-g425b8a2f5

Marking this as done.

cheers,
Pádraig.





Re: RFE: enable buffering on null-terminated data

2024-03-20 Thread Carl Edquist via GNU coreutils General Discussion



On Tue, 19 Mar 2024, Zachary Santer wrote:


On Tue, Mar 19, 2024 at 1:24 AM Kaz Kylheku  wrote:


But what tee does is set up _IONBF on its output streams,
including stdout.


So it doesn't buffer at all. Awesome. Nevermind.


Yay!  :D

And since tee uses fwrite to copy whatever input is available, that will 
mean 'records' are output on the same boundaries as the input (whether 
that be newlines, nuls, or just block boundaries).  So putting tee in the 
middle of a pipeline shouldn't itself interfere with whatever else you're 
up to.  (AND it's still relatively efficient, compared to some tools like 
cut that putchar a byte at a time.)


My note about pipelines like this though:

$ ./build.sh | sed s/what/ever/ | tee build.log

is that with the default stdio buffering, while all the commands in 
build.sh will be implicitly self-flushing, the sed in the middle will end 
up batching its output into blocks, so tee will also repeat them in 
blocks.


However, if stdbuf's magic env vars are exported in your shell (either by 
doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply 
by first starting a new shell with 'stdbuf -oL bash'), then every command 
in your pipelines will start with the new default line-buffered stdout. 
That way your line-items from build.sh should get passed all the way 
through the pipeline as they are produced.



(But, proof's in the pudding, so whatever works for you :D )


Happy putting all the way!

Carl