bug#69532: mv's new -x option should be made orthogonal to -t/-T/default
On 3/20/24 15:53, Bernhard Voelker wrote: $ echo 1 > a $ mkdir d $ echo 2 > d/a $ src/mv -v --exchange a a a d renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' $ cat a 2 $ src/mv -v --exchange a a a d renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' $ cat a 1 $ src/mv -v --exchange a a a a d renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' $ cat a 1 Yes, that's the expected behavior for this contrived case. Just as one would get odd behavior if one did the same thing without --exchange. I remember some implementation where mv(1) really was just a rename(2), which failed when crossing file systems. Was it some HP-UX or Solaris mv(1)? I doubt it. Even 7th Edition 'mv' (1979) fell back on 'cp' when the link syscall failed (this was before 'rename' existed). My point is that "exchange" is a different functionality. Yes, but it's closely related. Arguably --backup is also a different functionality too (and arguably --exchange is simply an alternative backup scheme!) but 'mv' has --backup. - How large is the useful overlap with the existing code of mv(1)? Not much: no traditional rename nor copy. I don't follow this point. The code change was fairly small, which indicates there was a lot of overlap with existing functionality. - How large is the useful overlap with the existing options/modes of mv(1)? - exchange contradicts --backup, That could be fixed for regular files, if there's a need, by backing up the destination via 'link' before exchanging. For directories it's admittedly a problem, but that's also the case for plain 'mv' (or for 'cp' or 'ln', for that matter) so there's not much new here. - exchange is not useful together with options working with a regular rename of copy, at least: --update, -Z, -n. It should work with --update and -Z. -n of course is logically incompatible, but this not the only set of logically incompatible options (e.g., -t vs -T). - not sure if exchange works well together with -f. What problems do you see there? why does exchange not work to exchange a regular with a directory file? It works. I don't see a problem there. $ touch a $ mkdir d $ ./mv -T --exchange a d $ ls -ld a d drwxr-xr-x. 2 eggert eggert 4096 Mar 20 16:52 a -rw-r--r--. 1 eggert eggert0 Mar 20 16:52 d Finally, the test cases are very sparse: Feel free to add some. :-)
bug#69532: mv's new -x option should be made orthogonal to -t/-T/default
On 3/20/24 14:43, Bernhard Voelker wrote: > On 3/17/24 07:10, Paul Eggert wrote: > Now, extending "exchange" to more arguments is confusing and the > use is not intuitive: >mv -v --exchange a b c d It's also pointless. An atomic exchange on more than 2 files ISN'T ATOMIC. That's why I didn't do it. You already had "mv -T" requiring exactly two arguments, so thinking mv -x has cooties because it works the same way is just weird. > I have the gut feeling that we didn't think through all cases, Sounds like. Having mv modify its source directory during recursive descent is creepy. Toybox implemented: -x Atomically exchange source/dest (--swap) Which behaves like: $ ./mv -x one two $ ./mv -x one two three mv: -x needs 2 args My change from this discussion was adding the "--swap" synonym you wanted. Rob
Re: env: follow up on argv0 setting feature
On 13/03/2024 10:19, Matheus Afonso Martins Moreira wrote: About a year ago, I posted an env feature request on this list: the ability to set the value of argv[0]. https://lists.gnu.org/archive/html/coreutils/2023-03/msg2.html I also sent a patch: https://lists.gnu.org/archive/html/coreutils/2023-03/msg3.html After some discussion, I was informed it was under consideration: https://lists.gnu.org/archive/html/coreutils/2023-03/msg00012.html We're still considering and will adjust as needed. I waited a while and eventually sent an email about it: https://lists.gnu.org/archive/html/coreutils/2023-08/msg00059.html The maintainer noted that they expected this feature to be included in the next release which would be focused on features: https://lists.gnu.org/archive/html/coreutils/2023-08/msg00060.html The next release will focus on new features, and this will be considered. I expect this feature will be included. Since then I've been tracking commits to the coreutils master branch but it appears the feature has not landed yet. What happened? Was it rejected? Thanks for your patience. I've attached an implementation for --argv0 which I intend to apply for the impending release. Note this can accept empty and NULL values, and so now gives env full control of the args it passes on. thanks, Pádraig.From a2f4ef43c67450d13a12bc63c72f16c690670dfa Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Thu, 2 Mar 2023 11:56:18 -0300 Subject: [PATCH] env: add --argv0 to set the first argument passed to exec Using the shell's exec -a feature can be awkward so add support for setting argv0 to arbitrary values. This gives env full control over the arguments it passes. * src/env.c: Accept --argv0 and set argv[0] appropriately. * tests/env/env.sh: Add test cases. * doc/coreutils.texi (env invocation): Describe --argv0. * NEWS: Mention the new feature. --- NEWS | 3 +++ doc/coreutils.texi | 8 src/env.c | 31 +++ tests/env/env.sh | 26 ++ 4 files changed, 64 insertions(+), 4 deletions(-) diff --git a/NEWS b/NEWS index b3004273b..839582c16 100644 --- a/NEWS +++ b/NEWS @@ -92,6 +92,9 @@ GNU coreutils NEWS-*- outline -*- and the command exits with failure status if existing files. The -n,--no-clobber option is best avoided due to platform differences. + env now accepts the --argv0 option to override the zeroth argument + of the command being executed. + od now supports printing IEEE half precision floating point with -t fH, or brain 16 bit floating point with -t fB, where supported by the compiler. diff --git a/doc/coreutils.texi b/doc/coreutils.texi index ec15a467f..766925623 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -17927,6 +17927,14 @@ Options must precede operands. @optNull +@item --argv0[=@var{arg}] +@opindex --argv0 +By default @var{command} is used as the zeroth argument passed to +the command being executed, which can be overridden with this option. +@var{arg} is optional, and set to the NULL is not specified. +For example, @samp{--argv0=} would pass the empty string, +while @samp{--argv0} would pass a NULL pointer. + @item -u @var{name} @itemx --unset=@var{name} @opindex -u diff --git a/src/env.c b/src/env.c index ed6628f8f..dc08b2dba 100644 --- a/src/env.c +++ b/src/env.c @@ -79,7 +79,8 @@ static char const shortopts[] = "+C:iS:u:v0" C_ISSPACE_CHARS; non-character as a pseudo short option, starting with CHAR_MAX + 1. */ enum { - DEFAULT_SIGNAL_OPTION = CHAR_MAX + 1, + ARGV0_OPTION = CHAR_MAX + 1, + DEFAULT_SIGNAL_OPTION, IGNORE_SIGNAL_OPTION, BLOCK_SIGNAL_OPTION, LIST_SIGNAL_HANDLING_OPTION, @@ -87,6 +88,7 @@ enum static struct option const longopts[] = { + {"argv0", optional_argument, nullptr, ARGV0_OPTION}, {"ignore-environment", no_argument, nullptr, 'i'}, {"null", no_argument, nullptr, '0'}, {"unset", required_argument, nullptr, 'u'}, @@ -119,6 +121,9 @@ Set each NAME to VALUE in the environment and run COMMAND.\n\ emit_mandatory_arg_note (); fputs (_("\ + --argv0[=ARG]pass ARG as the zeroth argument of COMMAND\n\ +"), stdout); + fputs (_("\ -i, --ignore-environment start with an empty environment\n\ -0, --null end each output line with NUL, not newline\n\ -u, --unset=NAME remove variable from the environment\n\ @@ -759,6 +764,7 @@ main (int argc, char **argv) bool ignore_environment = false; bool opt_nul_terminate_output = false; char const *newdir = nullptr; + char *argv0 = *argv; /* Initialize to a distinct but unused value. */ initialize_main (, ); set_program_name (argv[0]); @@ -775,6 +781,9 @@ main (int argc, char **argv) { switch (optc) { +case ARGV0_OPTION: + argv0 = optarg; + break; case 'i': ignore_environment = true;
bug#69532: mv's new -x option should be made orthogonal to -t/-T/default
On 3/20/24 21:56, Paul Eggert wrote: On 3/20/24 12:43, Bernhard Voelker wrote: This stems from the fact that although mv(1) is a userland frontend for renameat(2), the user interface is different: while renameat(2) deals exactly with 2 operands, mv(1) has always been able to work on more arguments. Yes, that's mv's original sin, which we cannot realistically change now. I wouldn't go that far that it was a sin. It's useful and people got used to it without having to think about it. I have the gut feeling that we didn't think through all cases, and that some might be surprising, e.g.: $ mkdir d; echo 1 > a; echo 2 > d/a $ src/mv --exchange a a a a d/a versus $ src/mv --exchange a a a a d/a I don't understand the word "versus" here, as the two examples look the same to me. sorry, I messed the example up. $ echo 1 > a $ mkdir d $ echo 2 > d/a $ src/mv -v --exchange a a a d renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' $ cat a 2 $ src/mv -v --exchange a a a d renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' $ cat a 1 $ src/mv -v --exchange a a a a d renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' renamed 'a' -> 'd/a' $ cat a 1 I remember some implementation where mv(1) really was just a rename(2), which failed when crossing file systems. Was it some HP-UX or Solaris mv(1)? mv(1) has learned to copy+delete over time, which is what people would expect from a "move". My point is that "exchange" is a different functionality. It's well somehow belonging and related to what renameat(2) is doing in the kernel, and therefore it comes in handy that we can simply call it with an additional flag. Yet it's IMO a different operation. I bet there had been discussions whether to create a new syscall, but apparently it was easier to put it with a flag into an existing one. Fine for the kernel. On userland OTOH, we have broader choice. Karel did his choice in util-linux for exch(1), and coreutils could expose the same functionality. For other feature requests, we were much more reluctant in coreutils ... for good reasons: feature bloat, maintainability, etc. So I'm asking myself what is different this time? - The feature already exists -> util-linux. - Okay, we're using the same syscall, renameat(2) -> it's tempting. - How large is the useful overlap with the existing code of mv(1)? Not much: no traditional rename nor copy. - How large is the useful overlap with the existing options/modes of mv(1)? - exchange contradicts --backup, - exchange is not useful together with options working with a regular rename of copy, at least: --update, -Z, -n. - not sure if exchange works well together with -f. I'm currently only 20:80 for adding it to mv(1). The functionality is cool, but do we need to press it into mv(1) with so many incompatibilities just because it's requiring renameat(2) we already use? Maybe to consider: One tool for one thing ... means another tool for another thing. Again, I have the gut feeling that we've missed some cases to think about. And once the feature would be in ... Furthermore, why does exchange not work to exchange a regular with a directory file? We've all learned that everything's a file, so it cannot be explained to users that exchanging a regular file with a directory doesn't work. Finally, the test cases are very sparse: no cases with different owners, different directory permissions, different file types (if we know already f<->d doesn't work), triggering races, etc. I don't really want to object to add it, but I find it quite odd as of today. Have a nice day, Berny
bug#69532: mv's new -x option should be made orthogonal to -t/-T/default
On 3/17/24 04:32, Pádraig Brady wrote: I think the --no-copy situation is brittle, as scripts not using it now would be atomic, but then if we ever supported cross fs swaps it may become non atomic. I'd at least doc with a line in the --exchange description in usage() to say something like: "Use --no-copy to enforce atomic operation" But --no-copy doesn't mean atomic operation; it simply means don't copy. On systems that don't have an atomic exchange, we can emulate "mv --exchange --no-copy a b" with three calls link("b", "b.tmp"); rename("a","b"); rename("b.tmp","a"). This wouldn't copy, but it wouldn't be atomic. Although atomicity is important, currently the coreutils documentation doesn't cover it and the code doesn't always handle it well. For example, if A and B are regular files "mv -b A B" briefly has a moment when B stops existing. To my mind this is a bug: an existing destination shouldn't stop existing merely because you're replacing it. At some point this stuff should be documented better and this (and probably some other) atomicity bugs fixed. One thought I had while looking into this was that we could add an --atomic option to mv and ln, such that the command fails if the destination cannot be updated atomically. That would be a stronger option than --no-copy. (In some cases we could support --atomic even for 'cp', no?) Anyway, for now I installed the patch with some minor changes to the documentation's --exchange section to try to document this tricky area more clearly. Here's the revised doc section. It also incorporates your later suggestion to mention both data and metadata. @item --exchange @opindex --exchange Exchange source and destination instead of renaming source to destination. Both files must exist; they need not be the same type. This exchanges all data and metadata. This option can be used to replace one directory with another. When used this way, it should be combined with @code{--no-target-directory} (@option{-T}) to avoid confusion about the destination location. For example, you might use @samp{mv -T --exchange @var{d1} @var{d2}} to exchange two directories @var{d1} and @var{d2}. Exchanges are atomic if the source and destination are both in a single file system that supports atomic exchange. Non-atomic exchanges are not yet supported. If the source and destination might not be on the same file system, using @code{--no-copy} will prevent future versions of @command{mv} from implementing the exchange by copying.
Re: Symlink flag for chmod
On 2024-03-19 16:53, Pádraig Brady wrote: > but would appreciate a quick review of the fts_level usage in the second > patch. > > thanks, > Pádraig In utilities that perform tree walks, there may be reasons why not following links is a useful option, but it can't be relied on for security. Suppose Alice wants to traverse path /a/b/c where malicious Mallory is in control of component b. If Alice checks whether b is a symlink and refuses to follow it, she suffers from a TOCtoTOU race. The b component can be a directory at the time Alice checks, but just before Alice subsequently traverses it, Mallory swaps it for a symlink. (Sure, not following symlinks will defend against a passive attack, whereby a user just plants a malicious symlink in a directory they control and then waits.) To do this correctly, we need a path-resolution function that carefully avoids following any insecure/untrusted path component. (This could be built into a secure version of nftw, as an optional behavior; nftw itself should have a flag to not follow unsafe links, implemented with the utmost security.) For this, open openat() with O_NOFOLLOW | O_PATH can be used to go from component to component. Nearly two years ago, I experimented with a solution that doesn't use openat, but only classic POSIX calls, which I call safepath: https://www.kylheku.com/cgit/safepath/ The basic idea is that we can traverse the path left to right, and perform a security check at each step: "does a user other than me control this directory, such that the next path component could go anywhere at any time?" If the answer is no, the resolution can proceed to the next component. Under this view, symlinks can be safe. E.g. root can follow a symlink that only root can modify.
bug#69532: mv's new -x option should be made orthogonal to -t/-T/default
On 3/20/24 12:43, Bernhard Voelker wrote: This stems from the fact that although mv(1) is a userland frontend for renameat(2), the user interface is different: while renameat(2) deals exactly with 2 operands, mv(1) has always been able to work on more arguments. Yes, that's mv's original sin, which we cannot realistically change now. Now, extending "exchange" to more arguments is confusing and the use is not intuitive: mv -v --exchange a b c d An "exchange" can literally only be applied to 2 files, Sure, but that's true for "rename" too: a "rename" can be applied only to 2 files. When d is a directory, "mv a b c d" does three renames so it is like "mv a d/a; mv b d/b; mv c d/c". This remains true if you uniformly replace "mv" with "mv --exchange", which does three exchanges. I have the gut feeling that we didn't think through all cases, and that some might be surprising, e.g.: $ mkdir d; echo 1 > a; echo 2 > d/a $ src/mv --exchange a a a a d/a versus $ src/mv --exchange a a a a d/a I don't understand the word "versus" here, as the two examples look the same to me. If d/a is not a directory, the example is an error, just as it would be without --exchange. If d/a is a directory and you have permissions etc., "mv a a a a d/a" is like attempting "mv -T a d/a/a; mv -T a d/a/a; mv -T a d/a/a; mv -T a d/a/a". If you use plain "mv" only the first "mv -T a d/a/a" succeeds because "a" goes away, so you get three diagnostics for the remaining three "a"s. If you use "mv --exchange" all four "mv --exchange -T a d/a/a" attempts succeed, and since there are an even number of exchanges the end result is a no-op except for updated directory timestamps. So I don't see any ambiguity about what mv should do with this contrived example.
bug#69532: mv's new -x option should be made orthogonal to -t/-T/default
On 3/17/24 07:10, Paul Eggert wrote: Although removing that "mv --swap" implementation was a win, I don't think we can simply delegate this to util-linux's exch command. I still have some headache adding this. This stems from the fact that although mv(1) is a userland frontend for renameat(2), the user interface is different: while renameat(2) deals exactly with 2 operands, mv(1) has always been able to work on more arguments. Now, extending "exchange" to more arguments is confusing and the use is not intuitive: mv -v --exchange a b c d An "exchange" can literally only be applied to 2 files, and 'exch' is IMO fine. I have the gut feeling that we didn't think through all cases, and that some might be surprising, e.g.: $ mkdir d; echo 1 > a; echo 2 > d/a $ src/mv --exchange a a a a d/a versus $ src/mv --exchange a a a a d/a Have a nice day, Berny
bug#10311: RFE: Give chmod a "-h" option as well
On 16/12/2011 16:29, Jan Engelhardt wrote: Hi, chown(1) has a -h option by which it affects symlinks directly rather than the pointed-to file. The bonus side effect is that the pointed-to files don't get changed in any way, which is kinda welcome if you attempt to "fix" permissions/ownership in a directory where an evil user could create a symlink to e.g. /etc/shadow. Attempting chmod -R g+w /home/groups/evilgroup is still a risk, and would necessity a more long-winded command involving find(1). It would therefore be welcome that chmod receive an -h option that just skips over them (besides perhaps attempting to change their permissions as well). Pushed at https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v9.4-162-g07a69fc3b Marking as done. cheers, Pádraig.
bug#11108: [PATCH] chmod: fix symlink race condition
On 28/03/2012 21:28, Paul Eggert wrote: On 03/28/2012 01:13 PM, Jim Meyering wrote: $ ./chmod u+w f ./chmod: changing permissions of 'f': Operation not supported Yeouch. I undid the change for now. Hmm, why did "make check" work for me? I'll have to investigate later, alas. Patch for this pushed at: https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=v9.4-163-g425b8a2f5 Marking this as done. cheers, Pádraig.
Re: RFE: enable buffering on null-terminated data
On Tue, 19 Mar 2024, Zachary Santer wrote: On Tue, Mar 19, 2024 at 1:24 AM Kaz Kylheku wrote: But what tee does is set up _IONBF on its output streams, including stdout. So it doesn't buffer at all. Awesome. Nevermind. Yay! :D And since tee uses fwrite to copy whatever input is available, that will mean 'records' are output on the same boundaries as the input (whether that be newlines, nuls, or just block boundaries). So putting tee in the middle of a pipeline shouldn't itself interfere with whatever else you're up to. (AND it's still relatively efficient, compared to some tools like cut that putchar a byte at a time.) My note about pipelines like this though: $ ./build.sh | sed s/what/ever/ | tee build.log is that with the default stdio buffering, while all the commands in build.sh will be implicitly self-flushing, the sed in the middle will end up batching its output into blocks, so tee will also repeat them in blocks. However, if stdbuf's magic env vars are exported in your shell (either by doing a trick like 'export $(env -i stdbuf -oL env)', or else more simply by first starting a new shell with 'stdbuf -oL bash'), then every command in your pipelines will start with the new default line-buffered stdout. That way your line-items from build.sh should get passed all the way through the pipeline as they are produced. (But, proof's in the pudding, so whatever works for you :D ) Happy putting all the way! Carl