
A NOTE has been added to this issue.
======================================================================
https://austingroupbugs.net/view.php?id=1801
======================================================================
Reported By:                mohd_akram
Assigned To:
======================================================================
Project:                    Issue 8 drafts
Issue ID:                   1801
Category:                   Shell and Utilities
Type:                       Enhancement Request
Severity:                   Editorial
Priority:                   normal
Status:                     Resolved
Name:                       Mohamed Akram
Organization:
User Reference:
Section:                    xargs
Page Number:                3600-3601
Line Number:                123162, 123252
Final Accepted Text:        See https://austingroupbugs.net/view.php?id=1801#c6657.
Resolution:                 Accepted As Marked
Fixed in Version:
======================================================================
Date Submitted:             2024-01-25 21:39 UTC
Last Modified:              2024-02-26 01:11 UTC
======================================================================
Summary:                    xargs: add -P option
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          0001811 xargs: add -P option to FUTURE DIRECTIO...
======================================================================

----------------------------------------------------------------------
 (0006685) gabravier (reporter) - 2024-02-26 01:11
 https://austingroupbugs.net/view.php?id=1801#c6685
----------------------------------------------------------------------
> But no, by "at the same time" I didn't mean to require some new variant of exec() (and fork()) (or posix_spawn) which could begin execution of all of the invocations at the same instant - rather just that it could be read as requiring the implementation to collect sufficient args to (if there are enough of them available) be able to make the arg lists for maxprocs invocations, and once all of those args are collected then run a fork()/exec() loop to start all maxprocs (or less, if there are insufficient args) invocations of the utility, one after another, so they are all running in parallel, rather than consecutively which might happen otherwise.

I don't see how that would result in the invocations necessarily running in parallel - a process created via `fork` could plausibly go all the way through `exec`-ing a program and have that program execute in its entirety before its parent ever gets to do another `fork`. The only way I could imagine for a program to detect whether an `xargs` implementation does as you suggest it could do would be to check in some way if `xargs` has already read further input for its execution or not.

> Don't forget to make it clear whether once maxprocs invocations have started, and xargs is waiting for one of them to finish, is it allowed to be collecting more args from stdin so it is ready to start a new invocation of utility as soon as one of the running ones finishes? Or must it just wait, and only start collecting args again after one of the existing invocations completes. It is clear that without -P it isn't allowed to keep collecting args, when it invokes the utility, it is required to just wait for that to exit before doing anything else (otherwise the exit(255) behaviour cannot be implemented).

I am not so sure that "It is clear that without -P it isn't allowed to keep collecting args, when it invokes the utility, it is required to just wait for that to exit before doing anything else (otherwise the exit(255) behaviour cannot be implemented)", given that:

    (echo a; echo b; echo c; echo d >/dev/tty) | xargs -n1 -- sh -c 'sleep 1; echo "$0"'

systematically results in d being printed on my terminal before any of a, b or c with all xargs implementations I could test (GNU findutils, FreeBSD, OpenBSD, NetBSD, Illumos, Busybox, Toybox, DragonFlyBSD, MidnightBSD and GhostBSD). Is every existing implementation non-compliant in this regard?

> The spec also needs to be clearer what "at most maxprocs" means - read literally any implementation could implement -P by simply ignoring it, running just 1 invocation of utility at a time, since all that is required is not to run more than maxprocs, which is required to be a positive (integer I assume, though the proposed spec doesn't say that - is "xargs -P 3.7 ..." meaningful, or even "xargs -P seven" ?) and so 1 is certainly "at most maxprocs" for any positive integral value.

I don't see what would stop an implementation from simply ignoring the -P option given that I cannot think of a standards-conforming way for an application to distinguish between:
- xargs ignoring the -P option
- xargs being particularly slow that day (perhaps it got unlucky with scheduling during the entire runtime of each invocation of the utility? or perhaps it was written in Ruby :p)

Also, with regards to the argument being an integer, I suppose this isn't really being argued as anything more than a nitpick, but I can confirm almost every implementation I could find (GNU findutils, FreeBSD, OpenBSD, NetBSD, Illumos, Busybox, Toybox, DragonFlyBSD, MidnightBSD and GhostBSD) parses the argument with one of atoi, strtol, strtoul or strtonum, which only parse integers. The one exception I could find is toybox, which parses integers with a home-made atolx function that wraps strtoll and appears to accept such inputs as e.g. `-P4k` as equivalent to `-P4000` - that function is used, seemingly universally, to parse numeric arguments to every utility toybox implements.

> Oh, one other thing (or two) - if one were to run
>
> xargs -P 100 whatever
>
> and when attempting to start the 51st invocation the system returns EAGAIN to the fork() call, is that to be treated as an error, or that perhaps CHILD_MAX is 50, and so no more than that number of children can be created (so imposing a silent upper bound on maxprocs) or something different?

Handling of this, like the "what happens on exit status 255" issue, also diverges a lot between implementations:
- GNU findutils decides EAGAIN means it should wait for 1 invocation to return before retrying (unless no children are currently executing, in which case it treats that as an error)
- FreeBSD, DragonFlyBSD, MidnightBSD and GhostBSD consider any fork (well, to be precise, they use vfork, but that has the same behavior on errors) failure to be a fatal error handled similarly to a child returning 255, meaning they wait for other invocations to finish before terminating and print more diagnostics if other invocations also exit with 255
- OpenBSD, NetBSD, Busybox and Toybox consider any fork (though they also use vfork) failure to be a fatal error, meaning they exit immediately, leaving existing invocations orphaned
- illumos decides EAGAIN means it should wait 1 second before retrying (and yes, if fork() always returns EAGAIN that results in an infinite loop)

Issue History
Date Modified    Username       Field                    Change
======================================================================
2024-01-25 21:39 mohd_akram     New Issue
2024-01-25 21:39 mohd_akram     Name                      => Mohamed Akram
2024-01-25 21:39 mohd_akram     Section                   => xargs
2024-01-25 21:39 mohd_akram     Page Number               => 3600-3601
2024-01-25 21:39 mohd_akram     Line Number               => 123162, 123252
2024-02-15 16:47 geoffclare     Relationship added       related to 0001811
2024-02-15 16:52 Don Cragun     Note Added: 0006657
2024-02-15 16:53 Don Cragun     Status                   New => Resolved
2024-02-15 16:53 Don Cragun     Resolution               Open => Accepted As Marked
2024-02-15 16:55 Don Cragun     Note Edited: 0006657
2024-02-15 16:55 Don Cragun     Tag Attached: issue9
2024-02-15 16:59 Don Cragun     Note Edited: 0006657
2024-02-15 17:01 Don Cragun     Final Accepted Text       => See https://austingroupbugs.net/view.php?id=1801#c6657.
2024-02-16 11:31 kre            Note Added: 0006660
2024-02-21 00:20 gabravier      Note Added: 0006670
2024-02-21 16:49 gabravier      Note Added: 0006672
2024-02-25 06:26 kre            Note Added: 0006675
2024-02-25 06:38 kre            Note Added: 0006676
2024-02-26 01:11 gabravier      Note Added: 0006685
======================================================================