Re: "wait" loses signals

Robert Elz Mon, 24 Feb 2020 04:23:51 -0800

    Date:        Mon, 24 Feb 2020 11:50:55 +0100
    From:        Denys Vlasenko <dvlas...@redhat.com>
    Message-ID:  <47762f41-e393-30cd-50ed-43c6bdd29...@redhat.com>


  | This is racy. Even if you try to code it as tightly as possible:

Absolutely, I agree.   The question is more whether it really matters.

  | Standard does not say that. It says "when the shell is waiting for an
  | asynchronous command to complete", it does not say "when the shell is
  | waiting in a waitpid() syscall".

That's because the standard has no notion of "system calls", just functions,
but the shell is not actually waiting (it is doing something else) until
the system call causes it to pause if the desired (or any) child is not
ready for reaping.

  | Yes, you are right, you can argue that shell is minimally fulfilling
  | standard's requirement if it does something like my code example.

It doesn't even need to do that.   As I said, the standard's primary purpose
is to advise script writers what they can depend upon the shell providing.
And a wait utility that is race free wrt traps is not one of those things.  That's
because what scripts can rely upon is based upon what shells implement (or
implemented at the time - with some more recent additions for some more
modern functionality that has been widely adopted).

Even now, as was demonstrated, most shells have this "issue" - hence the
standard simply cannot tell users that they can rely on something else.
Any attempt to read it otherwise than that is simply wrong, and obviously
so (though sometimes it is possible to argue that the wording used does
not express the intent obviously enough - or occasionally - at all, but
when that happens, all you will ever get as the best possible result is
corrected wording that says what it intended to say in the first place).

The standard also serves to advise shell authors what they need to do to
provide a shell which should run all conformant shell applications, but it
would be grossly unfair (and improper) to require of new shells something
that old ones didn't do.  But that side of it is less relevant to this
discussion, except that it doesn't tell shell authors to make sure there
are no race conditions wrt traps in the wait utility (it would do that in
quite different language than this, but that would be the point, if it were
there).

  | I am arguing that it can be made better:

That part is arguable.

  | it can be coded so that signal has no time window to arrive before
  | waitpid() but have its trap delayed to after "wait" builtin ends
  | (which might be "never", mind you).

It can be so coded, but when done (correctly, and assuming a trapped signal
has arrived) it won't be never, the signal will interrupt the sys call that
actually pauses (which will most likely not be wait*() in this case, but that's
irrelevant) and the wait would correctly exit.  A few shells have done that.
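
As a rough illustration of what "so coded" could look like inside a shell
(a minimal sketch only, not taken from any particular shell; the names
trap_pending, install_handlers and wait_or_trap are invented for this
example): keep the trapped signal and SIGCHLD blocked, poll the child with
waitpid(WNOHANG), and pause only in sigsuspend(), which unblocks the
signals and sleeps as one atomic step.  A trapped signal then either sets
the flag before the check, or interrupts sigsuspend(); there is no window
in between.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler when the (hypothetical) trapped signal arrives. */
static volatile sig_atomic_t trap_pending = 0;

static void on_trapped(int sig) { (void)sig; trap_pending = 1; }

/* SIGCHLD needs a handler only so that it can interrupt sigsuspend(). */
static void on_sigchld(int sig) { (void)sig; }

/* Install handlers for the trapped signal and SIGCHLD; 0 on success. */
static int install_handlers(int trapsig)
{
    struct sigaction sa;

    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_trapped;
    if (sigaction(trapsig, &sa, NULL) == -1)
        return -1;
    sa.sa_handler = on_sigchld;
    return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Wait for child `pid` or for the trapped signal, whichever comes first.
 * Returns 0 if the child was reaped (*status is filled in), 1 if the
 * trapped signal arrived first, -1 on error.
 */
static int wait_or_trap(pid_t pid, int trapsig, int *status)
{
    sigset_t block, old, wake;
    int result;

    assert(status != NULL);
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigaddset(&block, trapsig);
    /* From here on, both signals are delivered only inside sigsuspend(). */
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;

    /* Mask used while sleeping: the caller's mask, minus our two signals. */
    wake = old;
    sigdelset(&wake, SIGCHLD);
    sigdelset(&wake, trapsig);

    for (;;) {
        pid_t r;

        if (trap_pending) {            /* trapped signal already seen */
            trap_pending = 0;
            result = 1;
            break;
        }
        r = waitpid(pid, status, WNOHANG);
        if (r == pid) {                /* child finished and was reaped */
            result = 0;
            break;
        }
        if (r == -1 && errno != EINTR) {
            result = -1;
            break;
        }
        /* Unblock and sleep atomically; returns after a handler has run. */
        sigsuspend(&wake);
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return result;
}
```

A caller would run install_handlers() once, and treat a return of 1 from
wait_or_trap() as "return from wait with status > 128 and run the trap".
Note that the call that actually pauses here is sigsuspend(), not wait*().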

The question is whether it is worth going to that extra effort - or in other
words, is it really better.

As best I can tell, it only really matters to shell scripts attempting to
use signals/traps as an IPC mechanism, and that I simply don't believe they
should be doing - programs that need that kind of functionality should be
written in a language that provides more suitable mechanisms (and usually
not only for simple one bit message passing that a signal offers).

There are lots of programming languages around, they each have their particular
niche - the reason their inventors created them in the first place.  Use an
appropriate one, rather than attempting to shoehorn some feature that is needed
into a language that was never intended for it - just because you happen to
be a big fan of that language.   Spread your wings, learn a new one - the hard
part about any programming isn't the programming language, it is getting the
desired concepts and structures straight - do that and any competent programmer
can make a working program in any suitable language (ie: not expecting anyone
to write an operating system in COBOL) fairly quickly.   They'll make it
better after they get used to the idioms of the language, but providing
the method needed to solve the problem is known first (that's usually the
hard part, for anything non trivial) the actual coding into a working, if
not necessarily ideal, form is simple.

kre
