Date:        Fri, 29 Apr 2022 20:11:55 +0100
    From:        "Harald van Dijk via austin-group-l at The Open Group" 
<austin-group-l@opengroup.org>
    Message-ID:  <b2cf3740-fdd3-d0b9-a3a8-beaab525f...@gigawatt.nl>

  | >    | It also appears that dash still implements remove-before-prompting.
  |
  | busybox ash and my shell do as well, but both are derived from dash and 
  | have merely retained dash's behaviour.

All ash-derived shells work that way.

  | > Does anyone not?
  |
  | bash does not. bosh does not. ksh does not. mksh does not. posh does 
  | not. yash does not. zsh does not.

I did a test (not the same one you did) after I sent the mail, and saw
that bosh and yash don't.   For the other shells, it is not nearly as
clear-cut what is happening.

  | You can test this by doing
  |
  |    true &
  |    <press Enter; most shells will show the background command exited>
  |    wait $!; echo $?
  |
  | This should print 0. Then do the same, except with the first command 
  | changed to false &. That should print 1.

Yes, in the shells you mention it does, indicating that something different
is happening.   It is interesting that in bash you can do that wait over and
over again, and it keeps returning the 0 status, until one does a plain "wait"
command.   Even the "jobs" command doesn't remove it, though the standard
requires that it do so.   bash is the only shell that acts like that; whether
it is intentional or not I have no idea.
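
For anyone who wants to repeat that check without typing at a prompt, here
is a minimal sketch as a script.   The variable names are mine, and since no
prompt is ever issued this says nothing about remove-before-prompting; it
only checks that wait returns the saved status, and shows what a second wait
on the same pid gives in whichever shell runs it.

```shell
# Sketch only: non-interactive variant of the quoted test.
# All variable names here are illustrative, not from any shell's test suite.

true &
pid_true=$!
status_true=0
wait "$pid_true" || status_true=$?

false &
pid_false=$!
status_false=0
wait "$pid_false" || status_false=$?

echo "true: $status_true, false: $status_false"

# Wait a second time for a pid that has already been waited for.
# The result differs between shells, so no particular value is assumed.
second_status=0
wait "$pid_true" 2>/dev/null || second_status=$?
echo "second wait on the same pid: $second_status"
```

The first line of output should be "true: 0, false: 1"; the second line is
the part worth comparing from one shell to the next.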

But try a different test

        true & X=$!

(the assignment to X is just in case there is a shell which implements that
"no need to retain" stuff when $! is not referenced).

Then repeat that line over and over. (Consecutive lines).

In ash-derived shells (and pdksh) the first will report job 1 starting
(assuming you had none already running), the 2nd line will report job 2
starting, and before prompting for the 3rd, report job 1 has finished.
The third will be job 1 again, and report job 2 has finished, and that
continues over and over again.

This is all consistent with how we know that they work.

In bosh and yash, the job number just keeps on climbing, even though they
report the previous job finished as each subsequent one is started.  That's
also consistent with how they operate.   A simple "wait N" for one of the
jobs removes that one from the list, then more true& commands add more jobs.
A simple "wait" clears up everything.   In yash "jobs" reports them all 
finished and clears everything, as it should.  In bosh "jobs" reports them all
finished, but clears nothing (the jobs command can be repeated over and over
and keeps reporting all the completed jobs).   That's clearly broken.

zsh does something different: once a job has been reported as finished
at a prompt, it is removed from the jobs table, and you can no longer do
"wait %3" for it, but the pid and status seem to be remembered somewhere
else, and wait <pid> gets the status from the job.   That seems odd to me;
it should be possible to use either form to wait on a job.   (I should note
that there is something odd about my zsh install - I tend to need to type
two newlines after a command to get it executed, and both are seen by the
shell.   Most of the time that's just mildly annoying: when I forget the 2nd,
nothing happens, and I have to wake up and remember that zsh is waiting for
the 2nd before it will do anything with the command.   But in testing like
this, where the newlines generate prompts, and what accompanies the prompt is
an action we care about, it kind of ruins the test.)

ksh93 is similar (without the double newline issue).

mksh is almost similar, but in it I saw
        internal error: j_async: bad nzombie (161)
twice (once, then more testing, then again), which does not look good.
I don't know what the 161 represents, it was not the same each time, but
is not a pid of any of the jobs started.  A count?

In that one, with this sequence, there are only ever 2 jobs (as in job
numbers) assigned, as each is started, the previous one is reported finished,
and removed from the jobs table.  It is possible to wait %n for the job
number most recently started, but only that one (were the commands to run
for longer, then presumably it would be possible to wait on any not completed
and reported as completed).

bash is different again, it counts up the job numbers, like bosh and
yash, but as it reports each earlier one finished, removes it from the
jobs table, so the "jobs" command only ever shows (and then removes) the
last one started.   It still allows wait N to return the status, as many
times as you want to do that command, but not wait %n for any but the
most recently created one.

  | I consider the dash behaviour a bug, but do not want to 
  | fix it in a way that introduces another bug.

While removing jobs that have been reported (i.e. removing them as
soon as possible) might reduce the risk of getting duplicate pids,
it doesn't actually solve the problem.   In particular, the removal
only happens in interactive shells (ones which prompt) so does nothing
at all for scripts, which have the same issue.   It can also happen in
an interactive shell, if you're unlucky.
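
To make the script case concrete, a minimal sketch (the loop and the variable
names are just for illustration): nothing here ever prompts, so nothing is
reported or removed along the way, and every saved pid has to stay waitable
until the script gets around to waiting for it.

```shell
# Sketch only: a script that starts several background jobs, saves each
# pid, and collects the exit statuses later. Names are illustrative.

pids=""
for i in 1 2 3; do
    ( exit "$i" ) &            # background job that exits with status $i
    pids="$pids $!"
done

statuses=""
for pid in $pids; do
    st=0
    wait "$pid" || st=$?       # the shell must still know this pid
    statuses="${statuses:+$statuses }$st"
done

echo "collected statuses: $statuses"
```

The more jobs such a script starts before it waits for any of them, the more
pid/status pairs the shell is holding, and the more chance that the system
hands out a pid that is still in that list.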

The bigger issue is what do you do about users who can be connected to
their shell for weeks, running lots of background commands, and never
issuing a wait or jobs command?   Do you just keep remembering exit
status/pid pairs forever?   That doesn't sound sustainable to me.

kre
