Date:        Tue, 9 Jun 2020 08:23:19 -0000 (UTC)
    From:        mlel...@serpens.de (Michael van Elst)
    Message-ID:  <rbngtn$48d$1...@serpens.de>

I have spent a little time looking at this now, and I think
it is just all a mess.

  | pg_jobc is not a reference counter.

Maybe not technically a "reference" counter, as what it is counting isn't
strictly references, but anything that has x++ and if (--x == 0) do_stuff()
is essentially a reference counter.   What it is counting references to
isn't clear (particularly here), but that is what it is doing, or trying
to do (it has all the same issues as things which really are ref counters).

  | The assertion probably stopped
  | a bug in a different place by coincidence.

I doubt that, this code is not at all good.   There is no question but
that the counter does not count properly.

As best I can work out, and someone correct me if I'm wrong,
the whole purpose of pg_jobc is so that orphanpg() can be called
when a process group is orphaned (no longer has a session leader).

If it has any other use, I cannot see it.

What's there now simply doesn't work for this purpose.   It was
suggested that the FreeBSD code has been modified from what we
have, and that simply adopting that might work.   I went to look
at their code, but before I did that, I saw that a month ago
(that is, just around the time of the original discussion here)
they copied maxv's KASSERTs into their code.  A week ago they removed
them again, as they were occasionally firing.   That tells me,
even without looking at their code, that they (still) have similar
bugs to what we do, and thus that just importing their code won't
help us.

I see 3 ways forward...   simply drop the KASSERT the way that FreeBSD
have done, and let things return to the semi-broken but rarely bothersome
code that was there before.   That's not really a good idea, as the
sanitizers apparently find problems with the code the way it was (not
too surprising, deleting the KASSERT won't fix the bugs, it just stops
noticing them explicitly).

Or, we could properly define what pg_jobc is counting, and then make sure
that it counts whatever that is properly - is incremented in all the
appropriate places, and decremented properly as well.   Currently
the comment about it in proc.h is:
        /*
         * Number of processes qualifying
         * pgrp for job control 
         */
which makes it clear that it is a reference counter (not necessarily
counting the number of something which exists, so that something can be 
deleted, but it is counting references to processes).   Unfortunately
I have no idea what "qualifying pgrp for job control" is supposed to mean.

That could be done, but it seems like a lot of work to me, and not easy.

Another (more radical) approach would be to simply drop orphanpg()
completely, and thus no longer need pg_jobc at all.   The system
wouldn't be bothered by this at all - all orphanpg() does is locate
stopped members of the process group, and send then SIGCONT (to restart)
and SIGHUP (to probably kill them - or at least inform them they their
environment has changed).   If the system wasn't doing this, users manually
(or a script run from cron or something) could do it instead,  If not done
at all, badly behaving session leaders (processes which don't clean up
their stopped jobs before exiting - including ones with bugs that causes
them to abort) would over time cause a growing number of stopped jobs to
simply clutter the system, consuming various resources, but otherwise
harmless (there is nothing special about the processes, they can be killed,
or continued - it is just that the process which would normally do that
is no longer around).

Third, and the option I'd suggest, is to revert to more basic principles,
remove the pg_jobc attempt to detect when a session leader has exited,
or changed to a different process group, and instead at candidate events
(any time a process leaves a process group, for any reason) check if that
process was the session leader, and if it is, clean up any now orphaned
stopped processes.   This is likely to be slower than the current attempt,
but is much easier to make correct (and much less likely to cause problems,
other than dangling orphaned stopped processes, if incorrect).

As best I can tell, all the data needed exists already, all that will be
needed is to modify the code.   We can even leave pg_jobc in the pgrp
struct, to avoid needing a kernel version bump (and for reasons I cannot
imagine, pg_jobc is copied into kinfo and einfo structs for sysctl and /proc
access to the process data, so leaving it around avoids needing to version
those interfaces as well ... the value would be a meaningless 0, always, but
I really find it hard to believe that anything would ever care, or even 
notice).

Opinions on any of this before I start banging on the code?

kre

ps: I don't believe that any of the problems here are race conditions,
the counter is simply not maintained correctly (which isn't to say
there might not also be race conditions which might contribute).

Reply via email to