Date: Tue, 9 Jun 2020 08:23:19 -0000 (UTC) From: mlel...@serpens.de (Michael van Elst) Message-ID: <rbngtn$48d$1...@serpens.de>
I have spent a little time looking at this now, and I think it is just all a mess. | pg_jobc is not a reference counter. Maybe not technically a "reference" counter, as what it is counting isn't strictly references, but anything that has x++ and if (--x == 0) do_stuff() is essentially a reference counter. What it is counting references to isn't clear (particularly here), but that is what it is doing, or trying to do (it has all the same issues as things which really are ref counters). | The assertion probably stopped | a bug in a different place by coincidence. I doubt that, this code is not at all good. There is no question but that the counter does not count properly. As best I can work out, and someone correct me if I'm wrong, the whole purpose of pg_jobc is so that orphanpg() can be called when a process group is orphaned (no longer has a session leader). If it has any other use, I cannot see it. What's there now simply doesn't work for this purpose. It was suggested that the FreeBSD code has been modified from what we have, and that simply adopting that might work. I went to look at their code, but before I did that, I saw that a month ago (that is, just around the time of the original discussion here) they copied maxv's KASSERTs into their code. A week ago they removed them again, as they were occasionally firing. That tells me, even without looking at their code, that they (still) have similar bugs to what we do, and thus that just importing their code won't help us. I see 3 ways forward... simply drop the KASSERT the way that FreeBSD have done, and let things return to the semi-broken but rarely bothersome code that was there before. That's not really a good idea, as the sanitizers apparently find problems with the code the way it was (not too surprising, deleting the KASSERT won't fix the bugs, it just stops noticing them explicitly). Or, we could properly define what pg_jobc is counting, and then make sure that it counts whatever that is properly - is incremented in all the appropriate places, and decremented properly as well. Currently the comment about it in proc.h is: /* * Number of processes qualifying * pgrp for job control */ which makes it clear that it is a reference counter (not necessarily counting the number of something which exists, so that something can be deleted, but it is counting references to processes). Unfortunately I have no idea what "qualifying pgrp for job control" is supposed to mean. That could be done, but it seems like a lot of work to me, and not easy. Another (more radical) approach would be to simply drop orphanpg() completely, and thus no longer need pg_jobc at all. The system wouldn't be bothered by this at all - all orphanpg() does is locate stopped members of the process group, and send then SIGCONT (to restart) and SIGHUP (to probably kill them - or at least inform them they their environment has changed). If the system wasn't doing this, users manually (or a script run from cron or something) could do it instead, If not done at all, badly behaving session leaders (processes which don't clean up their stopped jobs before exiting - including ones with bugs that causes them to abort) would over time cause a growing number of stopped jobs to simply clutter the system, consuming various resources, but otherwise harmless (there is nothing special about the processes, they can be killed, or continued - it is just that the process which would normally do that is no longer around). Third, and the option I'd suggest, is to revert to more basic principles, remove the pg_jobc attempt to detect when a session leader has exited, or changed to a different process group, and instead at candidate events (any time a process leaves a process group, for any reason) check if that process was the session leader, and if it is, clean up any now orphaned stopped processes. This is likely to be slower than the current attempt, but is much easier to make correct (and much less likely to cause problems, other than dangling orphaned stopped processes, if incorrect). As best I can tell, all the data needed exists already, all that will be needed is to modify the code. We can even leave pg_jobc in the pgrp struct, to avoid needing a kernel version bump (and for reasons I cannot imagine, pg_jobc is copied into kinfo and einfo structs for sysctl and /proc access to the process data, so leaving it around avoids needing to version those interfaces as well ... the value would be a meaningless 0, always, but I really find it hard to believe that anything would ever care, or even notice). Opinions on any of this before I start banging on the code? kre ps: I don't believe that any of the problems here are race conditions, the counter is simply not maintained correctly (which isn't to say there might not also be race conditions which might contribute).