From: Paul Smith
> Sent: 12 May 2020 15:36
> On Tue, 2020-05-12 at 07:55 +0000, David Laight wrote:
> > > One problem is ensuring that all the recursive makes actually
> > > use the same token queue.
> > > The Linux kernel build acts as though the sub-makes have their
> > > own queue - I certainly had to fix that as well.
>
> I don't understand this... I guess I'm not familiar enough with the
> kernel build system.

Don't worry. I'm rather guessing how gmake and the kernel makefile
interact based on changes I made to NetBSD's make and makefiles
almost 20 years ago.

I think there were some sub-makes that were started with make
instead of $(MAKE), so they ended up creating a new job pipe.
(The pipe fds were added to argv[] by $(MAKE).)

> > I think I've remembered the obvious thing that made it work better.
> >
> > When a job ends it is important to get a new token from the jobserver
> > rather than reusing the one to hand.
> > Otherwise you don't see the 'abort' marker for ages.
>
> If GNU make retrieved a token then it will always put that token back
> into the jobserver pipe when the job ends, and get another one when the
> next job is to start. To do otherwise would mean that some makes could
> hoard tokens.
>
> However, the jobserver is implemented such that make itself is not
> considered a job, even a sub-make. The way it works is that when you
> invoke a recursive make the parent make will obtain a jobserver token
> for that recursive invocation (like it does for every job), then that
> sub-make can "pass on" that token: in other words, the sub-make has a
> free token that it can always use without querying the jobserver.
>
> This way every invocation of recursive make can always make progress,
> at least serially.
>
> I can see that in the "fast fail" model this could be problematic, but
> it should only ever be an issue in situations where a sub-make was
> running serially for some reason: either the structure of the
> prerequisites means it's naturally serial, or else someone added
> .NOTPARALLEL to the makefile or something. As soon as make wants to
> run a second job in parallel it will go to the jobserver and discover
> the "failure" token.
>
> Changing this will require thought. We can't just skip the free token
> otherwise you can get into a state where all your tokens are used by
> recursive makes and no make can get a new token to run a job.
>
> I can see two possible solutions:
>
> First, when a sub-make starts it could put back one token into the
> jobserver, representing the token the parent make obtained for it, then
> proceed to always get a token before every job (no free token). This
> means that sometimes a sub-make won't be able to run any jobs at all:
> it can get locked out waiting for a token. Maybe that's not a problem.
>
> The other idea is to keep the free token but make it a last resort
> rather than a first resort. This has the nice properties that (a)
> we'll see failures fast and (b) we still have a free token, but the
> code is more complex: basically we'd need to perform a non-blocking
> read on the jobserver FD and if we didn't get anything back, we'd use
> our free token if it's still available: if not we'd do a blocking read
> on the jobserver FD to wait for a new token.

Doesn't it do blocking reads with SIGCHLD enabled?
(or hopefully ppoll() to avoid the race)

Another option is for the 'parent' make to return (or not acquire)
a job token for $(MAKE) commands.
Then the sub-make has to acquire a token for every command.
make has to know about $(MAKE) commands anyway because they are
special in all sorts of ways.
But that won't work well if the old and new versions ever interact.

Or, require the sub-make to acquire a token in order to exit.
Then it can free its token whenever a job terminates.

I can't remember what I did to NetBSD's make.

	David
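
For illustration only, the "free token as a last resort" idea quoted above can be sketched roughly as follows. This is not GNU make's or NetBSD make's code: it is a minimal Python model under stated assumptions. The class name TokenSource, the exception BuildAborted, the token byte values, and the choice to write the abort marker back into the pipe are all hypothetical, and the thread does not say how the 'abort'/'failure' marker is actually encoded. The sketch also assumes it is acceptable to put the read end of the pipe into non-blocking mode, which may not hold when the pipe is shared with other processes.

```python
import os
import select

TOKEN = b"+"   # ordinary job token (byte value is illustrative)
ABORT = b"!"   # hypothetical 'abort'/'failure' marker, not a real encoding


class BuildAborted(Exception):
    """Raised when the abort marker is read from the jobserver pipe."""


class TokenSource:
    """Hypothetical model of one make's view of the jobserver pipe."""

    def __init__(self, read_fd, write_fd):
        self.read_fd = read_fd
        self.write_fd = write_fd
        self.free_token_available = True  # the implicit token of a sub-make
        # Assumption: the sketch may make the read end non-blocking.
        os.set_blocking(read_fd, False)

    def _try_read(self):
        """Read one byte without blocking; return None if the pipe is empty."""
        try:
            data = os.read(self.read_fd, 1)
        except BlockingIOError:
            return None
        if data == b"":
            raise RuntimeError("jobserver pipe closed")
        if data == ABORT:
            # Assumption: put the marker back so other makes also see it.
            os.write(self.write_fd, ABORT)
            raise BuildAborted()
        return data

    def acquire(self):
        """Return a pipe token, or None to stand for the free token."""
        # 1. Try the pipe first, so an abort marker is noticed promptly.
        token = self._try_read()
        if token is not None:
            return token
        # 2. Pipe empty: fall back to the free token if it is unused.
        if self.free_token_available:
            self.free_token_available = False
            return None
        # 3. Otherwise wait until the pipe becomes readable and retry.
        while True:
            select.select([self.read_fd], [], [])
            token = self._try_read()
            if token is not None:
                return token

    def release(self, token):
        """Give back whatever acquire() returned when the job ends."""
        if token is None:
            self.free_token_available = True
        else:
            os.write(self.write_fd, token)
```

A caller would run acquire() before starting each job and release() with the returned value when the job ends. Because the pipe is always consulted first, a sub-make that is running serially still reads the pipe before every job, which is the "see failures fast" property described in the quoted text, while the free token still guarantees progress when the pipe is empty.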