Re: [PATCH] make: implement jobserver and use it to avoid exponential behavior

Lauri Tirkkonen Thu, 14 Nov 2019 07:32:13 -0800

Hi,

thanks for your comments. More responses inline, but I'd like to mention
up front that further testing I've made agrees with you; this isn't
ready by any means. I observed one hang I've yet to debug (very bad),
and the speedups weren't that great in some cases either (more on that
later).

I don't have a new version of the diff fixing the problems (at least not
yet), but I'll try to give a bit more motivation/background for my
changes.

On Thu, Nov 14 2019 15:24:53 +0100, Marc Espie wrote:
> Sorry, I missed your initial message (thanks to naddy@ for telling me about 
> it)
> 
> A jobserver was considered in make before implementing expensive.
> 
> The main reason it was not done that way is that you can't guarantee a
> file descriptor will make it through the intermediate shell.  
> 
> In fact, POSIX doesn't guarantee anything, and a lot of shells actively close
> file descriptors they don't know about.

Yes, I'm aware of this. Other makes seem to do fine though, and they use
the same mechanism of passing file descriptors: leave them open, and
communicate their existence through environment variables.

However, it's a good point; failure to access the fd should fall back to
single-job semantics in the child at that point, and I need to consider
what happens in the top-level make in such a case as well. More testing
is obviously needed.

> Your example is not complicated enough, as your commands mean NO 
> shell will be spawned (everything will pass directly through to 
> submakes, which is BTW a large reason for implementing -C, which cuts 
> down on the number of intermediate processes).

True, the test needs more work. Thanks for pointing it out, I somehow
forgot about the optimization about avoiding a shell while I was writing
the previous mail.

> I'm not sure I like added complexity in that part of make.
> 
> > I haven't yet built the entire tree with this, just cautiously probing
> > the waters first to see if there is interest :)
> 
> I'd like some actual numbers on real world applications: what does it change
> for the actual build through the full source tree.   Areas like perl are where
> the job explosion occurs and where the expensive heuristics got refined.

As I mentioned in the preface above, a full build actually ended up
hanging somewhere in gnu/ :-)

However I do have some results from smaller real-world tests.
Unfortunately I haven't yet gotten around to bigger ones (and besides I
need to fix the outstanding issues first for the tests to make actual
sense).

The first test I made was on a beefy virtual machine under Hyper-V,
with 32 vCPUs on a Threadripper 2950X. With and without this patch (and
the followup for share/mk), I did make -j32 -C /usr/src/usr.bin. The
results were very disappointing: with my patches the build was *slower*
even though it used way more CPU. I think 'top' said it best:

32  CPUs:  0.1% user,  0.0% nice,  3.9% sys, 95.4% spin,  0.0% intr, 0.7% idle
                                             ^^^^^^^^^^

The Hyper-V guest uses an IDE disk, and my hypothesis is that there may
be some locking problems there.

I repeated the test on actual hardware, an Intel Atom C2550 board with 4
CPUs, and SATA disks. This showed an actual build time improvement
(though I admit I just did one trial so far):

before
        make -j4  817.49s user 340.95s system 268% cpu 7:11.44 total
after
        make -j4  831.34s user 352.72s system 373% cpu 5:16.88 total

which is not too shabby if you ask me.

> Comments on the patch proper:

> >  static int aborting = 0;       /* why is the make aborting? */
> >  #define ABORT_ERROR        1           /* Because of an error */
> > @@ -123,12 +128,15 @@ static int    aborting = 0;       /* why is the make 
> > aborting? */
> >  
> >  static int maxJobs;        /* The most children we can run at once */
> >  static int nJobs;          /* Number of jobs already allocated */
> > -static bool        no_new_jobs;    /* Mark recursive shit so we shouldn't 
> > start
> > -                            * something else at the same time
> > -                            */
> I don't think ditching expensive entirely is a good idea anyhow.
> What happens if the jobserver fails ?  I see it goes to err all the time.
> BTW, going to err is bad. Make handles processes. Most errors MUST do 
> something sane, like waiting for processes to end in many many cases.

You're absolutely right, error handling is currently complete garbage.
And jobserver failures should be handled gracefully, I agree. I'll see
if I can come up with something.

> > +
> > +struct jobserver {
> > +   int     master;         /* master deposits job tokens into this fd */
> > +   int     slave;          /* slaves consume job tokens from this fd */
> > +   volatile sig_atomic_t tokens;   /* currently available tokens */
> > +};
> you didn't really read the code, did you ? why volatile sig_atomic_t ?

I wrote this code, I'm not sure how I could have avoided reading it :-)

But I must concede, I've dropped and resumed this work a few times;
'tokens' was accessed from a signal handler in a previous iteration,
which is why it was sig_atomic_t. The version I posted does not do that,
so indeed, what you're seeing is my sloppy cleanup job. Thanks.

> > @@ -347,10 +360,9 @@ print_errors(void)
> >  static void
> >  setup_signal(int sig)
> >  {
> > -   if (signal(sig, SIG_IGN) != SIG_IGN) {
> > -           (void)signal(sig, notice_signal);
> > -           sigaddset(&sigset, sig);
> > -   }
> > +   sigaction(sig, &(struct sigaction) { .sa_handler = notice_signal },
> > +       NULL);
> > +   sigaddset(&sigset, sig);
> >  }
> Why did you change this ? this is gratuitous.

Actually, it's not gratuitous, but it's subtle. signal() enables
SA_RESTART, but we depend on EINTR returns in a couple places, so I did
this. I should've mentioned this originally, sorry.

> Besides, I'm not 100% sure you can do that, I think it will break m88k.

I'm not sure how I could have known or tested that, unfortunately.

> >  static void
> > @@ -551,20 +563,15 @@ postprocess_job(Job *job, bool okay)
> >             Finish();
> >  }
> >  
> > -/* expensive jobs handling: in order to avoid forking an exponential number
> > - * of jobs, make tries to figure out "recursive make" configurations.
> > - * It may err on the side of caution.
> > +/* expensive jobs handling: make tries to figure out "recursive make"
> > + * configurations. It may err on the side of caution.
> >   * Basically, a command is "expensive" if it's likely to fork an extra
> >   * level of make: either by looking at the command proper, or if it has
> >   * some specific qualities ('+cmd' are likely to be recursive, as are
> >   * .MAKE: commands).  It's possible to explicitly say some targets are
> >   * expensive or cheap with .EXPENSIVE or .CHEAP.
> ??? if you kill half that thing, kill it all, or not. Looks like half-baked
> code.
> 
> > - * While an expensive command is running, no_new_jobs
> > - * is set, so jobs that would fork new processes are accumulated in the
> > - * heldJobs list instead.
> > - *
> > - * This heuristics is also used on error exit: we display silent commands
> > + * This heuristics is used on error exit: we display silent commands

Yeah, you're right, it reads like crap, I'll rewrite it. The gist of the
thing is that 'expensive' heuristic is now only used to decide whether
or not to print the failed command.

> >  static void
> > -loop_handle_running_jobs()
> > +loop_handle_running_jobs(void)
> >  {
> >     while (runningJobs != NULL)
> >             handle_running_jobs();
> >  }
> Yet another gratuitous change.  There is already a perfectly fine prototype
> that asserts (void).

A function prototype with empty parentheses does not assert (void) in C
(but does in C++). However, you're correct about it not being related to
this change.

> I'll stop commenting at this point, because this code is nowhere near 
> fine for now.

Yeah, agreed.

> I'm not saying having a jobserver might not be a good idea, AS AN OPTION, 
> but this would need to look like openbsd code, compile on every 
> openbsd architecture, *and have proper error paths when it fails*.

Yes, absolutely. Like I said I was just cautiously probing the waters
with the initial diff, to see if there is interest at all, before
spending more time on the polish.

> You have zero guarantee this works on all shells.
> 
> The expensive heuristics might be the only reliable way we have of
> limiting the proliferation of jobs, so error recovery must be way more
> graceful.
> 
> It also won't play well with other makes (contrary to expensive).
>
> You didn't say whether there is a convention that (say) gmake or other
> bsd makes use for a job server... might be useful to communicate correctly
> with these (right now, what happens with MAKEOPTIONS and a child gmake,
> for instance).

Off the top of my head, bmake uses -J3,4 in MAKEFLAGS to signify which
fd's contain the pipes for jobserver communication. gmake does the same
(--jobserver-auth=3,4). I could make a more thorough writeup of the
mechanisms these two use as opposed to what I did, if you wish.

> Call me paranoid, but I'm not too fond of having an extra fd that passes
> through each and every subprocess.

This was also a bit concerning to me, but that's what bmake/gmake also
are doing (albeit with more fd's). Eventually the potential gains
outweighed my concerns: every SUBDIR in bsd.subdir.mk is currently built
serially, which has quite a large impact on how many jobs are running at
any one time.

An alternative could be to use unix sockets with paths instead and pass
the path through the environment, but I'm not convinced I like that any
better.

> The first step would definitely be to check how your patch works in
> real conditions (at least a full make release, and maybe a significant
> subset of ports).
> 
> Assuming this goes well, for starters, I'd like the change to be way 
> more clearly separated, to not included gratuitous changes to existing code, 
> to look like OpenBSD code, and to have decent error paths.

Thanks; I take it this means there is at least a little interest in this
proof of concept :)

I'll resume work on this to polish it more, fix the remaining bugs and
try to address your concerns. I did try to make it look like OpenBSD
code, but obviously failed on some counts; please bear with me with any
future mistakes I might make on that path.

And I suppose I'll need some help on the "compiles on all architectures"
part eventually; I just have amd64/arm64 stuff.

-- 
Lauri Tirkkonen | lotheac @ IRCnet

Re: [PATCH] make: implement jobserver and use it to avoid exponential behavior

Reply via email to