On Sun, Nov 27, 2011 at 4:27 AM, Mick <michaelkintz...@gmail.com> wrote:
> On Saturday 26 Nov 2011 15:22:15 Michael Mol wrote:
>> I just wanted to share an experience I had today with optimizing
>> parallel builds after discovering "-l" for Make...
>>
>> I've got a little more tweaking I still want to do, but this is
>> pretty awesome...
>>
>> http://funnybutnot.wordpress.com/2011/11/26/optimizing-parallel-builds/
>>
>> ZZ
>
> Thanks for sharing! How do you determine the optimum value for -l?
I'm making an educated guess. >.>

I figure that the optimal number of simultaneous CPU-consuming
processes is the number of CPU cores, plus enough extra to keep the
CPU occupied while other processes are blocked on I/O. That's the
same reasoning that drives the selection of a -j number, really.

If I read make's man page correctly, -l acts as a threshold: make
chooses not to spawn an additional child process if the system load
average is above that value. Since system load is a count of actively
running and ready-to-run processes, you want the threshold to be very
close to your number of logical cores[1].

Since it's going to be a spot decision for make whether or not to
spawn another child (if it hits its limit, it's not going to check
again until after one of its children returns), there will be many
races where the load average is high at the moment make looks, but
some other processes return shortly afterward.[2] That means adding a
process or two as a fudge factor.

That's a lot of guesswork, though, and it still comes down to
guess-and-check.

emerge -j8 @world # MAKEOPTS="-j16 -l10"

was the first combination I tried. It completed in 89 minutes.

emerge -j8 @world # MAKEOPTS="-j16 -l8"

was the second. It took significantly longer. I haven't tried higher
than -l10; I needed this box to be able to do things, which meant
installing more software. I've gone from 177 packages to 466.

[1] I don't have a hyperthreaded system available, but I suspect this
is also going to be true of logical cores; it's my understanding that
the overhead from overcommitting the CPU comes primarily from context
switching between processes, and hyperthreading adds CPU hardware
specifically to reduce the need to context-switch when splitting
physical CPU resources between threads/processes. So while you'd lose
a little speed on an individual thread, you'd gain it back in
aggregate over both threads.

[2] There would also be cases where the load average is low, such as
when a make recipe does a significant amount of I/O before it
consumes a great deal of CPU, but a simple 7200rpm SATA disk appears
to be fast enough that this case is less frequent.
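P.S. For anyone who wants to try the same thing, the whole experiment
boils down to a couple of lines of /etc/portage/make.conf. This is
just a sketch using my guessed numbers from above, not a
recommendation; tune -j/-l to your own core count. EMERGE_DEFAULT_OPTS
saves passing -j to emerge by hand, and if I read the emerge man page
right, its --load-average option gives the package-level scheduler
the same kind of ceiling that -l gives make.

    # /etc/portage/make.conf -- illustrative values, not gospel
    # Up to 16 make jobs, but stop spawning new ones once the load
    # average climbs past 10 (logical cores plus a small fudge factor).
    MAKEOPTS="-j16 -l10"
    # Up to 8 packages building in parallel, with a matching ceiling.
    EMERGE_DEFAULT_OPTS="--jobs=8 --load-average=10"

While a build runs, something like "watch -n2 cat /proc/loadavg" (or
plain uptime) makes it easy to see whether the load is hovering near
the -l threshold or the box is sitting idle.

--
:wq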