On 5/14/2012 9:50 AM, Kelly O'Hair wrote:
On May 14, 2012, at 2:49 AM, Magnus Ihse Bursie wrote:

On 2012-04-13 16:46, Jonathan Gibbons wrote:
On 04/13/2012 02:07 AM, Magnus Ihse Bursie wrote:
As for the --with-num-cores, yes, it is a configure time option. The underlying 
assumption is that your hardware doesn't really change, and if your build 
system is too weak, it will be too weak at configure time and at all make 
times. With that said, it is always possible (and not very hard) to re-run 
configure if you need to tweak such a parameter.

As a developer, my machine may not change, but my expectations may. Sometimes I want a build to run in the background while I pursue other activities, whether Tetris, browsing, or working on the next bug fix. Other times, I need the build ASAP. So while your underlying assumption is good for batch build systems, it may not always be true on developer machines.
We have now implemented the possibility to override the default parallelism using the standard make -j option. In short, if you just type "make" you will be using the number of cores detected by (or explicitly configured in) configure, which makes the default build as fast as possible.
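
To make that concrete, the two knobs described above would be used roughly like this (a sketch of the intended workflow, using the --with-num-cores option mentioned earlier; exact flag spelling may differ):

    # Pick a default parallelism once, at configure time
    bash configure --with-num-cores=16

    # An ordinary build then uses that default
    make

    # Override per invocation when the machine needs to stay responsive
    make -j 4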
The only issue I have seen with this involves systems with lots of processors, but not enough disk, swap, or RAM capacity to back up that many C++ compilations at the same time. Historically, finding the best 'make -j N' setting has been difficult with all the different systems out there. I have seen builds go faster with -j 16 than with -j 32 on a 64-processor machine, simply because of the disk contention. To make matters more complicated, the systems are getting faster all around, so many of my past measurements don't make much sense anymore.

So I would say, use the default until it becomes too painful to use (the system starts to lock up), then back it down to a comfortable setting. In other words, you may need to play with it on some systems.

-kto

Yes, what Kelly said: you really don't want the build process to "auto-determine" the level of parallelism, or to rely on any other assumption-based method (including generalizations by humans), if you really care about system performance or usability.

A classic example was the T1 and T2 Sun hardware. If we crammed enough RAM into the system to build in a ramdisk, I could run up around -j 32 and see the system scale well. If I was trying to build on the local disks, -j 8 was more than enough to stop seeing any added benefit. -j 12 or so was possible with a SAN disk array. I ran into similar issues with the NUMA-style architecture of the X4600 machines, where even builds in RAM could be hurt by critical components being built on the P6 and P7 processors (those with the largest latency to the P0 CPU).

Having the -j option is a big win, since you really, really do need to tune it for the type of system you have. There are just no decent heuristics for making a good auto-guess, other than having the system run a whole bunch of system-level benchmarking beforehand, and that's a lot of work for no good return.

You should also consider the actual CPU design architecture if you really care about performance (which, frankly, only those running build farms should care about). Certain CPUs are very good or very poor at context switching, and cache sizes and designs can make some levels of parallelism better than others (e.g. most SPARC hardware is pretty good at context switches, while hyperthread-enabled Intel systems are not so good).

Decisions like this also matter when considering build hardware for purchase - in general, sufficient RAM to build in a tmpfs (or ramdisk) is more important than extra CPU cores. Similarly, building off an SSD while writing temp files to a small ramdisk will generally be better than having to write to a hard drive or SSD. I'd have to look up my old assumptions, but memory seems to say that 2-4 GB of usable RAM was enough to build the JDK in for any number of -j N threads; add requirements for a ramdisk, possible disk caching, and the like on top of that.
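
For what it's worth, a generic way to try the ramdisk approach looks like the following (Linux syntax; the mount point, the 4g size, and the idea of symlinking the build output directory are just illustrative assumptions, not anything the JDK makefiles require):

    # Mount a tmpfs; 4g is a guess based on the 2-4 GB figure above
    sudo mount -t tmpfs -o size=4g tmpfs /mnt/ramdisk

    # Redirect the build output onto the ramdisk via a symlink
    mkdir /mnt/ramdisk/jdk-build
    ln -s /mnt/ramdisk/jdk-build build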


iostat and vmstat (or equivalent) are your friends when evaluating what -j number to use.
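
For example, run these in a second terminal while the build is going (Linux sysstat flavor; option letters differ slightly on other platforms):

    # CPU load, run queue, and swap activity, sampled every 5 seconds
    vmstat 5

    # Extended per-device I/O statistics; watch utilization and wait times climb as -j grows
    iostat -x 5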

-Erik
