On 04/03/2013 10:00 PM, Geert Bosch wrote:

> This will be true regardless of communication method. There is so little
> opportunity for parallelism that anything more than 4-8 local cores is
> pretty much wasted. On a 4-core machine, more than 50% of the wall time
> is spent on things that will not use more than those 4 cores regardless.
> If the other 40-50% or so can be cut by a factor of 4 compared to 4-core
> execution, we are still talking about at most a 30% improvement in total
> wall time. Even a small serial overhead for communicating sources and
> binaries will further reduce this 30%.
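
That back-of-the-envelope figure is just Amdahl's law. A quick sketch to
check the numbers (the 50%/4x inputs come straight from the paragraph
above; the helper name is mine):

    # Amdahl's law: remaining wall time = s + (1 - s)/k, where s is the
    # serial fraction and k is the speedup applied to the parallel part.
    def wall_time_saving(serial_frac, parallel_speedup):
        remaining = serial_frac + (1.0 - serial_frac) / parallel_speedup
        return 1.0 - remaining

    print(wall_time_saving(0.5, 4))  # 0.375 -> 37.5% saved in the best case
    print(wall_time_saving(0.6, 4))  # 0.30  -> the ~30% figure, at 60% serial

So even a perfect 4x on the parallel part buys at most about a 1.6x
faster build.
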
For stage2 & stage3 optimized gcc & libstdc++, I'd tend to disagree (oh yeah, PCH generation is another annoying serialization point). Communication overhead kills on things like libgcc, libgfortran, etc., where the bits we're compiling at any given time are trivial.
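
To put the communication point in concrete terms, here is a toy model;
all the timings below are made up for illustration, not measurements:

    # Shipping a job to another box only pays if the remote compile plus
    # the transfer round trip beats compiling locally.
    def worth_distributing(local_s, remote_s, transfer_s):
        return remote_s + transfer_s < local_s

    # A monster generated file takes minutes, so even a slow link wins:
    print(worth_distributing(local_s=180.0, remote_s=60.0, transfer_s=2.0))  # True
    # A trivial libgcc unit compiles in a fraction of a second, so any
    # per-job transfer overhead at all turns distribution into a loss:
    print(worth_distributing(local_s=0.3, remote_s=0.1, transfer_s=2.0))     # False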

I haven't tested things in the last couple of years, but I certainly had to keep the machines roughly in the same performance class. It's a real killer if insn-attrtab.o gets sent to the slowest machine in the cluster.
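
The failure mode is simple makespan arithmetic (again with invented
numbers): the build can't finish before its longest single job does, so
one big object landing on a slow box gates everything:

    # insn-attrtab.o as the longest job, in seconds on the fastest box;
    # a machine that is 3x slower multiplies whatever lands on it.
    big_job_s, slow_factor = 120.0, 3.0
    print(big_job_s)                # 120s floor if it goes to a fast box
    print(big_job_s * slow_factor)  # 360s floor if it hits the slowest box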

And having an over-sized cluster allows the wife & kids to take over boxes without me ever really noticing (unless I've got multiple builds flying).


> We need to improve the Makefiles before it makes sense to use more
> parallelism.  Otherwise we'll just keep running into Amdahl's law.
Can't argue with that. We're still leaving significant parallelism on the floor due to lameness in our makefiles.

jeff
