On Nov 11, 2006, at 03:21, Mike Stump wrote:
The cost of my assembler is around 1.0% (ppc) to 1.4% (x86) overhead, as measured with -pipe -O2 on expr.c. If it were converted, what kind of speedup would you expect?
Given that CPU usage is already at 100% for most jobs, such as bootstrapping GCC, there is not much room for improvement through threading. Even in the best case, large parts of the compilation will still be serial. In the non-optimizing case, which is so important for the compile-debug-edit cycle, almost no parallelism will be possible.

With LTO, all the heavy lifting will be done at link time, with the initial compilation stripped down to the essentials. Writing the intermediate representation out to .o files directly, instead of going through assembly, may then make more of a difference. Without invoking the assembler, the number of minor page faults is reduced by about 10% on Linux. Costs associated with system calls and page faults tend not to scale well with higher numbers of processors and parallel tasks. So I think the current approach of using very coarse-grained parallelism is most efficient. Also, on systems such as Windows and (cough) OpenVMS, spawning processes and performing I/O tend to be more heavyweight.

The main place where threading may make sense, especially with LTO, is the linker. It is a longer-lived task and the last step of compilation, when no other parallel processes are active. Moreover, linking tends to be I/O intensive, so a number of threads will likely be blocked on I/O.

-Geert