Hmmm okay...

Same compilation, this time with DISTCC_HOSTS set to:
localhost/4 P4_1.5/2 dualppro/2 p233 pentium pentium pentium pentium
make -j50  CC=distcc
31.629u 4.264s 1:07.39 53.2%    0+0k 0+0io 391pf+0w


localhost/4 P4_1.5/2 p233
make -j50  CC=distcc
30.097u 4.212s 1:33.28 36.7%    0+0k 0+0io 560pf+0w
make -j7  CC=distcc
34.586u 3.992s 0:24.50 157.4%   0+0k 0+0io 1pf+0w
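
To put a number on the difference between the last two runs, the elapsed (third) column can be compared directly. This is only a rough sketch: `to_seconds` and `speedup` are hypothetical helper names, and it assumes the elapsed time is always in m:ss.ss form as in the output above.

```shell
# Convert a csh-style elapsed time (m:ss.ss) to seconds.
# Assumption: no hours field is present.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Ratio of two elapsed times: how many times faster the second run was.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

# -j50 run versus -j7 run on the same three-host list
speedup 1:33.28 0:24.50   # prints 3.81
```

So with the same three hosts, the -j7 build finished roughly 3.8 times faster than the -j50 build.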

So yeah, you're all correct: the -j number should reflect the number of
machines available in the distcc pool, and they shouldn't be too old, or
else the benefits are lost.
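
The -j7 used above happens to equal the sum of the per-host numbers in the three-host list (4 + 2 + 1), if a host written without /N is counted as 1. A small sketch of that sum follows. The function name `distcc_jobs` is hypothetical, and counting a bare host as 1 is an assumption made for this illustration, not distcc's documented default. It only handles plain `host` and `host/N` entries.

```shell
# Sum the /N suffixes in a distcc-style host list.
# Assumption: a host written without /N counts as 1.
distcc_jobs() {
    total=0
    for host in $1; do
        case "$host" in
            */*) n=${host##*/} ;;   # text after the last slash
            *)   n=1 ;;
        esac
        total=$((total + n))
    done
    echo "$total"
}

distcc_jobs 'localhost/4 P4_1.5/2 p233'   # prints 7
```

It could then be used as `make -j"$(distcc_jobs "$DISTCC_HOSTS")" CC=distcc`, treating the result as a starting point to tune from rather than a rule.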



-----Original Message-----
From: Nick Rout [mailto:[EMAIL PROTECTED] 
Sent: Friday, 25 November 2005 2:48 p.m.
To: [email protected]
Subject: Re: distcc



On Fri, 25 Nov 2005 14:32:06 +1300
Craig FALCONER wrote:

> I just tried compiling an app using various combinations of distcc 
> settings. My desktop is a P4 at 3 GHz
>  
> make -j8
> 43.882u 4.608s 0:41.31 117.3%   0+0k 0+0io 292pf+0w
> 
> Add a P4 1.5 and turn on distcc
> 35.190u 3.872s 0:30.98 126.0%   0+0k 0+0io 1pf+0w
> 
> Add a dual PPro 200
> 33.142u 3.960s 0:27.47 135.0%   0+0k 0+0io 0pf+0w
> 
> Add a single pentium 233
> 33.518u 3.880s 0:29.49 126.7%   0+0k 0+0io 0pf+0w
> 
> Add four more pentium machines
> 27.453u 3.940s 0:44.51 70.5%    0+0k 0+0io 0pf+0w
> 
> So it seems that there's a point where too many machines in a distcc 
> chain make it suboptimal.  To test that I set my DISTCC_HOSTS to 
> localhost and the five pentium class machines.
> 37.002u 4.016s 0:54.17 75.7%    0+0k 0+0io 0pf+0w
> 
>  
> In short - distcc requires your compile farm to be not-too-old.  Have 
> I made any wrong assumptions?  Maybe the -j8 should increase as the 
> machine pool grows?
>  

-j should be approx the number of processors plus 1 or 2 (some people will
claim different algorithms.)

The answer to "what is ideal?" is "it depends". 

A job where you are compiling very many small files is different to one with
a few large files - the number of network transmissions is reduced for a
start.

Also the "main" machine must be relatively fast as it has to do all the
preprocessing - in fact you may want to exclude it from the compile farm
list so it can concentrate on preprocessing and linking.
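
As a sketch of that last suggestion, the local machine can be filtered out of an existing host list before it is handed to distcc. The helper name `without_localhost` is hypothetical, and whether leaving localhost out actually helps will depend on how fast the main machine is compared with the rest of the farm.

```shell
# Drop any localhost entry (with or without a /N suffix) from a host list.
without_localhost() {
    out=""
    for host in $1; do
        case "$host" in
            localhost|localhost/*) ;;        # skip the local machine
            *) out="${out:+$out }$host" ;;   # keep everything else
        esac
    done
    echo "$out"
}

without_localhost 'localhost/4 P4_1.5/2 dualppro/2 p233'
# prints: P4_1.5/2 dualppro/2 p233
```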

If you have machines that can do more work than others you can give them a
weighting in the host list. I think the syntax is x.x.x.x/2; the number
after the slash limits how many jobs are sent to that host at once, so a
host with a higher number can take more work than the others.
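
For illustration only, a host list with per-host numbers might look like the one below. The host names are taken from Craig's list, but the numbers are made up for this example. The number after the slash is the most jobs that will be sent to that host at once.

```shell
# Illustrative limits only: the faster machine gets the larger number.
DISTCC_HOSTS='P4_1.5/4 dualppro/2 p233/1'
export DISTCC_HOSTS
```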

Another point is that make will only traverse one directory at a time.
Consider the position if there are 8 files of the same size in a directory
and 7 machines to compile on: the first seven files all get done at the same
time, then the 8th gets done (by only one machine) while the other 6 sit
idle until make moves to a new directory.
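
The arithmetic behind that example can be sketched as below. It assumes equal-sized files and one job per machine, and the function name `rounds_and_idle` is hypothetical.

```shell
# Rounds needed to compile `files` equal-sized files on `machines` machines,
# and how many machines sit idle during the last round.
rounds_and_idle() {
    files=$1
    machines=$2
    rounds=$(( (files + machines - 1) / machines ))   # ceiling division
    idle=$(( rounds * machines - files ))
    echo "$rounds $idle"
}

rounds_and_idle 8 7   # prints "2 6": two rounds, six machines idle in the second
```

With 7 files instead of 8, the same directory would finish in a single round with nothing idle, which is why the file count per directory matters as much as the machine count.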

Lastly, the included monitoring programs are useful for noting what is and
isn't happening.

It's worth a trip to the distcc site: http://distcc.samba.org/



-- 
Nick Rout <[EMAIL PROTECTED]>
