Again, my DMUCS solution provides load balancing.  DMUCS is basically a wrapper around distcc.  You run a "host-server" and you run very simple load-average-reporting tasks on each host (along with the distccd's).  Then, your compile contacts the host server for a host, places the result in the DISTCC_HOSTS environment variable, and runs distcc.

I am in the process of getting my code onto sourceforge in cvs.  It may even happen today.

Please let me know if you want me to send you the code directly.

Vic

Let me try to understand this correctly:

distcc is a wrapper around gcc, and dmucs (or whatever it is called) is a wrapper around distcc.  When dmucs is run, it asks the locally running host server for a value for the DISTCC_HOSTS variable and then runs distcc.  Every compile host must run a distcc server plus a client to the host server.  These clients send information about the load state of each compile host to the host server, so it has a good idea of who is idle and can set the DISTCC_HOSTS variable correctly.
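
To make sure I have the idea right, here is a minimal sketch of the selection step as I understand it.  It is not DMUCS code: the function names, the host names and the load values are all made up for illustration, and the only real thing in it is the DISTCC_HOSTS environment variable.

```python
def pick_host(load_reports):
    """Return the name of the host with the lowest reported load average."""
    if not load_reports:
        raise ValueError("no host has reported a load average yet")
    # Break ties by host name so the choice is deterministic.
    return min(load_reports, key=lambda host: (load_reports[host], host))


def distcc_env(load_reports, base_env):
    """Copy base_env and point DISTCC_HOSTS at the least-loaded host."""
    env = dict(base_env)
    env["DISTCC_HOSTS"] = pick_host(load_reports)
    return env


# Hypothetical load averages, as the per-host reporting tasks might send them.
reports = {"build1": 0.75, "build2": 0.10, "build3": 2.40}
env = distcc_env(reports, {"PATH": "/usr/bin"})
print(env["DISTCC_HOSTS"])
```

A wrapper would then run the real compile with that environment, for example by passing it to subprocess.run(["distcc", "gcc", "-c", "foo.c"], env=env).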

Did I understand that correctly?  Have you measured improved performance over distcc using this setup?  Why do you need a server and client setup?  Why can't the wrapper measure how long distcc takes for a given host and submit that to the host server itself?  I think there may be a few problems, but I'm not sure.  I don't see how this would speed up a compile.  distcc already sends files to the host that currently has the fewest jobs.  Because a multitasking OS can compile two files on a host in parallel at nearly the same speed as compiling them serially, all of the CPUs can be utilized well as long as the make options are set high enough (for example "-j8").  The problem arises when there are fewer files left than CPUs to compile them; then some CPUs become idle.

Distcc slows down when a big file is given to a slow computer and a small file is given to a fast computer, and no more files can be built until the big file completes.  I don't see how you can solve this problem if your program is just a wrapper.  Distccd needs to measure the compile time as well as something to go with it (I suggest file size) to calculate the average speed of the server.
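
The measurement I have in mind could be as simple as the sketch below.  This is only an illustration of the idea, not anything distccd does today; the class name and the choice of bytes per second as the unit are my own assumptions.

```python
class SpeedEstimate:
    """Running average of compile speed for one server (hypothetical helper)."""

    def __init__(self):
        self.total_bytes = 0
        self.total_seconds = 0.0

    def record(self, source_bytes, seconds):
        """Add one finished compile: size of the source and time it took."""
        self.total_bytes += source_bytes
        self.total_seconds += seconds

    def bytes_per_second(self):
        """Average speed so far, or None if nothing has been measured."""
        if self.total_seconds <= 0:
            return None
        return self.total_bytes / self.total_seconds


est = SpeedEstimate()
est.record(1000, 2.0)   # a 1000-byte file took 2 seconds
est.record(3000, 2.0)   # a 3000-byte file took 2 seconds
print(est.bytes_per_second())
```

File size is a crude stand-in for compile cost (headers and templates are ignored), so the number is only a rough ranking of servers.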

The distcc client would look at all the files that can potentially be built in parallel and decide how to distribute them.  There are a lot of ways to do this.  The easiest is to send the files in order from largest to smallest.  This is not perfect, but this way the longest you have to wait is the time it takes the slowest computer to compile the smallest file.  Another way would be to try to fit the files so that the computers all finish as close to the same time as possible.  If it's faster to send two files to a fast computer and leave a slow one idle, it should be done.  Finally, if the compile has been done before, the distcc client can cache how long it takes to compile certain files.  Using this information instead of file size to determine how to send the files to the various computers would further increase efficiency.
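
Roughly, the largest-first idea combined with per-host speeds could look like the sketch below.  It is a simple greedy heuristic under my own assumptions (file size as the cost, a known speed per host, made-up file and host names), not an optimal scheduler and not something distcc implements.

```python
def schedule(files, speeds):
    """Greedy largest-first assignment.

    files:  dict of file name -> size (any unit)
    speeds: dict of host name -> size units compiled per second
    Returns (plan, finish): files per host and estimated finish time per host.
    """
    finish = {host: 0.0 for host in speeds}
    plan = {host: [] for host in speeds}
    # Largest files first; ties broken by name for a stable order.
    for name, size in sorted(files.items(), key=lambda kv: (-kv[1], kv[0])):
        # Pick the host that would finish this file earliest.
        best = min(speeds, key=lambda h: (finish[h] + size / speeds[h], h))
        plan[best].append(name)
        finish[best] += size / speeds[best]
    return plan, finish


files = {"big.c": 900, "mid.c": 300, "small.c": 100}
speeds = {"fast": 300.0, "slow": 100.0}
plan, finish = schedule(files, speeds)
print(plan)
```

With these numbers the fast host gets both big.c and small.c, because queueing the small file behind the big one there still finishes sooner than giving it to the slow host.  Cached compile times could replace the sizes without changing the function.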

Another thing to mention: files that open up other files to compile should have a higher priority and be compiled first.
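
One way to express that priority, assuming the dependency graph is already known, is to count how many other targets wait (directly or indirectly) on each one and build the ready targets with the highest count first.  The function name and the example targets below are hypothetical.

```python
def unlock_counts(deps):
    """deps maps each target to the set of targets it depends on.

    Returns target -> number of targets that directly or indirectly wait on it.
    """
    dependents = {target: set() for target in deps}
    for target, needs in deps.items():
        for need in needs:
            dependents.setdefault(need, set()).add(target)

    def reach(target, seen):
        # Collect everything downstream of target.
        for waiting in dependents.get(target, ()):
            if waiting not in seen:
                seen.add(waiting)
                reach(waiting, seen)
        return seen

    return {target: len(reach(target, set())) for target in dependents}


deps = {
    "parser.c": set(),               # generated source, nothing needed first
    "util.o": set(),
    "parser.o": {"parser.c"},
    "main.o": {"parser.c"},
    "app": {"parser.o", "main.o", "util.o"},
}
counts = unlock_counts(deps)
ready = [t for t, needs in deps.items() if not needs]
order = sorted(ready, key=lambda t: (-counts[t], t))
print(order)
```

Here parser.c is built before util.o because three other targets are waiting on it, while only one waits on util.o.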

I realize these suggestions greatly complicate distcc.  It would need to either interpret makefiles directly, or return immediately and store the options passed to it in a build tree.  I think it makes sense to have distcc be a wrapper around make instead of gcc.

-sean

On 1/4/06, Victor Norman <[EMAIL PROTECTED]> wrote:
Again, my DMUCS solution provides load balancing.  DMUCS is basically a wrapper around distcc.  You run a "host-server" and you run very simple load-average-reporting tasks on each host (along with the distccd's).  Then, your compile contacts the host server for a host, places the result in the DISTCC_HOSTS environment variable, and runs distcc.

I am in the process of getting my code onto sourceforge in cvs.  It may even happen today.

Please let me know if you want me to send you the code directly.

Vic



Sean D'Epagnier <[EMAIL PROTECTED]> wrote:
I'm also very interested in this subject.  I think there are a lot of advanced ways to do it, but a few simple ones would really help performance.  If anyone wants to contact me about working on this, or would like to share ideas, feel free to.


On 1/3/06, Patrik Olesen <[EMAIL PROTECTED]> wrote:
Hello,

I have seen that there was an old thread about developing some sort of
load balancing for the deployment of the compiler jobs. What is the
progress of this work? Does anybody have any news?

Best regards,
  Patrik
__
distcc mailing list             http://distcc.samba.org/
To unsubscribe or change options:
https://lists.samba.org/mailman/listinfo/distcc
