Fergus Henderson wrote:
On Tue, Apr 28, 2009 at 4:28 PM, Robert W. Anderson
<anderson...@poptop.llnl.gov> wrote:
I have an environment where we have many nodes potentially available
for compilation, and all of them see the same file spaces via NFS.
We are seeing decent performance out of distcc 3.1 using pump mode,
but from reading the docs there may be big performance gains left to
wring out in this special(?) case.
If I understand correctly, distcc's pump mode finds a set of header
files necessary to send along with the source file to enable
compilation on a remote node. In a homogeneous environment, it
seems both steps here are unnecessary if the master and slave nodes
are more or less indistinguishable in terms of compiler, sources,
and headers.
I think we could really achieve some screaming compile times (over
thousands of source files) if these steps could be bypassed with the
user's explicit acknowledgement that he is making assumptions about
the homogeneity of his build server machines.
How extensive would the modifications be to support such an
optimization? It was not clear to me after a few minutes of poking
around in the source, so I thought I'd seek an expert opinion first.
Typically NFS is a lot slower than local file access.
So it's not clear that this approach would actually improve overall
performance.
Distcc can work faster than NFS, because it sends all of the source
files at once, requiring only one round-trip between the client and the
distcc server for each compilation. With NFS, you need a round-trip
between the distcc server and the NFS server for each header file that
is included (directly or indirectly) from the source file being compiled.
Of course with distcc, if your source files are on NFS, the client needs
to do the same round-trips to the NFS server to fetch the files, but
this is not as bad as having the distcc servers do that, because the
distcc client need only fetch each file once for the whole build, not
once for each compilation in which it is referenced, and after that the
file will probably be cached. In addition, the client machine is more
likely to have source files cached from previous builds, since on the
client machine you're probably compiling the same sources that you
compiled last time, whereas on the distcc server machines they are
serving lots of different users who may be compiling very different
programs.
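
The round-trip argument above can be made concrete with a rough
back-of-the-envelope model. This is only a sketch: the function names,
the file counts, and the latency figure are all illustrative
assumptions, not measurements, and the model ignores bandwidth,
preprocessing time, and any caching on the distcc servers.

```python
# Rough round-trip model. Every number here is an illustrative assumption.

def nfs_server_side_round_trips(num_sources: int, headers_per_source: int) -> int:
    """Worst case when the distcc servers read from NFS directly: each
    compilation fetches its source file plus every included header."""
    return num_sources * (1 + headers_per_source)


def distcc_round_trips(num_sources: int, unique_files: int) -> int:
    """The client fetches each unique file from NFS once for the whole
    build, then makes one round-trip to a distcc server per compilation."""
    return unique_files + num_sources


if __name__ == "__main__":
    num_sources = 1000          # assumed number of source files
    headers_per_source = 200    # assumed headers pulled in per source file
    unique_headers = 500        # assumed distinct headers across the build
    latency_us = 500            # assumed round-trip latency in microseconds

    nfs = nfs_server_side_round_trips(num_sources, headers_per_source)
    dcc = distcc_round_trips(num_sources, num_sources + unique_headers)

    print(f"NFS on servers: {nfs} round-trips, {nfs * latency_us / 1e6:.2f} s of latency")
    print(f"distcc:         {dcc} round-trips, {dcc * latency_us / 1e6:.2f} s of latency")
```

Under these assumed numbers the server-side NFS case needs roughly two
orders of magnitude more round-trips; warm caches on the servers would
narrow that gap.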
Another issue with this approach is security. Currently distcc servers
normally run as user
"distcc", which may not have access to the user's NFS files, so this
approach would not work if the source files are not world-readable. Of
course it would be possible to address this issue by having the distcc
server authenticate the user, and then access the user's files on NFS as
that user, but that would require additional authentication, which would
have a performance impact. For example, one way to do it would be to use
distcc's ssh mode, but that mode has a major performance impact. (The
recently posted patches for GSSAPI support have less performance impact,
but there is still a significant impact.)
For the approach that you are considering, you may not need to use
distcc at all; a simple script using ssh may be sufficient, though the
overheads of ssh may be prohibitive (ssh connection sharing may help
with that, although that has security concerns of its own).
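
As a minimal sketch of what such a script might look like, the Python
below builds one ssh command per source file and assigns hosts
round-robin. It assumes every host sees the same working directory over
NFS and has the same compiler installed. The host names, the control
socket path, and the function name are hypothetical; the ControlMaster,
ControlPath, and ControlPersist settings are the standard OpenSSH
connection-sharing options. The sketch only constructs the commands and
does not run them.

```python
import itertools
import shlex

# Hypothetical build hosts; replace with your own nodes.
HOSTS = ["host15", "host16"]

# OpenSSH connection sharing, so repeated compiles reuse one connection.
SSH_OPTS = [
    "-o", "ControlMaster=auto",
    "-o", "ControlPath=~/.ssh/cm-%r@%h:%p",
    "-o", "ControlPersist=60",
]


def remote_compile_commands(sources, workdir, hosts=HOSTS,
                            compiler="gcc", flags=("-c",)):
    """Return one ssh argv list per source file, assigning hosts round-robin.

    Assumes workdir is the same NFS path on every host.
    """
    commands = []
    for src, host in zip(sources, itertools.cycle(hosts)):
        remote = "cd {} && {}".format(
            shlex.quote(workdir),
            shlex.join([compiler, *flags, src]),
        )
        commands.append(["ssh", *SSH_OPTS, host, remote])
    return commands


if __name__ == "__main__":
    for argv in remote_compile_commands(["a.c", "b.c", "c.c"], "/nfs/project"):
        print(shlex.join(argv))
```

Each returned list could be passed to `subprocess.run`, with several
running in parallel from a thread pool. Whether this beats distcc
depends on the ssh overhead discussed above.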
If you do want to modify distcc, I'd guess that the modifications needed
would be moderate in scope.
Fergus,
Thanks for the clear and detailed reply. First, I should note that I am
already using ssh mode (via rsh) because I was unable to make TCP mode
work. I don't know whether this is some kind of port-blocking
restriction on my machine or something else:
distcc[1663] (dcc_pump_sendfile) ERROR: sendfile failed: Connection reset by peer
distcc[1663] (dcc_writex) ERROR: failed to write: Broken pipe
distcc[1663] Warning: failed to distribute source.c to host16,cpp,lzo, running locally instead
Perhaps getting TCP mode running should be my first performance priority.
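
One quick way to test the port-blocking guess is a plain TCP connection
attempt against the distccd port. The sketch below assumes distccd is
listening on its default port, 3632; the function name is hypothetical
and the host name is taken from the log above. A successful connection
only shows that the port is reachable. It does not rule out the server
dropping the connection afterwards, which the "Connection reset by
peer" message may point to.

```python
import socket

DISTCCD_DEFAULT_PORT = 3632  # distccd's default TCP port


def can_connect(host: str, port: int = DISTCCD_DEFAULT_PORT,
                timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    host = "host16"  # host name from the log above
    state = "reachable" if can_connect(host) else "not reachable"
    print(f"{host}:{DISTCCD_DEFAULT_PORT} is {state}")
```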
I just tried what you suggested in your last paragraph, manually
distributing compiles via rsh, and am finding that it is, as you
suspected, a little slower than distcc using pump mode. Rather than
pursue that any further, based on your comments, I would like to see if
I can get TCP pump mode working first.
Thanks,
--
Robert W. Anderson
Center for Applied Scientific Computing
Email: anderson...@llnl.gov
Tel: 925-424-2858 Fax: 925-423-8704
__
distcc mailing list http://distcc.samba.org/
To unsubscribe or change options:
https://lists.samba.org/mailman/listinfo/distcc