Henning Fehrmann <[EMAIL PROTECTED]> wrote:

> Copying a big file onto all nodes in a cluster is a rather
> common problem. I would have thought that there might be a
> standard tool for distributing the files in an efficient way.
> So far, I haven't found one.
This is what I use: http://saf.bio.caltech.edu/nettee.html

The production version is pretty much what you described. The development version is more flexible, allowing processing of each data chunk and data flow in either direction along the chain.

The biggest problem with chain methods is that it is difficult to recover if something breaks in the middle of the transfer. My cluster is only 20 nodes, so it has not been an issue, but on a 2000-node cluster it probably would be. It is of course also important that all of the nodes in the distribution chain have sufficient free network and CPU resources: any slow node is rate limiting, so the whole chain will run at the speed of its slowest link.

Regards,

David Mathog
[EMAIL PROTECTED]
Manager, Sequence Analysis Facility, Biology Division, Caltech

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
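For readers unfamiliar with the chain idea, here is a minimal sketch of it in Python. This is a toy localhost simulation with made-up port numbers and in-memory sinks, not nettee's actual implementation: each node accepts the incoming stream, keeps a local copy, and relays every chunk to the next node, so the file fans out along the chain instead of N times from the source.

```python
import io
import socket
import threading
import time

def relay_node(listen_port, forward_port, sink):
    """One link in the distribution chain: accept a stream on listen_port,
    keep a local copy in sink, and (unless this is the last node) forward
    every chunk to the next node on forward_port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", listen_port))
    srv.listen(1)
    conn, _ = srv.accept()
    fwd = socket.create_connection(("127.0.0.1", forward_port)) if forward_port else None
    while True:
        chunk = conn.recv(65536)
        if not chunk:            # upstream closed: EOF propagates down the chain
            break
        sink.write(chunk)        # local copy of the file
        if fwd:
            fwd.sendall(chunk)   # relay the same bytes to the next node
    conn.close()
    if fwd:
        fwd.close()
    srv.close()

def distribute(data, ports):
    """Push `data` through a chain of relay nodes, one per port (hypothetical
    port numbers stand in for hostnames). Returns each node's local copy."""
    sinks = [io.BytesIO() for _ in ports]
    threads = []
    # Start the nodes back to front so each forward target is already listening.
    for i in reversed(range(len(ports))):
        nxt = ports[i + 1] if i + 1 < len(ports) else None
        t = threading.Thread(target=relay_node, args=(ports[i], nxt, sinks[i]))
        t.start()
        threads.append(t)
        time.sleep(0.1)  # let the server bind before its upstream connects
    src = socket.create_connection(("127.0.0.1", ports[0]))
    src.sendall(data)
    src.close()
    for t in threads:
        t.join()
    return [s.getvalue() for s in sinks]
```

Note how the sketch also exhibits the failure modes mentioned above: every byte passes through every node, so one slow node throttles the whole chain, and a node dying mid-transfer breaks delivery to everything downstream of it.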
