Did some testing and I see some strange results: rsync is considerably slower than cp or tee.

time rsync /Network/sata3/samsara.2011.720p-50fps.m4v /Network/sata4/2632
real    6m10.834s
user    0m8.299s
sys     0m12.368s

time cp /Network/sata3/samsara.2011.720p-50fps.m4v /Network/sata4/2632
real    3m43.190s
user    0m0.005s
sys     0m5.349s

time tee < /Network/sata3/samsara.2011.720p-50fps.m4v > /Network/sata4/2632 /Network/sata2/2632
real    3m46.949s
user    0m0.176s
sys     0m20.299s

And another rsync run, to show it's not a caching effect. In the Network pane of the OSX Activity Monitor I clearly see that with rsync the data sent is about the same as the data read, while with cp and tee the write rate is about double the read rate.

rm -f /Network/sata4/2632 /Network/sata2/2632; time rsync /Network/sata3/samsara.2011.720p-50fps.m4v /Network/sata4/2632
real    6m4.532s
user    0m8.138s
sys     0m11.395s

time rsync /Network/sata3/samsara.2011.720p-50fps.m4v /Network/sata2/2632
real    6m2.038s
user    0m8.142s
sys     0m11.478s

The source is on a 100Mb/s link; the other three systems are on 1Gb/s. The OSX system in the middle reads at 12MB/s and writes at 24MB/s with cp and tee, but with rsync it only writes at ~11MB/s. Maybe the OSX version of rsync is at fault, but it is very strange.
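To make the runs directly comparable, something like the minimal sketch below could be used: it removes the targets and flushes the OSX disk cache before each copy, so every run starts cold. The paths are the ones from the timings above; purge may require sudo depending on the OSX release.

#!/bin/sh
# Cold-cache benchmark sketch; paths as in the timings above.
# `purge` flushes the OSX disk cache (may need sudo on some releases).
SRC=/Network/sata3/samsara.2011.720p-50fps.m4v
D1=/Network/sata4/2632
D2=/Network/sata2/2632

rm -f "$D1" "$D2"; purge
time rsync "$SRC" "$D1"

rm -f "$D1" "$D2"; purge
time cp "$SRC" "$D1"

rm -f "$D1" "$D2"; purge
time tee "$D2" < "$SRC" > "$D1"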
Henk

On Sep 11, 2013, at 1:45 AM, Henk D. Schoneveld <[email protected]> wrote:

> That should work, but as I understand it there will be 4 threads running,
> and the bandwidth of the source server has to be shared between these
> threads. What I'm dreaming of is some kind of broadcast/multicast to 4 IP
> addresses to get maximum throughput. Maybe it's impossible, but it would
> be very efficient, wouldn't it?
>
> Henk
>
> On Sep 10, 2013, at 9:09 PM, James Burton <[email protected]> wrote:
>
>> Henk,
>>
>> rsync will mostly do what you want it to do, but rsync doesn't support
>> remote->remote copies.
>>
>> The way I do multi-node copies is with a recursive copy algorithm that
>> uses ssh to run rsync on the remote machines. On each pass, every node
>> that already has the source copies to a node that doesn't, which quickly
>> spreads the source to all the nodes in the list.
>>
>> Here is the pseudocode. Of course, rsync and ssh need to be set up
>> correctly on all the nodes and you have to be sure you are using the
>> right syntax for your application, but this is the basic idea:
>>
>> copyAll( nodes[] ):
>>
>>     # assume the source is at nodes[0]
>>     len = nodes.length()
>>     if len == 1: return
>>
>>     # copy the source to the node in the middle of the list
>>     ssh user@nodes[0] "rsync -a /path/to/files user@nodes[len/2]:/path/to/files"
>>
>>     # partition the list and recurse on separate threads;
>>     # the next pass copies nodes[0] -> nodes[len/4]
>>     thread(copyAll(nodes[0:len/2]))
>>     # and nodes[len/2] -> nodes[3*len/4]
>>     thread(copyAll(nodes[len/2:len]))
>>
>> Hope that helps.
>>
>> Jim
>>
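For reference, a runnable bash version of that sketch could look like the one below. It assumes passwordless ssh from each node to the others and rsync on every node; user@, the node names, and /path/to/files are placeholders.

#!/bin/bash
# Recursive fan-out after the pseudocode above: on each pass every node
# that already has the data sends it to one that doesn't, so N nodes are
# filled in roughly log2(N) passes.
copy_all() {
    local nodes=("$@")
    local len=${#nodes[@]}
    (( len <= 1 )) && return
    local mid=$(( len / 2 ))

    # nodes[0] has the data; copy it to the node in the middle of the list
    # (-n keeps the backgrounded ssh from reading our stdin)
    ssh -n "user@${nodes[0]}" \
        "rsync -a /path/to/files/ user@${nodes[mid]}:/path/to/files/"

    # each half now has the data at its first node; recurse in parallel
    copy_all "${nodes[@]:0:mid}" &
    copy_all "${nodes[@]:mid}" &
    wait
}

copy_all node01 node02 node03 node04 node05 node06 node07 node08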
>> On Tue, Sep 10, 2013 at 12:36 PM, Henk D. Schoneveld <[email protected]> wrote:
>>
>> On Sep 10, 2013, at 4:56 PM, James Burton <[email protected]> wrote:
>>
>>> Henk,
>>>
>>> I'm not sure what you are trying to do.
>>>
>>> Are you looking to copy data from one server to a series of servers?
>>
>> Yes
>>
>>> Is this a one-time copy for setup, or will this be part of an ongoing system?
>>
>> It will be part of an ongoing system.
>>
>>> Thanks,
>>>
>>> Jim
>>>
>>> On Mon, Sep 9, 2013 at 4:25 PM, Henk D. Schoneveld <[email protected]> wrote:
>>>
>>> Hi everybody,
>>>
>>> I'm thinking about installing 5 groups of 30 pvfs2-systems in a 100Mb/s
>>> WAN. The reason for this setup is that if one group fails, the remaining
>>> 4 groups can still serve the originally intended number of clients; the
>>> IO load would then be 5/4 of the original setup.
>>>
>>> All groups share one 5Gb/s connection to the internet.
>>>
>>> To minimize the data transferred from the server somewhere on the
>>> internet, I'm thinking about the following scenario: copy a file at
>>> 30x100Mb/s = 3Gb/s onto 1 group, then redistribute it to the remaining
>>> groups in parallel.
>>>
>>> Any ideas on how to do this most efficiently? I know
>>> tee < source > dest0 dest1 dest2 dest3 dest4 would do it, but tee isn't
>>> recursive and doesn't accept wildcards. rsync handles wildcards and
>>> recursion, but how do I run it in parallel in a way that keeps the load
>>> on the source group minimal?
>>>
>>> Suggestions very welcome
>>>
>>> Henk

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
