On Tue, Nov 17, 2020 at 01:53:43PM +0000, Jonathan Buzzard wrote:
> On 17/11/2020 11:51, Andi Christiansen wrote:
> > Hi all,
> >
> > Thanks for all the information; there were some interesting things
> > among it.
> >
> > I kept going with rsync and ended up making a file with all the
> > top-level user directories and splitting them into chunks of 347 per
> > rsync session (roughly 42,000 folders in total). Yesterday we had
> > only 14 sessions with 3,000 folders in each, and that was too much
> > work for one rsync session.
>
> Unless you use something similar to my DB suggestion, it is almost
> inevitable that some of those rsync sessions are going to have issues,
> and you will have no way to track it, or even know it has happened,
> unless you do a single final giant catch-up/check rsync.
>
> I should add that a copy of the SQLite DB is cover-your-backside
> protection for when a user pops up six months down the line claiming
> that you failed to transfer one of their vitally important files, and
> the old system has been turned off and scrapped.
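For what it's worth, the SQLite tracking idea could be sketched roughly like this; a minimal, hypothetical example, not Jonathan's actual schema, with the table layout and helper names made up for illustration:

```python
import sqlite3

# Sketch: record each top-level directory and the exit code of the
# rsync session that copied it, so failed or never-attempted chunks
# can be found and retried, and the DB can be kept as evidence later.
db = sqlite3.connect("transfer.db")
db.execute("""CREATE TABLE IF NOT EXISTS transfers (
                  directory TEXT PRIMARY KEY,
                  rc        INTEGER,   -- rsync exit code; NULL = not yet run
                  finished  TEXT       -- completion timestamp
              )""")

def record(directory, rc):
    """Store the rsync exit code for one directory."""
    db.execute("INSERT OR REPLACE INTO transfers VALUES (?, ?, datetime('now'))",
               (directory, rc))
    db.commit()

def pending():
    """Directories never attempted, or whose rsync returned non-zero."""
    return [row[0] for row in db.execute(
        "SELECT directory FROM transfers WHERE rc IS NULL OR rc != 0")]
```

The point is that the final "did everything copy?" question becomes a query against the DB rather than another full-tree rsync pass.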
That's not a bad idea, and I like it more than the method I set up, where we captured the output of find from both sides of the transfer and preserved it for posterity, though that obviously did require a hard-stop date on the source. Fortunately, we seem committed to GPFS, so it might be we never have to do another bulk transfer outside of the filesystem...

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss