On 07/14/2014 01:01 AM, Cowe, Malcolm J wrote:
> Since I am working with Lustre and with very large (petascale) file systems,
> I want to be able to exploit Lustre's find and changelogs features, but I
> could use FPart to do the actual work.

I understand. In our simpler setup, we only accessed and moved data through
NFS, so we couldn't avoid the non-clever part: an initial linear scan of the
filesystem (although we could have used the robinhood scanning algorithm :) ).

> One could also use a job scheduler to keep track of jobs that fail.

That's what we did. We used our grid engine scheduler and tens of physical
machines to run rsync jobs:

- only one machine scans the filesystem with fpart
- fpart writes file lists (each covering no more than 10000 files or 10GB in
our case) to a shared directory
- we submit rsync jobs that consume those file lists (through rsync's
--files-from option)
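
The three steps above can be sketched roughly as follows. This is only an
illustration in Python, not fpart's actual algorithm and not the shell wrapper
we used: the function names, the "part.N" list file naming, and the reading of
"10GB" as 10 GiB are all assumptions made for the example. Only the 10000-file
and 10GB limits and rsync's --files-from option come from the description
above.

```python
import os
from typing import Iterable, Iterator, List, Tuple

MAX_FILES = 10_000          # per-list file count limit quoted above
MAX_BYTES = 10 * 1024**3    # "10GB", assumed here to mean 10 GiB


def scan(root: str) -> Iterator[Tuple[str, int]]:
    """Walk root linearly, yielding (path relative to root, size in bytes)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.lstat(full).st_size
            except OSError:
                continue  # file vanished or became unreadable during the scan
            yield os.path.relpath(full, root), size


def partition(entries: Iterable[Tuple[str, int]],
              max_files: int = MAX_FILES,
              max_bytes: int = MAX_BYTES) -> Iterator[List[str]]:
    """Group entries into consecutive chunks that respect both limits.

    A single file larger than max_bytes ends up in a chunk of its own.
    """
    chunk: List[str] = []
    total = 0
    for path, size in entries:
        if chunk and (len(chunk) >= max_files or total + size > max_bytes):
            yield chunk
            chunk, total = [], 0
        chunk.append(path)
        total += size
    if chunk:
        yield chunk


def write_lists(chunks: Iterable[List[str]], list_dir: str) -> List[str]:
    """Write one list file per chunk (hypothetical 'part.N' naming)."""
    os.makedirs(list_dir, exist_ok=True)
    list_paths = []
    for i, chunk in enumerate(chunks):
        list_path = os.path.join(list_dir, f"part.{i}")
        with open(list_path, "w") as fh:
            fh.writelines(p + "\n" for p in chunk)
        list_paths.append(list_path)
    return list_paths


def rsync_command(list_path: str, src: str, dst: str) -> List[str]:
    """Build the rsync command line that consumes one list file."""
    return ["rsync", "-a", f"--files-from={list_path}", src, dst]
```

In a setup like ours, the scanning machine would call
write_lists(partition(scan(src)), shared_dir), and each command returned by
rsync_command() would then be handed to the job scheduler rather than run
locally. The paths in the list files are relative to the source directory,
which is how rsync interprets --files-from entries.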

Apart from the ugly shell wrapper involved, reusing the infrastructure already
in place worked quite satisfactorily.

Good luck !

Jean-Baptiste


_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support
