Hi Tom,

On Wed, Jun 11, 2014 at 12:03 PM, Tom Harvill <[email protected]> wrote:
> I want to ask this general question: how does your shop deal with the
> general problem of small files in filesystems on (beowulf) compute
> clusters? Specifically, files that users expect to actively use for
> read and write operations for their research.
>
> Do you distinguish and segregate them (and/or the people that use them)
> on special hardware/filesystems?

Segregating small files on their own filesystem could be an idea. You
could also enforce inode quotas, so that the overall number of files
stays within a reasonable range.

Other than that, you could use Robinhood (http://robinhood.sf.net) to
track and monitor your filesystem usage. It was developed primarily for
Lustre (although you can use it on any kind of POSIX filesystem) and can
take advantage of Lustre changelogs to keep an always up-to-date view of
your filesystems. It's a Policy Engine, so you can define file classes,
plus actions or alerts that are triggered on specific conditions (for
instance, if a directory contains more than a certain number of files).
That can be very handy for determining whether users are following your
site's best practices, and for helping them adapt their workflow if
needed.

See:
http://www.hpcwire.com/off-the-wire/cea-releases-robinhood-2-5/
and
http://opensfs.org/wp-content/uploads/2013/04/lug13-robinhood.pdf

All those solutions require the same level of communication and user
education, though. :)

Cheers,
--
Kilian
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
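
To illustrate the kind of condition mentioned above (a directory holding
more than a certain number of files), here is a minimal Python sketch.
It does not use Robinhood or any Lustre API; it just walks a POSIX
directory tree. The function names, the 10000-entry threshold and the
64 KiB "small file" limit are assumptions made up for this example, not
values taken from Robinhood or recommended by anyone.

```python
import os
import stat


def find_crowded_dirs(root, max_entries=10000):
    """Map each directory under root to its count of non-directory
    entries, keeping only directories that exceed max_entries.

    max_entries is a hypothetical site threshold, chosen for illustration.
    """
    crowded = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > max_entries:
            crowded[dirpath] = len(filenames)
    return crowded


def count_small_files(root, size_limit=64 * 1024):
    """Count regular files under root that are smaller than size_limit bytes.

    size_limit is a hypothetical cutoff for what counts as a "small" file.
    """
    small = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size < size_limit:
                small += 1
    return small
```

Calling find_crowded_dirs("/some/project/dir") returns a dictionary of
offending directories and their file counts (the path here is only a
placeholder). Note that a full walk like this issues a stat call per
file, which is expensive on a large parallel filesystem, so treat it as
a spot check on a subtree rather than something to run across the whole
filesystem.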
