> On Apr 12, 2016, at 4:49 PM, Mark Hahn <h...@mcmaster.ca> wrote:
>
> Our problem seems to correlate with an unintentional creation of a tree
> of >500M files. Some of the crashes we've had since then appeared to be
> related to vm.zone_reclaim_mode=1. We also enabled quotas right after
> the 500M file thing, and were thinking that inconsistent quota records
> might cause this sort of crash.

Have you set vm.zone_reclaim_mode=0 yet? I had an issue with this on my
file system a while back when it was set to 1.

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
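
For anyone wanting to check the setting on their servers, here is a minimal sketch in Python. It assumes a Linux node that exposes the value at /proc/sys/vm/zone_reclaim_mode; the function names are made up for illustration, and the bit meanings are taken from the kernel's vm sysctl documentation rather than from anything Lustre-specific.

```python
from pathlib import Path

# Bit meanings of vm.zone_reclaim_mode, per the Linux kernel vm sysctl docs.
ZONE_RECLAIM_BITS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Return the current setting as an integer (hypothetical helper)."""
    return int(Path(path).read_text().strip())


def describe(mode):
    """List the behaviours enabled by a zone_reclaim_mode bitmask."""
    enabled = [text for bit, text in ZONE_RECLAIM_BITS.items() if mode & bit]
    return enabled or ["zone reclaim off"]


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    print(f"vm.zone_reclaim_mode={mode}: {', '.join(describe(mode))}")
```

If it reports anything other than 0, running `sysctl -w vm.zone_reclaim_mode=0` as root changes it at runtime, and a `vm.zone_reclaim_mode = 0` line in /etc/sysctl.conf should keep it that way across reboots.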