Will, there are some excellent responses here. I agree that moving data to local fast storage on a node is a great idea.
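One pattern that works well is to stage data in and out from the job script itself, so scratch use is tied to the job's lifetime. A minimal sketch, assuming a local scratch mount at /mnt/local (the path Will mentioned) and a dataset on NFS under /nfs/data -- the paths and the run_analysis step are placeholders:

    #!/bin/bash
    #SBATCH --job-name=stage-local
    # Per-job scratch directory on the node-local disk
    SCRATCH=/mnt/local/$USER/$SLURM_JOB_ID
    mkdir -p "$SCRATCH"
    # Clean up the scratch dir when the script exits (including on failure)
    trap 'rm -rf "$SCRATCH"' EXIT
    # Stage in from shared storage, compute locally, stage results back
    cp -r /nfs/data/mydataset "$SCRATCH/"
    cd "$SCRATCH"
    ./run_analysis mydataset
    cp -r results /nfs/data/$USER/

The trap means nothing lingers on the node afterwards, which sidesteps the cleanup question for job-scoped data at least.
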
Regarding the NFS storage, I would look at implementing BeeGFS if you can get some new hardware or free up existing hardware. BeeGFS is a skoosh case (*) to set up.

(*) Scottish slang. "Skoosh case" - very easy.
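On Will's TmpDisk question (quoted below): as far as I know slurm.conf takes only a single TmpFS path, so you cannot define several, but you can at least point it away from /tmp at the scratch mount. For the GRES question, one trick I have seen is a count-only GRES that tracks scratch space; Slurm then does the bookkeeping, though nothing enforces actual usage. A sketch, with the node names, sizes, and the "localtmp" name all made up:

    # slurm.conf -- TmpDisk is derived from TmpFS, which takes one path
    TmpFS=/mnt/local
    GresTypes=localtmp
    NodeName=node[01-10] Gres=localtmp:3800 ...

    # gres.conf on each node -- count scratch space in GB
    Name=localtmp Count=3800

Jobs would then request, say, "sbatch --gres=localtmp:500 job.sh", and Slurm will not oversubscribe the 3800 GB on a node. It is accounting only, though -- a job can still write more than it asked for.
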
On Sat, 23 Feb 2019 at 04:56, Raymond Wan <rwan.w...@gmail.com> wrote:
>
> Hi Will,
>
> On 23/2/2019 1:50 AM, Will Dennis wrote:
> > For one of my groups, on the GPU servers in their cluster, I have
> > provided a RAID-0 md array of multi-TB SSDs (for I/O speed) mounted
> > on a given path ("/mnt/local" for historical reasons) that they can
> > use for local scratch space. Their other servers in the cluster
> > have a single multi-TB spinning disk mounted at that same path. We
> > do not manage the data at all on this path; it's currently up to
> > the researchers to put needed data there, and remove the data when
> > it is no longer needed. (They wanted us to auto-manage the removal,
> > but we aren't in a position to know what data they still need or
> > not, and "delete data if atime/mtime is older than [...]" via cron
> > is a bit too simplistic.) They can use that local-disk path in any
> > way they want, with the caveat that it's not to be used as
> > "permanent storage", there are no backups, and if we suffer a disk
> > failure, etc., we just replace with new and the old data is gone.
>
> IMHO, auto-managing the data removal is a slippery slope. If the
> disk space is the research group's, perhaps just let them manage it.
> Whatever expiry date you put on files, someone will come along and
> ask you to change it.
>
> I suppose one thing you could ask them to do, if you do need to
> auto-manage it, is to write scripts that "touch" the files they've
> used (even if it is read-only). I guess it's up to you how involved
> you want to be.
>
> > The other group has (at this moment) no local disk at all on their
> > worker nodes. They actually work with even bigger data sets than
> > the first group, and they are the ones that really need a solution.
> > I figured that if I solve the one group's problem, I can also
> > implement it on the other (and perhaps even on future Slurm
> > clusters we spin up.)
>
> Sounds like the problem is really how willing this second group is
> to purchase additional local disk space. (Which, to be effective,
> should be the same space at the same path across all nodes. And
> that's assuming you have the space on each node... The servers that
> I use have one local disk for each node; there wouldn't be enough
> drive bays for every research group to add a drive -- we have more
> than 2 research groups.)
>
> > A few other questions I have:
> > - is it possible in Slurm to define more than one filesystem path
> >   (i.e., other than "/tmp") as "TmpDisk"?
> > - any way to allocate storage on a node via GRES or another method?
>
> It seems there have been more useful replies since. But about your
> first question, I think I can answer on behalf of the computers I
> use. I don't believe "/tmp" has been specifically set as the
> "TmpDisk" in the SLURM configuration. We have Unix-level read/write
> access to it. We can also "cd" over to our NFS-mounted home
> directories when we run our programs (at the top of the SLURM
> submitted script).
>
> In that sense, our system administrators gave us the freedom to
> choose. But on the downside, they never did any profiling or gave
> suggestions such as running programs on local disk.
>
> Anyway, they just allocated some space on the local disk as /tmp. I
> didn't mean that it was specifically configured as TmpDisk, as far
> as I know.
>
> Good luck!
>
> Ray
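
PS: on Ray's "touch" suggestion -- if you do end up auto-expiring scratch, the usual pairing is a periodic find-based sweep plus users refreshing files they still need. Just a sketch of the idea (the 30-day window and paths are made up), and it is still the simplistic mtime policy Will mentioned, so it only works if the group agrees to it:

    # cron job on each node: delete files untouched for 30+ days,
    # then prune directories left empty
    find /mnt/local -xdev -type f -mtime +30 -delete
    find /mnt/local -xdev -mindepth 1 -type d -empty -delete

    # a user keeps a dataset alive by refreshing its mtimes
    find /mnt/local/$USER/keep-this -exec touch {} +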