> AFAIK, the fs.file-max limit is a node-wide limit, whereas "ulimit -n"
> is per user.
The ulimit command is a front-end to the kernel's rlimits (cf. setrlimit(2)), which are per-process restrictions
(not per-user).
fs.file-max, by contrast, is the kernel's system-wide limit on how many file
handles can be open in aggregate across all processes. You'd have to edit
https://github.com/dun/munge/issues/94
The NEWS file claims this was fixed in 0.5.15. Since your log doesn't show the
additional strerror() output, you're definitely running an older version,
correct?
If you go on one of the affected nodes and do an `lsof -p `, I'm
betting you'll find a long
The native job_container/tmpfs plugin would certainly have access to the job
record, so modifying it (or a forked variant) would be possible. A SPANK plugin
should be able to fetch the full job record [1] and can then inspect the
"gres" list (as a C string), which means I could modify
Most of my ideas have revolved around creating file systems on-the-fly as part
of the job prolog and destroying them in the epilog. The issue with that
mechanism is that formatting a file system (e.g. mkfs.) can be
time-consuming. E.g. formatting your local scratch SSD as an LVM PV+VG and