Have you considered the --wrap option of sbatch?

*--wrap=<command string>*

*Sbatch will wrap the specified command string in a simple "sh" shell script, and submit that script to the slurm controller. When --wrap is used, a script name and arguments may not be specified on the command line; instead the sbatch-generated wrapper script is used.*
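For example, a single-process job that would otherwise need an srun process held open on the head node can be submitted fire-and-forget (the command name and resource values below are placeholders, not taken from the original message):

```shell
# Instead of keeping an srun process alive on the head node:
#   srun -n 1 mycommand arg1 arg2
# submit the same command as a batch job; sbatch generates the
# wrapper script itself, so no shell wrapper file is needed:
sbatch --job-name=mycommand \
       --ntasks=1 \
       --output=mycommand-%j.out \
       --wrap="mycommand arg1 arg2"
```

sbatch returns as soon as the job is queued, so nothing keeps running on the submit host.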
On Wed, Jul 5, 2017 at 1:52 PM Craig Yoshioka <yoshi...@ohsu.edu> wrote:
> Also, maybe this has already been fixed?
>
> I am not seeing this happen on our Slurm 17.x test cluster, but it appears
> on our cluster running 15.x.
>
> > On Jul 5, 2017, at 10:37 AM, Craig Yoshioka <yoshi...@ohsu.edu> wrote:
> >
> > Hi,
> >
> > I posted this a while back but didn't get any responses. I prefer using
> > `srun` to invoke commands on our cluster because it is much more convenient
> > than writing sbatch wrappers for single-process jobs (no multiple steps).
> > The problem is that if I submit too many srun jobs, the head node starts
> > running out of socket resources (or other?), I start getting timeouts, and
> > some of the srun processes start using 100% CPU.
> >
> > I've tried redirecting all I/O to prevent the use of sockets, etc., but I
> > still see this problem. Can anyone suggest an alternative approach or
> > fix? Something that doesn't require me to write shell wrappers, but also
> > doesn't keep a process running on the head node?
> >
> > Thanks,
> > -Craig