Awesome, sounds about right. I'll give it a try, thank you.
On Jul 5, 2017, at 10:58 AM, Evan Remington <erem...@mit.edu> wrote:

Have you considered the --wrap option of sbatch?

    --wrap=<command string>
        Sbatch will wrap the specified command string in a simple "sh"
        shell script, and submit that script to the slurm controller.
        When --wrap is used, a script name and arguments may not be
        specified on the command line; instead the sbatch-generated
        wrapper script is used.

On Wed, Jul 5, 2017 at 1:52 PM Craig Yoshioka <yoshi...@ohsu.edu> wrote:

Also, maybe this has been fixed already? I'm not seeing it happen on our Slurm 17.x test cluster, but it appears on our cluster running 15.x.

> On Jul 5, 2017, at 10:37 AM, Craig Yoshioka <yoshi...@ohsu.edu> wrote:
>
> Hi,
>
> I posted this a while back but didn't get any responses. I prefer using
> `srun` to invoke commands on our cluster because it is far more convenient
> than writing sbatch wrappers for single-process jobs (no multiple steps).
> The problem is that if I submit too many srun jobs, the head node starts
> running out of socket resources (or other resources?), I start getting
> timeouts, and some of the srun processes start using 100% CPU.
>
> I've tried redirecting all I/O to prevent use of sockets, etc., but I
> still see this problem. Can anyone suggest an alternative approach or
> fix? Something that doesn't require writing shell wrappers, but also
> doesn't keep a process running on the head node?
>
> Thanks,
> -Craig
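For anyone finding this thread later, a minimal sketch of the suggestion above: submitting a single command with `sbatch --wrap` so that sbatch generates the wrapper script itself and nothing stays resident on the head node. The job name, output pattern, and command below are hypothetical placeholders, not anything from a real cluster.

```shell
#!/usr/bin/env bash
# Sketch: submit one command via sbatch --wrap instead of srun.
# sbatch wraps $CMD in a tiny "sh" script and hands it to the slurm
# controller, then returns immediately -- no long-lived process (and no
# open srun sockets) remain on the head node.

CMD="hostname"   # hypothetical single-process command to run

# Build the submission command line. Job name and output file are
# assumptions for illustration; %j expands to the Slurm job ID.
SUBMIT=(sbatch --job-name=wraptest --output=wraptest-%j.out --wrap="$CMD")

# On a real cluster you would execute: "${SUBMIT[@]}"
# Dry-run here: just show the command that would be submitted.
printf '%s\n' "${SUBMIT[*]}"
```

Unlike `srun`, this returns as soon as the job is queued, so submitting thousands of jobs never accumulates processes or sockets on the submit host.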