Is there a way to throttle job starts on Grid Engine (we are using Son
of Grid Engine)?
Specifically, I would like to limit the number of tasks started during each
scheduling cycle and spread the startup of large array jobs over a
longer (but still short) period of time. I'm aware that this would be a
tradeoff against task throughput for very short tasks.
We appear to be having some filesystem (GPFS) problems when 2000+
tasks on 350+ nodes all start creating Grid Engine log files in the
same directory at the same time. These tasks are often for a single
user hitting an idle system so I can't use maxujobs.
Ideally we would fix the filesystem and/or network communications; in
the meantime I'm looking for a workaround.
These jobs tend to have the same runtime so I'm seeing periodic floods
of simultaneous file creation. I can get the user to add some random
sleep time in the jobs to spread later jobs out, but the idle->full
spike will still exist.
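
For reference, the per-task sleep I have in mind is roughly the sketch
below. It is Python and assumes the task script can read the SGE_TASK_ID
environment variable that Grid Engine sets for array tasks; the function
names and the 60-second window are hypothetical choices for illustration,
not anything Grid Engine provides.

```python
import os
import random
import time


def startup_delay(task_id: int, window_seconds: float = 60.0) -> float:
    """Return a delay between 0 and window_seconds for this task.

    Seeding the generator with the task id makes the delay repeatable
    for a given task while still differing from task to task.
    """
    rng = random.Random(task_id)
    return rng.uniform(0.0, window_seconds)


def sleep_before_work(window_seconds: float = 60.0) -> float:
    """Sleep for this task's delay and return the delay used.

    SGE_TASK_ID is not a number for non-array jobs, so anything that
    is not a plain integer falls back to task id 1 (an assumption made
    here for simplicity).
    """
    raw = os.environ.get("SGE_TASK_ID", "1")
    task_id = int(raw) if raw.isdigit() else 1
    delay = startup_delay(task_id, window_seconds)
    time.sleep(delay)
    return delay
```

Calling sleep_before_work() at the top of the task script desynchronizes
the task end times, so later waves of starts are spread out. It does
nothing for the first wave, which is why I am asking about a
scheduler-side throttle.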
Thanks,
Stuart
--
I've never been lost; I was once bewildered for three days, but never lost!
-- Daniel Boone