Hi.  I have a mid-size Linux cluster with no shared filesystem.  I'm
trying to submit jobs from arbitrary nodes, or even from machines
that aren't part of the cluster.  I can do this successfully if the jobs
are simple 'srun hostname' type jobs (no batch command file).
 
However, if I submit a job with a batch command file, then it
seems that the command file is not automatically copied to
the machine where it will run, and this generates an error.
 
What is the common way of handling this?  I could potentially copy
the jobs to every node in the system, but this doesn't scale well once
I have hundreds of nodes.  What's the best way to handle this?
