[slurm-users] Using slurm_submit_batch_job API

2020-01-30 Thread Cao, Lei
Can I use the slurm_submit_batch_job API to submit a batch job by giving only a complete job script to the job_desc_msg_t data structure? For example (pseudo code): job_desc_msg_t my_job; submit_response_msg_t *resp = NULL; char sbatch_script[4096] = "#SBATCH x\n mpirun

[slurm-users] Burst buffer plugin roadmap

2019-11-01 Thread Cao, Lei
Hi, We have been using the burst buffer plugin to build our own staging layer at LANL, and we are wondering whether any big changes to the burst buffer plugin are planned. Thanks, Lei

[slurm-users] MPI job fails with more than 1 node: "Failed to send temp kvs to compute nodes"

2019-07-15 Thread Cao, Lei
Hi, I am running slurm version 19.05.0 and openmpi version 3.1.4. Openmpi is configured with pmi2 from slurm. Whenever I try to run an MPI job on more than 1 node, I get this error message: srun: error: mpi/pmi2: failed to send temp kvs to compute nodes srun: Job step aborted:

Re: [slurm-users] Resource sharing between different clusters

2018-10-19 Thread Cao, Lei
of Benjamin Redling Sent: Friday, October 19, 2018 2:41:51 AM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] Resource sharing between different clusters On 18/10/2018 18:16, Cao, Lei wrote: > I am pretty new to slurm so please bear with me. I have the following > scenario

[slurm-users] Resource sharing between different clusters

2018-10-18 Thread Cao, Lei
Hi, I am pretty new to slurm so please bear with me. I have the following scenario and I wonder if slurm currently supports this in some way. Let's say I have 3 clusters. Cluster1 and cluster2 run their own slurmctld and slurmds (this is a hard requirement), but both of them need to