[slurm-users] Cannot run interactive jobs

2020-03-24 Thread Sajesh Singh
CentOS 7.7.1908, Slurm 18.08.8. When trying to run an interactive job I am getting the following error: srun: error: task 0 launch failed: Slurmd could not connect IO. Checking the log file on the compute node I see the following error: [2020-03-25T01:42:08.262] launch task 13.0 request from UID:1

Re: [slurm-users] Running an MPI job across two partitions

2020-03-24 Thread Chris Samuel
On 23/3/20 8:32 am, CB wrote: I've looked at the heterogeneous job support but it creates two separate jobs. Yes, but the web page does say: # By default, the applications launched by a single execution of # the srun command (even for different components of the # heterogeneous job) are combi

[slurm-users] Slurm - Maridb error

2020-03-24 Thread Dhumal, Dr. Nilesh
Hello, I am installing Slurm on CentOS. I installed all supporting libraries successfully. I also installed MariaDB before installing Slurm. I get the following error for sudo rpmbuild -ta slurm-20.02.0.tar.bz2: error: File not found: /root/rpmbuild/BUILDROOT/slurm-20.02.0-1.el7.x86_64/usr/lib64/

Re: [slurm-users] Slurm Perl API use and examples

2020-03-24 Thread Burian, John
Thanks, Yair and Thomas. I’ll check out wrappers. My interest in this case is primarily in job submission and control. I was hoping that by using an API into Slurm, I would avoid problems I’ve had in the past, with interpreting inconsistent exit codes of command line executables, and parsing out

Re: [slurm-users] Slurm Perl API use and examples

2020-03-24 Thread Marcus Wagner
In fact, we ARE using the Perl API, but there are some flaws. E.g. the array_task_str of the jobinfo structure. Slurm abbreviates long lists of array indices, like scontrol does: e.g. 1-3,5-8,45-... Yes, you can really find three dots there. In my opinion, this is ok for a general tool like s
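
To illustrate the abbreviation described above, here is a minimal Python sketch that expands an array index string and reports whether it was cut off. The function name parse_array_task_str is hypothetical, and the truncation form is assumed to match the example in the message (a trailing "..."); step and throttle suffixes that Slurm may also emit are not handled.

```python
def parse_array_task_str(s):
    """Expand a string such as '1-3,5-8,45-...' into a list of indices.

    Returns (indices, truncated). 'truncated' is True when the string
    ends in '...', meaning the list is incomplete and the remaining
    indices cannot be recovered from this string alone.
    Assumption: only plain 'N' and 'N-M' items appear.
    """
    indices = []
    truncated = False
    for part in s.split(","):
        part = part.strip()
        if part.endswith("..."):
            truncated = True
            # Keep the last known index, drop the dots and dangling dash.
            part = part[:-3].rstrip("-")
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            indices.extend(range(int(low), int(high) + 1))
        else:
            indices.append(int(part))
    return indices, truncated


if __name__ == "__main__":
    print(parse_array_task_str("1-3,5-8,45-..."))
```

A caller that needs the complete index list would have to treat a True flag as "data missing" rather than trust the expanded list.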

Re: [slurm-users] Running an MPI job across two partitions

2020-03-24 Thread CB
Hi Michael, Thanks for the comment. I was just checking if there is any other way to do the job before introducing another partition. So it appears to me that creating a new partition is the way to go. Thanks, Chansup On Mon, Mar 23, 2020 at 1:25 PM Renfro, Michael wrote: > Others might have

[slurm-users] Long delay between updates of sshare

2020-03-24 Thread Pascal Klink
Hi everyone, We recently started to use priority-based scheduling and after solving some final issues (see this post: https://groups.google.com/forum/m/#!topic/slurm-users/N8r8MoyjQAU), everything seems to be running quite smoothly now. However, we realized that the data shown by sshare, e.g

Re: [slurm-users] Slurm Perl API use and examples

2020-03-24 Thread Yair Yarom
I also haven't got along with the Perl API shipped with Slurm. I got it to work, but there were things missing. Currently I have some wrapper functions for most Slurm commands, and a general parsing function for Slurm's common outputs (of scontrol, sacctmgr, etc.). Not in CPAN, but you can see it
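
As a rough idea of what such a general parsing function might look like (this is not the poster's code), the Python sketch below turns one line of Key=Value output, as printed by scontrol in its one-line mode, into a dictionary. The function name parse_key_values and the sample line are made up for illustration, and the sketch assumes values contain no spaces, which is not true for every field.

```python
def parse_key_values(line):
    """Parse a whitespace-separated 'Key=Value' line into a dict.

    Assumption: no value contains whitespace. Tokens without '=' are
    skipped, so a value with embedded spaces would be silently cut.
    """
    record = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            record[key] = value
    return record


if __name__ == "__main__":
    # Hypothetical sample line, not captured from a real cluster.
    sample = "JobId=13 JobName=test JobState=RUNNING"
    print(parse_key_values(sample))
```

Splitting on the first "=" only keeps values that themselves contain "=" intact.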

Re: [slurm-users] Accounting Information from slurmdbd does not reach slurmctld

2020-03-24 Thread Pascal Klink
Hi Sean, Hi Marcus, Changing from localhost to the actual IP seems to have solved the problem. Is that because not only the slurmctld process on the control node but also the slurmd processes on the compute nodes need to have access to the accounting information? Because although slurmdbd and