[slurm-users] (no subject)

2023-11-05 Thread mohammed shambakey
I'm having a hard time figuring out the distribution of jobs between 2 clusters in a Slurm multi-cluster environment. The documentation says that each job is submitted to the cluster that provides the earliest start time, and once the task is submitted to a cluster, it can't be re-distributed to
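For reference, a minimal sketch of the multi-cluster submission being discussed; "cluster1" and "cluster2" are placeholder names, and sbatch -M (--clusters) is what hands the job to whichever listed cluster offers the earliest start time:

    # Submit to whichever of the named clusters can start the job first.
    sbatch -M cluster1,cluster2 --wrap="hostname"

    # See where the job ended up; -M queries both clusters.
    squeue -M cluster1,cluster2 -u $USER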

Re: [slurm-users] (no subject)

2022-07-28 Thread GRANGER Nicolas
I have no experience with this, but based on my understanding of the doc, the shutdown command should be something like "ssh ${node} systemctl poweroff", and the resume "ipmitool -I lan -H ${node}-bmc -U -f password_file.txt chassis power on". If you use libvirt for your virtual cluster, you
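A hedged sketch of what such SuspendProgram/ResumeProgram scripts could look like; the BMC naming scheme, IPMI user, and password file are assumptions, not tested values:

    #!/bin/bash
    # suspend.sh -- slurmctld passes a hostlist; power each node off over SSH.
    for node in $(scontrol show hostnames "$1"); do
        ssh "$node" systemctl poweroff
    done

    #!/bin/bash
    # resume.sh -- power each node back on through its BMC; "admin" and
    # password_file.txt are placeholder credentials.
    for node in $(scontrol show hostnames "$1"); do
        ipmitool -I lanplus -H "${node}-bmc" -U admin -f password_file.txt chassis power on
    done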

Re: [slurm-users] (no subject)

2022-07-28 Thread Benson Muite
On 7/28/22 18:49, Djamil Lakhdar-Hamina wrote: I am helping set up a 16-node cluster computing system. I am not a system admin, but I work for a small firm and unfortunately have to pick up needed skills fast in things I have little experience in. I am running Rocky Linux 8 on Intel Xeon

[slurm-users] (no subject)

2022-07-28 Thread Djamil Lakhdar-Hamina
I am helping set up a 16-node cluster computing system. I am not a system admin, but I work for a small firm and unfortunately have to pick up needed skills fast in things I have little experience in. I am running Rocky Linux 8 on Intel Xeon Knights Landing nodes donated by the TAAC center. We are

[slurm-users] (no subject)

2022-06-09 Thread Manchang Yip
Hello Slurm Users, I am experimenting with the new --prefer soft constraint option in 22.05. The option behaves as described, but is somewhat inefficient if many jobs with different --prefer options are submitted. Here is the scenario: 1. submit array of 100 tasks preferring feature A, each
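A hedged sketch of the submission pattern described, with "A" as a placeholder feature name; --prefer is the 22.05 soft constraint, --constraint the hard one:

    # Array of 100 tasks that prefer nodes with feature A but will run
    # elsewhere if no A node is free.
    sbatch --array=1-100 --prefer=A job.sh

    # Hard-constraint equivalent: only ever runs on nodes with feature A.
    sbatch --array=1-100 --constraint=A job.sh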

[slurm-users] (no subject)

2021-09-01 Thread pravin
Dear all, I have copied the user file from Windows and did not convert it using dos2unix, and I am using a shell script to add the users and accounts to Slurm, but I am facing a problem; the output of the sshare command is below: [root@master01]# sshare -a Account User
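A minimal sketch of cleaning the Windows line endings and loading the users with sacctmgr; the file name and its assumed "account user" per-line layout are placeholders:

    # Strip the DOS carriage returns left by editing the file on Windows.
    dos2unix users.txt

    # users.txt is assumed to hold one "account user" pair per line.
    while read -r account user; do
        sacctmgr -i add account "$account"
        sacctmgr -i add user "$user" account="$account"
    done < users.txt

    # Verify the resulting associations.
    sshare -a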

Re: [slurm-users] (no subject)

2021-07-30 Thread Chris Samuel
On Friday, 30 July 2021 11:21:19 AM PDT Soichi Hayashi wrote: > I am running slurm-wlm 17.11.2 You are on a truly ancient version of Slurm there, I'm afraid (there have been 4 major releases & over 13,000 commits since that was tagged in January 2018); I would strongly recommend you try and get

[slurm-users] (no subject)

2021-07-30 Thread Soichi Hayashi
Hello. I need help with troubleshooting our Slurm cluster. I am running slurm-wlm 17.11.2 on Ubuntu 20 on a public cloud infrastructure (Jetstream) using an elastic computing mechanism ( https://slurm.schedmd.com/elastic_computing.html). Our cluster works for the most part, but for some reason,
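For context, a hedged excerpt of the power-saving settings such an elastic setup relies on; the script paths and timings below are placeholders, not values from the poster's config:

    # slurm.conf excerpt (values are illustrative only):
    #   SuspendProgram=/usr/local/sbin/suspend.sh
    #   ResumeProgram=/usr/local/sbin/resume.sh
    #   SuspendTime=600
    #   ResumeTimeout=300
    # After editing slurm.conf on all nodes, reread it:
    scontrol reconfigure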

Re: [slurm-users] (no subject)

2019-12-08 Thread Ole Holm Nielsen
Forgot the link to the Wiki: https://wiki.fysik.dtu.dk/niflheim/SLURM

Re: [slurm-users] (no subject)

2019-12-08 Thread Ole Holm Nielsen
Hi Dean, You may want to look at the links in my Slurm Wiki page. Both the official Slurm documentation and other resources are listed. I think most of your requirements and questions are described in these pages. My Wiki gives detailed deployment information for a CentOS 7 cluster, but

[slurm-users] (no subject)

2019-09-04 Thread Tina Fora
Hi, I'm adding a bunch of memory to two of our nodes that are part of a blade chassis. So two compute nodes will be upgraded to 1TB RAM and the rest have 192GB. All of the nodes belong to several partitions and can be used by our paid members given the partition below. I'm looking for ways to figure out
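One common pattern for a mixed-memory chassis like this, sketched here with placeholder node names, memory figures, and feature name, is to declare the larger RealMemory and tag the big nodes so jobs can target them explicitly:

    # slurm.conf excerpt: two 1 TB nodes tagged "bigmem", the rest at 192 GB
    # (node names and MB figures are placeholders):
    #   NodeName=node[01-02] RealMemory=1031000 Feature=bigmem
    #   NodeName=node[03-16] RealMemory=191000
    # After editing slurm.conf, reread it:
    scontrol reconfigure

    # A job that actually needs the big-memory nodes then asks for them:
    sbatch --constraint=bigmem --mem=800G job.sh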

[slurm-users] (no subject)

2018-04-11 Thread Mike Renfro
Hey, folks. I have a relatively simple queueing setup on Slurm 17.02 with a 1000 CPU-day AssocGrpCPURunMinutesLimit set. When the cluster is less busy than typical, I may still have users run up against the 1000 CPU-day limit, even though some nodes are idle. What’s the easiest way to force a job
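For reference, the AssocGrpCPURunMinutesLimit reason corresponds to a GrpTRESRunMins setting on the association; a hedged sketch of inspecting and adjusting it, with the account name and new value as placeholders:

    # Show the current running-minutes limits on the association.
    sacctmgr show assoc account=myaccount format=Account,User,GrpTRESRunMins

    # 1000 CPU-days = 1,440,000 CPU-minutes; raise (or clear) it as needed.
    sacctmgr modify account myaccount set GrpTRESRunMins=cpu=1440000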