[slurm-users] Is SWAP memory mandatory for SLURM
Dear All,

Good morning. I have a 4-node SLURM instance up and running. I would like to know whether disabling swap will affect SLURM performance. Is swap a mandatory requirement? Each node has plenty of RAM; if the physical RAM is large enough, is there any need for swap?

Thanks,
Joseph John

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
[slurm-users] Re: URL for how to do for SLURM accounting setup
Thanks a lot.

We were able to set up SLURM accounting, and I can now see the tables being populated:

MariaDB [slurm_acct_db]> show tables;
+-------------------------+
| Tables_in_slurm_acct_db |
+-------------------------+
| acct_coord_table        |
| acct_table              |
| clus_res_table          |
| cluster_table           |
| convert_version_table   |
| federation_table        |
| qos_table               |
| res_table               |
| table_defs_table        |
| tres_table              |
| txn_table               |
| user_table              |
+-------------------------+
12 rows in set (0.000 sec)

We will now try it out with some demo work and see how it behaves, and will report our findings back to the group.

Thanks a lot,
Joseph John

On Friday, 16 February, 2024 at 11:02:07 am GST, Ole Holm Nielsen via slurm-users wrote:

On 2/16/24 07:01, John Joseph via slurm-users wrote:
> We were able to set up a test SLURM-based system with 4 nodes on Ubuntu
> 22.04 LTS, and we were able to run COMSOL using the "comsol batch" command.
> Now we plan to add accounting:
>
> https://slurm.schedmd.com/accounting.html
>
> I would like to ask for guidance on any tutorial or how-to
> documentation on setting up accounting.

You might take a look at our Wiki page
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/ and the other pages in the Wiki.

IHTH,
Ole
[slurm-users] URL for how to do for SLURM accounting setup
Dear All,

Good morning. We were able to set up a test SLURM-based system with 4 nodes on Ubuntu 22.04 LTS, and we were able to run COMSOL using the "comsol batch" command.

Now we plan to add accounting:

https://slurm.schedmd.com/accounting.html

I would like to ask for guidance on any tutorial or how-to documentation on setting up accounting.

I appreciate your support.

Thanks,
Joseph John
Re: [slurm-users] How to check the bench mark capacity of the SLURM setup
Hi Ole,

Thanks for the mail, and sorry for not explaining properly what I was asking for. What I actually meant was: how can we check how well the HPC system I set up is working? For example, a program that can be run on its own on a single node, so that we can compare how the same code performs when run under Slurm.

I hope this explains my request more clearly, and sorry once again for not wording the original email clearly.

Thanks,
Joseph John

On Wednesday, 13 December, 2023 at 12:34:51 pm GST, Ole Holm Nielsen wrote:

On 12/13/23 07:13, John Joseph wrote:
> We have set up Slurm for a 4-node HPC system.
> We want to do a stress test. Guidance is requested on getting code that
> can test the functionality and efficiency of SLURM. If there is such a
> program, we would like to try it out.
> Guidance requested.

Then please define clearly your question. The Slurm resource manager is very efficient at handling hundreds or thousands of compute nodes with good functionality!

/Ole
[slurm-users] How to check the bench mark capacity of the SLURM setup
Dear All,

Good morning. We have set up Slurm for a 4-node HPC system. We want to do a stress test. Guidance is requested on getting code that can test the functionality and efficiency of SLURM. If there is such a program, we would like to try it out.

Guidance requested.

Thanks,
Joseph John
[slurm-users] Disabling SWAP space will it affect SLURM working
Dear All,

Good morning. We have installed a 4-node SLURM instance (256 GB RAM in each node), and it is working fine. Each node has 2 GB of swap space. To make full use of the system, we want to disable swap. I would like to know whether disabling the swap partition will affect SLURM functionality.

Advice requested.

Thanks,
Joseph John
Re: [slurm-users] SLURM new user query, does SLURM has GUI /Web based management version also
Thanks a lot. I will try it out and post my experience to the group.

On Tuesday, 28 November, 2023 at 02:13:23 pm GST, Josef Dvoracek wrote:

> can you please advise me on the monitoring tools

I'm _somehow_ satisfied with:

Prometheus Slurm exporter (https://github.com/vpenso/prometheus-slurm-exporter), being grabbed by Telegraf (https://www.influxdata.com/time-series-platform/telegraf), sending metrics to InfluxDB. Visualisation is done by Grafana.

HTH, and I'll be happy to hear about other ways, especially collectors that provide a consistent view of the state of Slurm and its partition usage.

The exporter mentioned above does not capture state transitions well, and the aggregated visualizations have glitches. But for a high-level, quantitative overview of the system it's enough.

josef

On 26. 11. 23 8:00, John Joseph wrote:

On Sunday, 19 November, 2023 at 02:35:10 pm GST, Ole Holm Nielsen wrote:

On 19-11-2023 09:11, Joseph John wrote:
> I am a new user, trying out SLURM.
>
> I would like to check whether SLURM also has a GUI/web-based management tool.

>Did you read the Quick Start Administrator Guide at
>https://slurm.schedmd.com/quickstart_admin.html ?
>I don't believe there are any Slurm management tools as a web GUI, and
>that would probably be a security nightmare anyway because privileged
>system access is required.

There are a number of monitoring tools for viewing the status of Slurm jobs.

Thanks. Can you please advise me on monitoring tools? I have been trying Zabbix, but with Zabbix I am only able to get CPU and RAM information for each node; I am not able to monitor SLURM activities. Please advise on monitoring tools for SLURM.

/Ole
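For a quick look at cluster state without a full exporter/Grafana stack, the idea of collecting Slurm state can be sketched in a few lines of Python. This is only a minimal sketch, not how the exporter mentioned above works: it assumes the Slurm client tools are installed on the host, and it relies on `sinfo -h -o "%T %D"` printing one "state node-count" pair per line. The function names are made up for illustration.

```python
import subprocess
from collections import Counter


def summarize_node_states(sinfo_text: str) -> Counter:
    """Sum node counts per state from `sinfo -h -o "%T %D"` output.

    sinfo prints one line per partition/state group, so a node that
    belongs to several partitions is counted once per partition.
    """
    totals = Counter()
    for line in sinfo_text.splitlines():
        parts = line.split()
        # Expect exactly "<state> <count>"; skip blank or unexpected lines.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        state, count = parts
        totals[state] += int(count)
    return totals


def read_sinfo() -> str:
    """Run sinfo (assumed to be on PATH) and return its raw text output."""
    result = subprocess.run(
        ["sinfo", "-h", "-o", "%T %D"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a host with the Slurm client tools, `summarize_node_states(read_sinfo())` returns a per-state node count that a monitoring system could poll; the parsing function can also be fed saved output for testing.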
Re: [slurm-users] SLURM new user query, does SLURM has GUI /Web based management version also
On Sunday, 19 November, 2023 at 02:35:10 pm GST, Ole Holm Nielsen wrote:

On 19-11-2023 09:11, Joseph John wrote:
> I am a new user, trying out SLURM.
>
> I would like to check whether SLURM also has a GUI/web-based management tool.

>Did you read the Quick Start Administrator Guide at
>https://slurm.schedmd.com/quickstart_admin.html ?
>I don't believe there are any Slurm management tools as a web GUI, and
>that would probably be a security nightmare anyway because privileged
>system access is required.

There are a number of monitoring tools for viewing the status of Slurm jobs.

Thanks. Can you please advise me on monitoring tools? I have been trying Zabbix, but with Zabbix I am only able to get CPU and RAM information for each node; I am not able to monitor SLURM activities. Please advise on monitoring tools for SLURM.

/Ole
[slurm-users] Gres value shows null in "scontrol show node node-1" , even though "nvidia-smi" shows GPU values
Dear All,

Good morning. I have set up a 4-node SLURM system on Ubuntu 22.04, and SLURM is working. Each node has GPU cards, and I can see the GPU information using "nvidia-smi". However, when I run "scontrol show node node-1", I do not see any GPU entry: "Gres" shows as (null), and the "CfgTRES" entry does not list any GPU either.

I am pasting the results of "scontrol show node node-1", "slurmd -C" and "nvidia-smi" here for reference.

"scontrol show node node-1"

NodeName=node-1 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=72 CPULoad=0.03
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node-1 NodeHostName=node-1 Version=21.08.5
   OS=Linux 6.2.0-37-generic #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
   RealMemory=773685 AllocMem=0 FreeMem=770972 Sockets=72 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=debug
   BootTime=2023-11-23T09:06:28 SlurmdStartTime=2023-11-23T09:07:39
   LastBusyTime=2023-11-23T09:07:40
   CfgTRES=cpu=72,mem=773685M,billing=72
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

root@node-1:~# slurmd -C
NodeName=node-1 CPUs=72 Boards=1 SocketsPerBoard=2 CoresPerSocket=18 ThreadsPerCore=2 RealMemory=773685
UpTime=0-23:48:41

root@node-1:~# nvidia-smi
Fri Nov 24 08:55:50 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------|
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-16GB           Off | :06:00.0         Off |                    0 |
| N/A   26C    P0              23W / 250W |      4MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE-16GB           Off | :86:00.0         Off |                    0 |
| N/A   25C    P0              24W / 250W |      4MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2010      G   /usr/lib/xorg/Xorg                            4MiB |
|    1   N/A  N/A      2010      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+

I request guidance on which configuration parameters I have missed, such that the GPUs do not show up in "scontrol show node node-1".

Thanks,
Joseph John
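The check described in this message (looking for a GPU entry under "Gres") can be scripted so it is easy to repeat across all four nodes. The following is a minimal Python sketch under stated assumptions: it only parses the KEY=VALUE text that "scontrol show node" prints, and the function names are hypothetical. As a general observation about Slurm rather than anything confirmed in this thread, GPUs normally appear under Gres only once they are declared in the configuration (GresTypes=gpu and a Gres=gpu:N entry on the node definition in slurm.conf, plus a matching gres.conf); the driver being visible to nvidia-smi is not enough on its own.

```python
def parse_scontrol_node(text: str) -> dict:
    """Parse the KEY=VALUE tokens printed by `scontrol show node <name>`.

    Values containing spaces (such as the OS= line) are truncated at the
    first space; that is acceptable here because only Gres is inspected.
    """
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        # Keep the first occurrence of each key; skip tokens without "=".
        if sep and key not in fields:
            fields[key] = value
    return fields


def has_gpu_gres(fields: dict) -> bool:
    """Return True if the node's Gres field mentions a gpu resource."""
    gres = fields.get("Gres", "(null)")
    return "gpu" in gres.lower()
```

Feeding the output shown above into `parse_scontrol_node` gives `Gres` equal to `(null)`, so `has_gpu_gres` returns False for node-1 as currently configured.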
[slurm-users] SLURM , maximum scalable instance is which one
Dear All,

I would like to know what the largest scaled-up SLURM installation so far is. On which website can I find information about the largest SLURM installations and other well-known setups that use SLURM?

Thanks,
Joseph John
Re: [slurm-users] Guidance on which HPC to try our "OpenHPC or TrintyX " for novice
Thanks a lot. I will stick with OpenHPC.

On Tuesday, 3 October, 2023 at 05:12:02 pm GST, Renfro, Michael wrote:

I’d probably default to OpenHPC just for the community around it, but I’ll also note that TrinityX might not have had any commits in their GitHub for an 18-month period (unless I’m reading something wrong).

On Oct 3, 2023, at 5:51 AM, John Joseph wrote:

Dear All,

Good afternoon. I would like to install, study and administer HPC, and as a first step I am planning to install one of the HPC stacks. From the docs I can see that OpenHPC and TrinityX both have Slurm built in. I would like advice on which one would be better for me (I have knowledge of the Linux command line and administration). Which will be easier for me to install, OpenHPC or TrinityX?

Your guidance would help me choose my path and is much appreciated.

Thanks,
Joseph John
[slurm-users] Guidance on which HPC to try our "OpenHPC or TrintyX " for novice
Dear All,

Good afternoon. I would like to install, study and administer HPC, and as a first step I am planning to install one of the HPC stacks. From the docs I can see that OpenHPC and TrinityX both have Slurm built in. I would like advice on which one would be better for me (I have knowledge of the Linux command line and administration). Which will be easier for me to install, OpenHPC or TrinityX?

Your guidance would help me choose my path and is much appreciated.

Thanks,
Joseph John
[slurm-users] New member , introduction
Dear All,

Thanks for the mailing list. I have just joined the list and would like to introduce myself. My name is Joseph John, and I work as a system administrator. I have been working on Linux, but I am a novice with HPC and Slurm and am trying to learn.

Thanks,
Joseph John
[slurm-users] Hi, from new user
Hi All,

Good afternoon. I am Joseph John. I have just started working on OpenHPC and, through it, Slurm, and I am just a novice. I have just joined and wanted to introduce myself and say *Hi* to all the members.

Thanks,
Joseph John