[slurm-dev] Re: Error: Unable to contact slurm controller
No, slurmctld isn't running now. It was when I started, but I suspect I made at least one modification too many to slurm.conf. When I try to start slurmctld, I get these in slurmctld.log:

[2014-08-21T09:30:09.626] debug2: No ApbasilTimeout configured (65534)
[2014-08-21T09:30:09.630] debug2: No ApbasilTimeout configured (65534)
[2014-08-21T09:30:09.673] fatal: system has no usable batch compute nodes

I've just made a modification to slurm.conf that makes sure there's a default partition. I'd had named partitions in previously, but got some errors and warnings when trying to get the partition naming right in #SBATCH, so I'd gone back to the default config. This appears to have started with a reboot several days ago. I'm now making sure it's not something deeper causing a Gemini network problem.

Thanks, Trey!

gerry

On Wed, Aug 20, 2014 at 10:11 PM, Trey Dockendorf treyd...@tamu.edu wrote:

Is slurmctld running? My guess is that you need at least one partition defined in addition to the DEFAULT partition. Try creating a partition with any name, which will inherit everything from DEFAULT.
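In slurm.conf terms, Trey's suggestion would look roughly like the sketch below: the PartitionName=DEFAULT line only supplies default values for the partition lines that follow it, so at least one concretely named partition is still needed. The partition name "batch" and the short node list here are hypothetical placeholders:

```
# PartitionName=DEFAULT is a template, not a usable partition by itself;
# subsequent PartitionName lines inherit its settings.
PartitionName=DEFAULT Shared=EXCLUSIVE State=UP DefaultTime=60 MaxNodes=12
# "batch" is a hypothetical name; Default=YES makes it the default partition.
PartitionName=batch Default=YES Nodes=nid00[002-007]
```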
- Trey

=
Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: treyd...@tamu.edu
Jabber: treyd...@tamu.edu

----- Original Message -----
From: Gerry Creager - NOAA Affiliate gerry.crea...@noaa.gov
To: slurm-dev slurm-dev@schedmd.com
Sent: Wednesday, August 20, 2014 4:40:40 PM
Subject: [slurm-dev] Re: Error: Unable to contact slurm controller

Hi, Trey

That's what I'm intuiting as well, but:

gerry@loki:~/software/wrf/NME/DART_Lanai/models/wrf/work$ egrep '^(PartitionName|NodeName)' /opt/slurm/default/etc/slurm.conf
NodeName=nid00[002-007,024-029,040-043,046-049,052-055,064-071,088-099,100-103,120-127,136-151,160-167,184-199,216-223,232-247,256-263,280-287] Sockets=4 CoresPerSocket=4 ThreadsPerCore=1 RealMemory=65536
PartitionName=DEFAULT Shared=EXCLUSIVE State=UP DefaultTime=60 Nodes=nid00[002-007,024-029,040-043,046-049,052-055,064-071,088-099,100-103,120-127,136-151,160-167,184-199,216-223,232-247,256-263,280-287] MaxNodes=12

looks pretty normal.

gerry

On Wed, Aug 20, 2014 at 4:25 PM, Trey Dockendorf treyd...@tamu.edu wrote:

What's your slurm.conf look like? Do you have valid Nodes and Partitions defined? For example:

egrep '^(PartitionName|NodeName)' /etc/slurm/slurm.conf

Sounds like an invalid slurm.conf is preventing slurmctld from starting.

- Trey

----- Original Message -----
From: Gerry Creager - NOAA Affiliate gerry.crea...@noaa.gov
To: slurm-dev slurm-dev@schedmd.com
Sent: Wednesday, August 20, 2014 4:09:25 PM
Subject: [slurm-dev] Re: Error: Unable to contact slurm controller

Moe,

Thanks. I've tried. I'm noting a pair of errors in the slurmctld.log file:

[2014-08-20T15:58:58.458] debug: No DownNodes
[2014-08-20T15:58:58.458] fatal: No PartitionName information available!
So far, Google hasn't helped me much in this regard.

gerry

On Wed, Aug 20, 2014 at 11:39 AM, je...@schedmd.com wrote:

Try this: http://slurm.schedmd.com/troubleshoot.html

Quoting Gerry Creager - NOAA Affiliate gerry.crea...@noaa.gov:

I'm trying to learn how to use and administer slurm on a new Cray system, and started seeing this yesterday:

squeue
slurm_load_jobs error: Unable to contact slurm controller (connect failure)

I'm at a loss as to how to proceed.

Thanks, Gerry

--
Gerry Creager
NSSL/CIMMS
405.325.6371
++
"Big whorls have little whorls,
That feed on their velocity;
And little whorls have lesser whorls,
And so on to viscosity."
Lewis Fry Richardson (1881-1953)

--
Morris "Moe" Jette
CTO, SchedMD LLC

Slurm User Group Meeting
September 23-24, Lugano, Switzerland
Find out more: http://slurm.schedmd.com/slurm_ug_agenda.html
[slurm-dev] Storing the job submission script in the accounting database
Is it possible to store the job submission script and the environment variables passed to it in the accounting database, or to log this data automatically to /path/to/spylog/SLURM_JOB_ID.log files in SLURM? I'm interested in analysing what the cluster is used for over time, and this would be a good start in working out what is really being submitted.

Thanks
Antony
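As far as I know there is no built-in option for this in Slurm of this vintage, but a site Prolog script (configured via Prolog= in slurm.conf) can write a per-job log. A minimal sketch follows; the log directory and script name are hypothetical, and note that the prolog only sees the environment slurmd gives it, not necessarily the user's full submission environment:

```shell
#!/bin/sh
# Hypothetical prolog sketch: record each job's description and the
# environment visible to the prolog in /path/to/spylog/<jobid>.log.
LOGDIR=/path/to/spylog

# Build the per-job log file name from a job id.
log_path() {
    printf '%s/%s.log' "$LOGDIR" "$1"
}

# SLURM_JOB_ID is set by slurmd when it runs the prolog.
if [ -n "$SLURM_JOB_ID" ] && command -v scontrol >/dev/null 2>&1; then
    LOG=$(log_path "$SLURM_JOB_ID")
    # "scontrol -d show job" prints the detailed job description.
    scontrol -d show job "$SLURM_JOB_ID" > "$LOG" 2>&1
    env >> "$LOG"
fi
```

This captures the job description rather than the verbatim batch script; whether the script itself is recoverable depends on the Slurm version.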
[slurm-dev] Intel MPI Performance inconsistency (and workaround)
Slurmites,

We recently noticed sporadic performance inconsistencies on one of our clusters. We discovered that if we restarted slurmd in an interactive shell, we observed correct performance. To track down the cause, we ran:

(1) single-node linpack
(2) dual-node mp_linpack
(3) mpptest

On affected nodes, Linpack performance was normal and mp_linpack was about 85% of what we expected. mpptest, which measures MPI performance, was our smoking gun: latencies were 10x higher than expected (~20us instead of 2us). We were able to consistently reproduce the issue with freshly imaged or freshly rebooted nodes. Upon restarting slurmd on each execution node manually, MPI latencies immediately improved to the expected 2us for our set of tested nodes.

The cluster is under fairly heavy use right now, so we don't have the luxury of diagnosing this thoroughly and determining the cause. We wanted to share this experience in case it helps other users, or in case any slurm developers would like us to file a bug report and gather further information.

Best,
Jesse Stroik
University of Wisconsin
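For anyone wanting to try the same workaround, the slurmd restart can be scripted across nodes. This sketch assumes pdsh is available; the node list is a placeholder, and the init-script name varies by installation:

```
# Restart slurmd on a set of compute nodes (node list hypothetical).
pdsh -w 'node[001-064]' '/etc/init.d/slurm restart'
```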
[slurm-dev] Re: Account / partition association on heterogeneous clusters
We ended up working around our needs by writing a program that provides users with the appropriate settings. It may be something to consider for future releases of slurm: either automatically use any available and valid account given a user-partition request, or allow administrators to set a default account for each user-partition combination.

Best,
Jesse Stroik
University of Wisconsin

On 8/13/2014 12:43 PM, Jesse Stroik wrote:

Our cluster has two primary groups of users. The user groups each have a different account from which we designate shares and for which we provide accounting information. We are in the process of adding nodes for which CPU time has a very different practical value to the end users: if users used these nodes, their shares would provide them less value. To mitigate this, we've created a new partition ('amd') and we've set up an additional account for each group:

group1
group2
group1-amd
group2-amd

Each user has multiple associations. For example:

group1      ourcluster         9000
group2      ourcluster         1000
group1      ourcluster  alice  100  regular-partition
group1-amd  ourcluster  alice  100  amd-partition
group2      ourcluster  bob    100  regular-partition
group2-amd  ourcluster  bob    100  amd-partition

One issue with this is that users who can only use one partition get a smaller share of the total system. Another is that we cannot set a default account per user-partition combination. For example, if Alice wants to submit to --partition amd-partition, she must also specify -A group1-amd or she will get an invalid user/partition error.

Is there a better way to do this? We don't see a way to allow SLURM to search the association tables for a valid account for the user/partition combination.

Best,
Jesse Stroik
University of Wisconsin
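For reference, associations like those in the example above are created with sacctmgr; a sketch using the names from the example (exact option spelling may vary by Slurm version):

```
# Create the per-partition account and tie alice to it with a
# partition-limited association (names taken from the example above).
sacctmgr add account group1-amd Cluster=ourcluster
sacctmgr add user alice Account=group1-amd Partition=amd-partition Fairshare=100
```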
[slurm-dev] Re: Intel MPI Performance inconsistency (and workaround)
Hi Jesse,

Just a shot in the dark, but do you use task affinity or CPU binding?

Cheers,
--
Kilian
[slurm-dev] Re: Intel MPI Performance inconsistency (and workaround)
Yes, but we aren't specifying it for all of these jobs. In the config we have:

---
TaskPlugin=task/affinity
TaskPluginParam=Sched
SelectTypeParameters=CR_CPU_Memory,CR_CORE_DEFAULT_DIST_BLOCK
---

And we typically suggest --cpu_bind=core --distribution=block:block for srun in our documentation. However, we did not specify --cpu_bind or --distribution as arguments to the job for mpptest or for mp_linpack. And we noticed that despite the CR_CORE_DEFAULT_DIST_BLOCK setting, we still needed to specify --distribution=block:block for our binding to be correct for OpenMP+MPI hybrid jobs.

Best,
Jesse

On 8/21/2014 2:14 PM, Kilian Cavalotti wrote:

Hi Jesse,

Just a shot in the dark, but do you use task affinity or CPU binding?

Cheers,
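For completeness, the explicit binding we suggest in our documentation is passed to srun like this (the application name is a hypothetical placeholder):

```
# Explicit core binding with block distribution, as suggested in our docs;
# ./my_mpi_app stands in for the real application.
srun --cpu_bind=core --distribution=block:block ./my_mpi_app
```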
[slurm-dev] Re: Intel MPI Performance inconsistency (and workaround)
On 22/08/14 04:43, Jesse Stroik wrote:

We recently noticed sporadic performance inconsistencies on one of our clusters.

What distro is this? Are you using cgroups?

cheers,
Chris
--
Christopher Samuel    Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au  Phone: +61 (0)3 903 55545
http://www.vlsci.org.au/      http://twitter.com/vlsci