Wow. I did not catch that version issue. I saw that there were issues with the 
newest Slurm and how CUDA 10+ installs, so I avoided that even though we have 
CUDA 8. I did have Slurm 19 downloaded, so I think I ran into an issue with 
that and went back to 18. Now that I have more experience setting it up, I'll 
wipe the 18 install and start over. Fingers crossed for success!

Thanks for your help!

--
Lisa Weihl 
Systems Administrator, Computer Science 
Bowling Green State University
Tel: (419) 372-0116   |    Fax: (419) 372-8061
lwe...@bgsu.edu
www.bgsu.edu

-----Original Message-----
From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of 
slurm-users-requ...@lists.schedmd.com
Sent: Thursday, April 16, 2020 6:39 PM
To: slurm-users@lists.schedmd.com
Subject: [EXTERNAL] slurm-users Digest, Vol 30, Issue 32

Today's Topics:

   1. CentOS 7 CUDA 8.0 can't find plugin cons_tres (Lisa Kay Weihl)
   2. Re: [EXTERNAL] CentOS 7 CUDA 8.0 can't find plugin cons_tres
      (Sean Crosby)


----------------------------------------------------------------------

Message: 1
Date: Thu, 16 Apr 2020 19:00:03 +0000
From: Lisa Kay Weihl <lwe...@bgsu.edu>
To: "slurm-users@lists.schedmd.com" <slurm-users@lists.schedmd.com>
Subject: [slurm-users] CentOS 7 CUDA 8.0 can't find plugin cons_tres
Message-ID: <dm5pr05mb29056be0862db04aa8960355b0...@dm5pr05mb2905.namprd05.prod.outlook.com>
Content-Type: text/plain; charset="utf-8"

I have a standalone server with 4 GeForce RTX 2080 Ti GPUs. The purpose is to 
serve as a compute server for data science jobs. My department chair wants a job 
scheduler on it. I have installed SLURM (18.08.9). That works just fine in a 
basic configuration. However, when I add GresTypes=gpu and then add 
Gres=gpu:4 to the end of the node description:


NodeName=cs-datasci CPUs=24 RealMemory=385405 Sockets=2 CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN Gres=gpu:4

and then try to restart slurmd, I get an error that it cannot find the plugin:

slurmd: error: Couldn't find the specified plugin name for select/cons_tres looking at all files

slurmd: error: cannot find select plugin for select/cons_tres

slurmd: fatal: Can't find plugin for select/cons_tres
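
One way to confirm which select plugin the configuration is asking for is to
read the SelectType line out of slurm.conf. The following is only a minimal
sketch: the function name configured_select_type is hypothetical, and the path
/etc/slurm/slurm.conf in the usage note is an assumption that depends on how
Slurm was installed.

```shell
#!/bin/sh
# Sketch: print the SelectType value(s) set in a slurm.conf-style file.
# configured_select_type is a hypothetical helper name, not a Slurm command.

configured_select_type() {
    # $1 = path to the configuration file.
    # Matches uncommented lines of the form SelectType=<value> and prints
    # <value>. Only the capitalizations "SelectType", "selecttype", and
    # similar S/T variants are handled; comments after the value are dropped.
    sed -n 's/^[[:space:]]*[Ss]elect[Tt]ype[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1"
}
```

For example, configured_select_type /etc/slurm/slurm.conf would print
select/cons_tres if that is what the file requests.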

The system was prebuilt by AdvancedHPC with CentOS 7 and CUDA 8.0

I usually keep notes when I'm installing things but in this case I wasn't 
jotting things down as I went. I think I started with the instructions on this 
page: https://slurm.schedmd.com/quickstart_admin.html
and went with the usual ./configure, make, make install.

I have a feeling maybe something did not work and I switched to the rpm 
packages based on some other web pages I saw because if I do a yum list 
installed | grep slurm I see a lot of packages. The problem is I was 
interrupted with other tasks and my memory was somewhat rusty when I came back 
to this.

When I went looking for this error I saw there were some issues with the newest 
SLURM and CUDA 10.2 but I didn't think that should be an issue because I was at 
CUDA 8.0.  Just in case I backed down to SLURM 18.

I'm willing to start all over if anyone thinks cleaning up and rebuilding will 
help that. I do see libraries in /etc/lib64/slurm but I also see 2 files in 
/usr/local/lib/slurm/src so I'm not sure if that's left over from trying to 
install from source.  All the daemons are in /usr/sbin and user commands in 
/usr/bin
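
A mixed source and RPM install like the one described can be spotted by
looking for more than one copy of slurmd. This is a rough sketch only: the
helper names find_slurmd and count_slurmd are hypothetical, and the directories
in the usage note are assumptions about where each install method puts its
binaries.

```shell
#!/bin/sh
# Sketch: list every slurmd binary found in a set of candidate directories,
# to spot a source install sitting next to a packaged install.
# find_slurmd and count_slurmd are hypothetical helper names.

find_slurmd() {
    # Print the full path of each executable file named slurmd
    # in the directories passed as arguments.
    for dir in "$@"; do
        if [ -f "$dir/slurmd" ] && [ -x "$dir/slurmd" ]; then
            echo "$dir/slurmd"
        fi
    done
}

count_slurmd() {
    # Count the copies found; more than one suggests a mixed install.
    find_slurmd "$@" | wc -l | tr -d ' '
}
```

For example, find_slurmd /usr/sbin /usr/local/sbin lists the copies in those
two locations, and rpm -qf on each path reports whether a package owns it.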

I'm a newbie at this and very frustrated. Can anyone help?

***************************************************************

Lisa Weihl Systems Administrator

Computer Science, Bowling Green State University
Tel: (419) 372-0116   |    Fax: (419) 372-8061
lwe...@bgsu.edu
http://www.bgsu.edu/

------------------------------

Message: 2
Date: Fri, 17 Apr 2020 08:38:27 +1000
From: Sean Crosby <scro...@unimelb.edu.au>
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] [EXTERNAL] CentOS 7 CUDA 8.0 can't find
        plugin cons_tres
Message-ID:
        <CAFstPEBO5+MthqskkP8dbo6Vvy8=f8yrczbxanwzmz1qdx3...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi Lisa,

cons_tres is part of Slurm 19.05 and higher. As you are using Slurm 18.08, it 
won't be there. The select plugin for 18.08 is cons_res.
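
A minimal sketch of how that could be checked on the machine, assuming the
select plugins are installed as files named select_<name>.so in the plugin
directory. The helper names below are hypothetical, and /usr/lib64/slurm in
the usage note is an assumed location for an RPM install on CentOS 7.

```shell
#!/bin/sh
# Sketch: choose a SelectType value based on which select plugin files exist.
# Assumes plugins are named select_<name>.so inside the plugin directory.
# All function names here are hypothetical helpers, not Slurm commands.

has_select_plugin() {
    # $1 = plugin directory, $2 = plugin name (for example cons_res)
    [ -f "$1/select_$2.so" ]
}

pick_select_type() {
    # Prefer cons_tres (Slurm 19.05 and later), else fall back to cons_res.
    if has_select_plugin "$1" cons_tres; then
        echo "select/cons_tres"
    elif has_select_plugin "$1" cons_res; then
        echo "select/cons_res"
    else
        echo "no cons_tres or cons_res plugin found in $1" >&2
        return 1
    fi
}

emit_gpu_fragment() {
    # Print slurm.conf lines for GPU scheduling using the detected plugin.
    select_type=$(pick_select_type "$1") || return 1
    echo "SelectType=$select_type"
    echo "GresTypes=gpu"
}
```

For example, emit_gpu_fragment /usr/lib64/slurm prints the two lines to review
before copying them into slurm.conf. Other settings may still be needed for a
working GPU setup.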

Is there a reason why you're using an old Slurm?

Sean
--
Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia




End of slurm-users Digest, Vol 30, Issue 32
*******************************************
