Re: [slurm-users] Slurm and available libraries
So EasyBuild + Lmod seems to be the best solution. I'll try it. :)
Thank you all!

betta

2018-01-17 17:53 GMT+01:00 Christopher Samuel:

> On 18/01/18 03:50, Patrick Goetz wrote:
>
>> Can anyone shed some light on the situation? I'm very surprised that
>> a module script isn't just an explicit command that comes with the
>> lmod package, and am curious as to why this isn't completely
>> standard.
>
> The module command needs to be able to manipulate the environment in
> the current shell, so it can't just be run as an external command;
> its output has to be evaluated in the current shell.
>
> It looks like the second machine you mention is using Environment
> Modules, not Lmod.
>
> All the best,
> Chris
> --
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
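Chris's point about evaluating the output in the current shell can be illustrated with a small sketch. It does not use Lmod or Environment Modules at all: `emit_env_changes`, `DEMO_LIB_HOME` and the path `/shared/apps/demo` are hypothetical names invented for the illustration, standing in for whatever a real module tool would print.

```shell
#!/bin/bash
# Hypothetical stand-in for a module tool's backend: it cannot modify the
# caller's environment, so it only PRINTS shell code describing the changes.
emit_env_changes() {
    echo "export DEMO_LIB_HOME=/shared/apps/demo"
}

# 1) Run as an external command (a child bash process): the variable is set
#    in the child only and is lost when the child exits.
unset DEMO_LIB_HOME
bash -c 'export DEMO_LIB_HOME=/shared/apps/demo'
echo "after child process: ${DEMO_LIB_HOME:-<unset>}"

# 2) Evaluate the printed code in the current shell: the change persists.
eval "$(emit_env_changes)"
echo "after eval:          ${DEMO_LIB_HOME:-<unset>}"
```

The first `echo` reports the variable as unset and the second shows the path. This is why `module` is normally provided as a shell function or alias wrapping an `eval`, rather than as a plain executable.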
Re: [slurm-users] Slurm and available libraries
On 18/01/18 02:53, Loris Bennett wrote:

> This is all very OT, so it might be better to discuss it on, say, the
> OpenHPC mailing list, since as far as I can tell Spack, EasyBuild and
> Lmod (but not old or new 'environment-modules') are part of OpenHPC.

Another place might be the Beowulf list, all about Linux HPC (started by
Don Becker many moons ago), now maintained by yours truly.

http://www.beowulf.org/

Happy to add people to the list if they wish, just email me directly.

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [slurm-users] Slurm and available libraries
Hi Ole,

Ole Holm Nielsen writes:

> John: I would refrain from installing the old default package
> "environment-modules" from the Linux distribution, since it doesn't
> seem to be maintained any more.

Is this still true? Here

  http://modules.sourceforge.net/

there is a version 4.1.0 which is two days old. Does anyone have any
experience of this and how it compares to the old version and/or Lmod?

> Lmod, on the other hand, is actively maintained and solves some
> problems with the old "environment-modules" software.
>
> There's an excellent review paper on different module tools: "Modern
> Scientific Software Management Using EasyBuild and Lmod",
> http://dl.acm.org/citation.cfm?id=2691141

Thanks for the link. I would also be interested in how EasyBuild and
Spack compare in practice.

This is all very OT, so it might be better to discuss it on, say, the
OpenHPC mailing list, since as far as I can tell Spack, EasyBuild and
Lmod (but not old or new 'environment-modules') are part of OpenHPC.

Cheers,

Loris

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin    Email loris.benn...@fu-berlin.de
Re: [slurm-users] Slurm and available libraries
Hi Bill!

Always glad to contribute to the Lmod cause! ;)

Back to the discussion: I simply gave my contribution based on how we
set up our system. In no way did I intend to say that it is the only way
to deploy software. Yours is definitely a valid alternative, although it
requires deeper experience in software packaging and deployment.

To solve the problem of users overloading the login nodes we are
experimenting with cgroups, but here we are going a little too far off
topic.

PS: Now that I am in San Antonio I have no more excuses not to come and
visit you guys at TACC.

--
Davide Vanzo, PhD
Application Developer
Adjunct Assistant Professor of Chemical and Biomolecular Engineering
Advanced Computing Center for Research and Education (ACCRE)
Vanderbilt University - Hill Center 201
(615)-875-9137
www.accre.vanderbilt.edu

On 2018-01-17 08:01:10-06:00 slurm-users wrote:

> I’d go slightly further, though I do appreciate the Lmod shout-out!
> In some cases, you may not even want the software on the frontend
> nodes (hear me out before I retract it). If it’s a library that
> requires linking against before it can be used, then you probably
> have to have it, unless you require users to submit interactive jobs
> to some dedicated build nodes to do their compilation. You’ll find
> that when users have all their software needs in one place on the
> frontend nodes, they sometimes try to run it there, taking away
> resources from others. Now, a quick test run to make sure that their
> build is correct is probably no big deal, but some users will run
> their full-on science experiments (or pre- and post-processing steps)
> on the login nodes! We like to encourage those folks to submit jobs
> to the compute nodes. You could cripple or not install some libraries
> on the login nodes to prevent this (though users probably wouldn’t
> like it), but we just watch those systems like a hawk, given that we
> do want users to be able to build their programs on the login nodes.
>
> We don’t use EB, but we do collaborate with them to make it and Lmod
> compatible. We use something like OpenHPC to push RPMs we build
> in-house to manage software on our login and compute nodes.
> Sometimes, we also just install a binary package (an ISV code such as
> ANSYS or MATLAB) into a shared filesystem (one of our Lustre
> filesystems, usually) when making our own RPM is too cumbersome, and
> then use Lmod to make it available and visible to our users.
>
> There are more strategies for this than you can imagine, so settle on
> a few and keep it simple for yourself!
>
> Best,
> Bill.
> --
> Bill Barth, Ph.D., Director, HPC
> bba...@tacc.utexas.edu        |   Phone: (512) 232-7069
> Office: ROC 1.435             |   Fax:   (512) 475-9445
>
> On 1/17/18, 7:48 AM, "slurm-users on behalf of Vanzo, Davide"
> <slurm-users-boun...@lists.schedmd.com on behalf of
> davide.va...@vanderbilt.edu> wrote:
>
>> Ciao Elisabetta,
>>
>> I second John's reply.
>> On our cluster we install software on the shared parallel filesystem
>> with EasyBuild and use Lmod as a module front-end. Then users will
>> simply load software in the job's environment by using the module
>> command.
>>
>> Feel free to ping me directly if you need specific help.
>>
>> --
>> Davide Vanzo, PhD
>> Application Developer
>> Adjunct Assistant Professor of Chemical and Biomolecular Engineering
>> Advanced Computing Center for Research and Education (ACCRE)
>> Vanderbilt University - Hill Center 201
>> (615)-875-9137
>> www.accre.vanderbilt.edu
>>
>> On 2018-01-17 07:28:31-06:00 slurm-users wrote:
>>
>>> Hi,
>>> let's say I need to execute a python script with slurm. The script
>>> requires a particular library, such as numpy, to be installed on
>>> the system. If the library is not installed on the system, it has
>>> to be installed on the master AND the nodes, right? Does this have
>>> to be done on each machine separately, or is there a way to install
>>> it once for all the machines (master and nodes)?
>>>
>>> Elisabetta
Re: [slurm-users] Slurm and available libraries
I’d go slightly further, though I do appreciate the Lmod shout-out! In
some cases, you may not even want the software on the frontend nodes
(hear me out before I retract it). If it’s a library that requires
linking against before it can be used, then you probably have to have
it, unless you require users to submit interactive jobs to some
dedicated build nodes to do their compilation. You’ll find that when
users have all their software needs in one place on the frontend nodes,
they sometimes try to run it there, taking away resources from others.
Now, a quick test run to make sure that their build is correct is
probably no big deal, but some users will run their full-on science
experiments (or pre- and post-processing steps) on the login nodes! We
like to encourage those folks to submit jobs to the compute nodes. You
could cripple or not install some libraries on the login nodes to
prevent this (though users probably wouldn’t like it), but we just watch
those systems like a hawk, given that we do want users to be able to
build their programs on the login nodes.

We don’t use EB, but we do collaborate with them to make it and Lmod
compatible. We use something like OpenHPC to push RPMs we build in-house
to manage software on our login and compute nodes. Sometimes, we also
just install a binary package (an ISV code such as ANSYS or MATLAB) into
a shared filesystem (one of our Lustre filesystems, usually) when making
our own RPM is too cumbersome, and then use Lmod to make it available
and visible to our users.

There are more strategies for this than you can imagine, so settle on a
few and keep it simple for yourself!

Best,
Bill.
--
Bill Barth, Ph.D., Director, HPC
bba...@tacc.utexas.edu        |   Phone: (512) 232-7069
Office: ROC 1.435             |   Fax:   (512) 475-9445

On 1/17/18, 7:48 AM, "slurm-users on behalf of Vanzo, Davide" wrote:

> Ciao Elisabetta,
>
> I second John's reply.
> On our cluster we install software on the shared parallel filesystem
> with EasyBuild and use Lmod as a module front-end. Then users will
> simply load software in the job's environment by using the module
> command.
>
> Feel free to ping me directly if you need specific help.
>
> --
> Davide Vanzo, PhD
> Application Developer
> Adjunct Assistant Professor of Chemical and Biomolecular Engineering
> Advanced Computing Center for Research and Education (ACCRE)
> Vanderbilt University - Hill Center 201
> (615)-875-9137
> www.accre.vanderbilt.edu
>
> On 2018-01-17 07:28:31-06:00 slurm-users wrote:
>
>> Hi,
>> let's say I need to execute a python script with slurm. The script
>> requires a particular library, such as numpy, to be installed on the
>> system. If the library is not installed on the system, it has to be
>> installed on the master AND the nodes, right? Does this have to be
>> done on each machine separately, or is there a way to install it
>> once for all the machines (master and nodes)?
>>
>> Elisabetta
Re: [slurm-users] Slurm and available libraries
I should also say that Modules should be easy to install on Ubuntu. It
will be the package named "environment-modules".

You will probably have to edit the configuration file a little, since
the default install will assume all Modules files are local. You need to
set your MODULEPATH to include a shared directory where you will keep
all your Modules files.

This really is a lot easier than it sounds.

On 17 January 2018 at 14:48, Vanzo, Davide wrote:

> Ciao Elisabetta,
>
> I second John's reply.
> On our cluster we install software on the shared parallel filesystem
> with EasyBuild and use Lmod as a module front-end. Then users will
> simply load software in the job's environment by using the module
> command.
>
> Feel free to ping me directly if you need specific help.
>
> --
> Davide Vanzo, PhD
> Application Developer
> Adjunct Assistant Professor of Chemical and Biomolecular Engineering
> Advanced Computing Center for Research and Education (ACCRE)
> Vanderbilt University - Hill Center 201
> (615)-875-9137
> www.accre.vanderbilt.edu
>
> On 2018-01-17 07:28:31-06:00 slurm-users wrote:
>
>> Hi,
>> let's say I need to execute a python script with slurm. The script
>> requires a particular library, such as numpy, to be installed on the
>> system. If the library is not installed on the system, it has to be
>> installed on the master AND the nodes, right? Does this have to be
>> done on each machine separately, or is there a way to install it
>> once for all the machines (master and nodes)?
>>
>> Elisabetta
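As a rough sketch of the MODULEPATH step John describes: the directory `/shared/modulefiles` is only an assumed example path (use whatever shared directory is visible on all nodes of your cluster), and where this snippet should live (for instance a site-wide profile script) depends on the installation.

```shell
#!/bin/bash
# Hypothetical shared directory holding the site's modulefiles (an
# assumption; substitute a path that is mounted on every node).
shared_modulefiles="/shared/modulefiles"

# Prepend the directory to MODULEPATH unless it is already listed.
case ":${MODULEPATH:-}:" in
    *":${shared_modulefiles}:"*) ;;  # already present, nothing to do
    *) MODULEPATH="${shared_modulefiles}${MODULEPATH:+:${MODULEPATH}}" ;;
esac
export MODULEPATH

echo "MODULEPATH=${MODULEPATH}"
```

Once the `module` command is available in the shell, `module use /shared/modulefiles` should have a similar effect for the current session.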
Re: [slurm-users] Slurm and available libraries
I can highly recommend EasyBuild as an easy way to provide software
packages as "modules" to your cluster. We have been very pleased with
EasyBuild in our cluster.

I made some notes about installing EasyBuild in a Wiki page:
https://wiki.fysik.dtu.dk/niflheim/EasyBuild_modules
We use CentOS 7 Linux.

Also, if you want information about Slurm setup, I have written another
set of Wiki pages: https://wiki.fysik.dtu.dk/niflheim/SLURM

/Ole

On 01/17/2018 02:39 PM, John Hearns wrote:

> Hi Elisabetta. No, you normally do not need to install software on
> all the compute nodes separately.
>
> It is quite common to use the 'modules' environment to manage
> software like this:
> http://www.admin-magazine.com/HPC/Articles/Environment-Modules
>
> Once you have numpy installed on a shared drive on the cluster, and
> have a Modules file in place, your users put this at the start of
> their job scripts:
>
> module load numpy
>
> You might also want to look at Easybuild:
> http://easybuild.readthedocs.io/en/latest/Introduction.html
> There are Easybuild 'recipes' for numpy. We use them where I work.
>
> On 17 January 2018 at 14:28, Elisabetta Falivene wrote:
>
>> Hi,
>> let's say I need to execute a python script with slurm. The script
>> requires a particular library, such as numpy, to be installed on the
>> system. If the library is not installed on the system, it has to be
>> installed on the master AND the nodes, right? Does this have to be
>> done on each machine separately, or is there a way to install it
>> once for all the machines (master and nodes)?
>>
>> Elisabetta
Re: [slurm-users] Slurm and available libraries
Ciao Elisabetta,

I second John's reply.
On our cluster we install software on the shared parallel filesystem
with EasyBuild and use Lmod as a module front-end. Then users will
simply load software in the job's environment by using the module
command.

Feel free to ping me directly if you need specific help.

--
Davide Vanzo, PhD
Application Developer
Adjunct Assistant Professor of Chemical and Biomolecular Engineering
Advanced Computing Center for Research and Education (ACCRE)
Vanderbilt University - Hill Center 201
(615)-875-9137
www.accre.vanderbilt.edu

On 2018-01-17 07:28:31-06:00 slurm-users wrote:

> Hi,
> let's say I need to execute a python script with slurm. The script
> requires a particular library, such as numpy, to be installed on the
> system. If the library is not installed on the system, it has to be
> installed on the master AND the nodes, right? Does this have to be
> done on each machine separately, or is there a way to install it once
> for all the machines (master and nodes)?
>
> Elisabetta
Re: [slurm-users] Slurm and available libraries
Hi Elisabetta. No, you normally do not need to install software on all
the compute nodes separately.

It is quite common to use the 'modules' environment to manage software
like this:
http://www.admin-magazine.com/HPC/Articles/Environment-Modules

Once you have numpy installed on a shared drive on the cluster, and have
a Modules file in place, your users put this at the start of their job
scripts:

module load numpy

You might also want to look at Easybuild:
http://easybuild.readthedocs.io/en/latest/Introduction.html
There are Easybuild 'recipes' for numpy. We use them where I work.

On 17 January 2018 at 14:28, Elisabetta Falivene wrote:

> Hi,
> let's say I need to execute a python script with slurm. The script
> requires a particular library, such as numpy, to be installed on the
> system. If the library is not installed on the system, it has to be
> installed on the master AND the nodes, right? Does this have to be
> done on each machine separately, or is there a way to install it once
> for all the machines (master and nodes)?
>
> Elisabetta
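To make John's suggestion concrete, here is a minimal sketch of a job script that starts with `module load numpy`. The snippet only writes the script to a temporary file and syntax-checks it, so it can be tried on a machine without Slurm or a module system. The script name `my_script.py`, the job name and the resource values are placeholders, and the module may well be named differently on your cluster (check with `module avail`).

```shell
#!/bin/bash
# Write a minimal Slurm batch script to a temporary file.
job_script="$(mktemp)"
cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=numpy-test
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# Make the shared numpy installation visible in the job's environment.
module load numpy

# Placeholder for the user's own Python script.
python my_script.py
EOF

# Syntax check only; nothing is submitted here.
bash -n "$job_script" && echo "syntax OK: $job_script"

# On a real cluster the script would then be submitted with:
#   sbatch "$job_script"
```

Because the software and its modulefile sit on a shared filesystem, the same `module load` line works on whichever compute node the job lands on.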
[slurm-users] Slurm and available libraries
Hi,
let's say I need to execute a python script with slurm. The script
requires a particular library, such as numpy, to be installed on the
system. If the library is not installed on the system, it has to be
installed on the master AND the nodes, right? Does this have to be done
on each machine separately, or is there a way to install it once for all
the machines (master and nodes)?

Elisabetta