Re: [slurm-users] Heterogeneous HPC
On Thursday, 19 September 2019, at 19:27:38 (-0400), Fulcomer, Samuel wrote:

> I obviously haven't been keeping up with any security concerns over the use
> of Singularity. In a 2-3 sentence nutshell, what are they?

So before I do that, if you have a few minutes, I do think you'll find it worth your time to go to https://youtu.be/H6VrjowOOF4?t=2361 (it'll start about 39 minutes in) and watch at least those next 8 or so minutes. I go into some detail about the security track records of multiple container runtimes and provide factual data so that folks can make their own risk assessments rather than just giving my personal opinion. (The video does cut off the right side of the slides, but the slide deck is available at https://permalink.lanl.gov/object/tr?what=info:lanl-repo/lareport/LA-UR-19-22663 for anyone interested.)

If you really don't want to watch the video, though, I can provide a few of the data points.

First off, if you have not read it before, you really should read Matthias Gerstner's assessment after doing a code review and security audit on Singularity 2.6.0 to see if it could be packaged for SuSE: https://www.openwall.com/lists/oss-security/2018/12/12/2

The quotes I used on the slide for my talk came from comments he made in the linked SuSE Bugzilla bug -- which, for unknown reasons, was re-locked by SuSE after previously being unlocked once the bug report was public! -- regarding whether or not, and under what constraints, to include and support Singularity on SuSE. Matthias is a widely respected security expert in the OSS community, so I trust his assessment and insight. And his audit alone found 5 or 6 CVE-worthy vulnerabilities at once.

Additionally, as I mentioned in the video, during the 3-year period 2016-2018, there were at least 17 different vulnerabilities found in Singularity. Also, of the 9 releases they did during 2018, 7 of those were security releases to fix vulnerabilities (and frequently more than 1 at a time). That's...not great. Especially in an environment like ours where saying "security is important" is an understatement of nuclear proportions! ;-)

And finally, while we were hopeful that the rewrite in Go (version 3.0 and above) would correct the security failings in the code, there've already been multiple serious vulnerabilities (all grouped together under a single CVE identifier, CVE-2019-11328), at least one of which was essentially a replica of one of the flaws fixed in 2.6.0 under CVE-2018-12021! And you don't need to take my word for it, either: https://www.openwall.com/lists/oss-security/2019/05/16/1

It's hard to say if the above trend will continue...but not all sites can afford to take those kinds of risks. And while Shifter's security track record is spotless to date, I would still summarize the overall lesson to be learned as, "Don't use privileged container runtimes. Use user namespaces. That's what they're there for."

And before anyone yells at me, yes I know Singularity advertises user namespace support and non-setuid operation. But it doesn't seem to be very widely used or adequately exercised, and AFAICT the default mode of operation in both RPMs and build-from-src is via setuid binaries. So using a natively unprivileged runtime still seems the less risky choice, in my personal assessment.

Yes, I know that was more than a "2-3 sentence nutshell," but hopefully it was helpful anyway! :-)

Michael

--
Michael E. Jennings
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341     W: +1 (505) 606-0605
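To make the point about setuid installs concrete, here is a minimal Python sketch of how a site might check whether a given runtime helper is installed as a root-owned setuid binary. It is only an illustration: the function names are made up for this example, and the path to inspect depends entirely on how the runtime was packaged, so none is assumed here.

```python
import os
import stat


def mode_is_setuid(mode: int) -> bool:
    """Return True if the permission bits in `mode` include the setuid bit."""
    return bool(mode & stat.S_ISUID)


def is_setuid_root(path: str) -> bool:
    """Return True if `path` is a regular file, owned by root, with setuid set.

    Such a binary runs with root privileges no matter which user starts it,
    which is what "privileged container runtime" refers to above.
    """
    st = os.stat(path)
    return (
        stat.S_ISREG(st.st_mode)
        and st.st_uid == 0
        and mode_is_setuid(st.st_mode)
    )
```

Calling `is_setuid_root()` on each executable a runtime package installs would show whether that install relies on setuid helpers or runs fully unprivileged.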
Re: [slurm-users] Heterogeneous HPC
Hey Michael,

I obviously haven't been keeping up with any security concerns over the use of Singularity. In a 2-3 sentence nutshell, what are they?

I've been annoyed by NVIDIA's docker distribution for DGX-1 & friends. We've been setting up an ersatz-secure Singularity environment for use of mid-range DUA data like dbGaP.

Regards,
Sam

On Thu, Sep 19, 2019 at 4:38 PM Michael Jennings wrote:

> On Friday, 20 September 2019, at 00:03:28 (+0430),
> Mahmood Naderan wrote:
>
> > For the replies. Matlab was an example. I would also like to create
> > to containers for OpenFoam with different versions. Then a user can
> > choose what he actually wants.
>
> All modern container runtimes support the OCI standard container
> format originally authored by Docker, Inc. and contributed to the Open
> Container Initiative (OCI) as the starting point for their standard.
> So your best bet would be to go to Docker Hub (hub.docker.com) and
> search for the applications you're interested in, or (in the case of
> commercial software) ask your vendor if they supply containers for
> their packages and under what terms.
>
> If you're comfortable with building as root, you can likely build your
> own containers without too much trouble, but in order to build
> containers without privilege, you'll need very recent Podman/Buildah
> (or current Charliecloud plus Skopeo and umoci, if your Dockerfile is
> supported by ch-grow).
>
> > I would also like to know, if the technologies you mentioned can be
> > deployed in multinode clusters. Currently, we use Rocks 7. Should I
> > install singularity (or others) on all nodes or just the frontend?
> > And then, can users use "srun" or "salloc" for interactively login
> > to a node and run the container or not?
>
> Most folks invoke the container runtime using srun, either in their
> job script or as part of an interactive session. There are several
> examples in the Charliecloud docs, for example, here:
>
> https://hpc.github.io/charliecloud/tutorial.html#your-first-single-node-multi-process-jobs
>
> But yes, you will likely need the container runtime installed on every
> node. Most large HPC centers use Slurm, so you should have no problem
> getting any or all of them to integrate well with your existing Slurm
> installation. :-)
>
> That said, I *do* recommend watching at least that last video before
> you make your final decision on runtime. With containers, as with any
> technology, you're far more likely to get factual information from
> folks who aren't trying to sell something! ;-)
>
> Having personally deployed, tested, and evaluated over a dozen
> different container solutions -- including every major HPC container
> system as well as implementing a few of my own -- I can tell you with
> absolute certainty that there's no single right answer to "What
> container system should I use?" There are several correct answers
> depending on your use case and security & UX requirements.
>
> Michael
>
> --
> Michael E. Jennings
> HPC Systems Team, Los Alamos National Laboratory
> Bldg. 03-2327, Rm. 2341     W: +1 (505) 606-0605
Re: [slurm-users] Heterogeneous HPC
On Thursday, 19 September 2019, at 20:00:40 (+), Goetz, Patrick G wrote:

> On 9/19/19 8:22 AM, Thomas M. Payerle wrote:
> > one of our clusters
> > is still running RHEL6, and while containers based on Ubuntu 16,
> > Debian 8, or RHEL7 all appear to work properly,
> > containers based on Ubuntu 18 or Debian 9 will die with "Kernel too
> > old" errors.
>
> I think the idea generally is to have your container host be the newest
> version, with containers providing a home for legacy (or at most
> contemporary) software stacks. Never heard of anyone trying to do it
> the other way around, but appreciate this proof of concept that it's a
> bad idea.

And contrary to popular belief, it's not just the kernel that comes into play with container compatibility. There's a surprisingly complex, nuanced interplay between the kernel, glibc, libgcc, and /lib64/ld-linux-x86-64.so.2 for almost any containerized application, let alone the standard challenges of MPI libraries, HSN/GPU device interfaces/drivers, etc. With containers, many of the old problems are new again! You'd cringe if I told you about some of the nastier container compatibility challenges we've run into/heard about.

There are general rules of thumb that will help container portability (limit container size, link statically as much as possible, build against the oldest possible stuff but run on the newest possible stuff, et al.), but the biggest one is: Don't expect containers to solve all your problems; they don't. Much of the time you're just exchanging the prior set of problems for a new set. :-)

And don't forget to test with VMs as well, not just containers. Depending on complexity, computation and I/O patterns, and other similar factors, an appropriately paravirtualized VM could wind up being on par with native/containerized code in many circumstances, or at least within acceptable tolerances. And VMs offer many advantages in terms of safety, separation, and sanity vs. containerization.

Michael

--
Michael E. Jennings
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341     W: +1 (505) 606-0605
Re: [slurm-users] Heterogeneous HPC
On Friday, 20 September 2019, at 00:03:28 (+0430), Mahmood Naderan wrote:

> For the replies. Matlab was an example. I would also like to create
> to containers for OpenFoam with different versions. Then a user can
> choose what he actually wants.

All modern container runtimes support the OCI standard container format originally authored by Docker, Inc. and contributed to the Open Container Initiative (OCI) as the starting point for their standard. So your best bet would be to go to Docker Hub (hub.docker.com) and search for the applications you're interested in, or (in the case of commercial software) ask your vendor if they supply containers for their packages and under what terms.

If you're comfortable with building as root, you can likely build your own containers without too much trouble, but in order to build containers without privilege, you'll need very recent Podman/Buildah (or current Charliecloud plus Skopeo and umoci, if your Dockerfile is supported by ch-grow).

> I would also like to know, if the technologies you mentioned can be
> deployed in multinode clusters. Currently, we use Rocks 7. Should I
> install singularity (or others) on all nodes or just the frontend?
> And then, can users use "srun" or "salloc" for interactively login
> to a node and run the container or not?

Most folks invoke the container runtime using srun, either in their job script or as part of an interactive session. There are several examples in the Charliecloud docs, for example, here:

https://hpc.github.io/charliecloud/tutorial.html#your-first-single-node-multi-process-jobs

But yes, you will likely need the container runtime installed on every node. Most large HPC centers use Slurm, so you should have no problem getting any or all of them to integrate well with your existing Slurm installation. :-)

That said, I *do* recommend watching at least that last video before you make your final decision on runtime. With containers, as with any technology, you're far more likely to get factual information from folks who aren't trying to sell something! ;-)

Having personally deployed, tested, and evaluated over a dozen different container solutions -- including every major HPC container system as well as implementing a few of my own -- I can tell you with absolute certainty that there's no single right answer to "What container system should I use?" There are several correct answers depending on your use case and security & UX requirements.

Michael

--
Michael E. Jennings
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341     W: +1 (505) 606-0605
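As a rough illustration of "invoke the container runtime using srun", the Python sketch below only assembles the command line; it does not submit anything. The helper name, the image names, and the OpenFOAM command are hypothetical placeholders, and the two runtime prefixes shown (`ch-run IMAGE --` for Charliecloud and `singularity exec IMAGE` for Singularity) are the usual invocation forms, so check them against the version installed at your site.

```python
import shlex
from typing import List, Sequence


def build_srun_command(
    runtime_argv: Sequence[str], command: Sequence[str], ntasks: int = 1
) -> List[str]:
    """Return an argv list that runs `command` through a container runtime under srun.

    `runtime_argv` is the runtime prefix, e.g. ["ch-run", "/var/tmp/openfoam", "--"].
    """
    if ntasks < 1:
        raise ValueError("ntasks must be at least 1")
    return ["srun", f"--ntasks={ntasks}", *runtime_argv, *command]


if __name__ == "__main__":
    # Hypothetical image locations and application command.
    charliecloud = ["ch-run", "/var/tmp/openfoam", "--"]
    singularity = ["singularity", "exec", "openfoam.sif"]
    for runtime in (charliecloud, singularity):
        print(shlex.join(build_srun_command(runtime, ["simpleFoam", "-help"], ntasks=4)))
```

From Slurm's side both lines are ordinary job steps, which is why the runtime has to be available on every compute node the step may land on.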
Re: [slurm-users] Heterogeneous HPC
On 9/19/19 8:22 AM, Thomas M. Payerle wrote:

> one of our clusters
> is still running RHEL6, and while containers based on Ubuntu 16,
> Debian 8, or RHEL7 all appear to work properly,
> containers based on Ubuntu 18 or Debian 9 will die with "Kernel too
> old" errors.

I think the idea generally is to have your container host be the newest version, with containers providing a home for legacy (or at most contemporary) software stacks. Never heard of anyone trying to do it the other way around, but appreciate this proof of concept that it's a bad idea.
Re: [slurm-users] Heterogeneous HPC
Never used Rocks, but as far as Slurm or anything else is concerned, Singularity is just another program. It will need to be accessible from any compute nodes you want to use it on (whether that’s from OS-installed packages, from a shared NFS area, or whatever shouldn’t matter). So your user will still just use srun, salloc, sbatch, or whatever to invoke Singularity. My local docs for this are at [1], and they use the normal Singularity RPM from EPEL. [1] https://its.tntech.edu/display/MON/Using+Containers+in+Your+HPC+Account > On Sep 19, 2019, at 2:33 PM, Mahmood Naderan wrote: > > For the replies. Matlab was an example. I would also like to create to > containers for OpenFoam with different versions. Then a user can choose what > he actually wants. > > I would also like to know, if the technologies you mentioned can be deployed > in multinode clusters. Currently, we use Rocks 7. Should I install > singularity (or others) on all nodes or just the frontend? > And then, can users use "srun" or "salloc" for interactively login to a node > and run the container or not? > > Regards, > Mahmood > > > > > On Thu, Sep 19, 2019 at 8:03 PM Michael Jennings wrote: > > Docker is the wrong choice for HPC, at least today. But Podman, from > Red Hat's CRI-O project, is a drop-in replacement for Docker which > doesn't use the client-server model of Docker and therefore addresses > many of the challenges with trying to run Docker for HPC user jobs. > > There's also LANL's Charliecloud, which is a highly optimized > container runtime that (unlike the other options in this space, save > Podman) DOES NOT require any root privileges whatsoever, not even at > install time. For (hopefully obvious) security reasons, you are far > safer using one of the unprivileged options. 
> > Here at Los Alamos, we use both Charliecloud and Podman/Buildah along > with the Spokeo and umoci tools. While we do not permit Singularity > on our systems for security reasons and don't run Shifter because it > requires privilege, we have had Charliecloud deployed and actively > used on both our Classified and Open Science systems for well over a > year now, and we are in the process of getting Podman/Buildah and > friends into the Secure systems as we speak. > > (Note that all of the above require RHEL7 or higher; if you need RHEL6 > support, you'll want to check out Shifter.) > > Here are some videos of talks that might help you get up-to-speed on > this subject: > > "LISA18 - Containers and Security on Planet X" > (https://youtu.be/F3qCvZMzUtE) - Why containers matter for HPC, what > makes HPC so different from the typical Docker/AppC use cases, and how > to choose the right solution for your site. > > "Charliecloud - Unprivileged Containers for HPC" > (https://youtu.be/ESsZgcaP-ZQ) - What containers actually are under > the hood, how they work, what they are good for, and how to get up and > running with Charliecloud in under 5 minutes. > > "Container Mythbusters" (https://youtu.be/FFyXdgWXD3A) - Dispelling > common misconceptions and debunking propaganda around containers, > container runtime security, and when/how you should (and should NOT) > use containers. > > Hope those help! > Michael > > -- > Michael E. Jennings > HPC Systems Team, Los Alamos National Laboratory > Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605 >
Re: [slurm-users] Heterogeneous HPC
For the replies. Matlab was an example. I would also like to create to containers for OpenFoam with different versions. Then a user can choose what he actually wants. I would also like to know, if the technologies you mentioned can be deployed in multinode clusters. Currently, we use Rocks 7. Should I install singularity (or others) on all nodes or just the frontend? And then, can users use "srun" or "salloc" for interactively login to a node and run the container or not? Regards, Mahmood On Thu, Sep 19, 2019 at 8:03 PM Michael Jennings wrote: > > Docker is the wrong choice for HPC, at least today. But Podman, from > Red Hat's CRI-O project, is a drop-in replacement for Docker which > doesn't use the client-server model of Docker and therefore addresses > many of the challenges with trying to run Docker for HPC user jobs. > > There's also LANL's Charliecloud, which is a highly optimized > container runtime that (unlike the other options in this space, save > Podman) DOES NOT require any root privileges whatsoever, not even at > install time. For (hopefully obvious) security reasons, you are far > safer using one of the unprivileged options. > > Here at Los Alamos, we use both Charliecloud and Podman/Buildah along > with the Spokeo and umoci tools. While we do not permit Singularity > on our systems for security reasons and don't run Shifter because it > requires privilege, we have had Charliecloud deployed and actively > used on both our Classified and Open Science systems for well over a > year now, and we are in the process of getting Podman/Buildah and > friends into the Secure systems as we speak. > > (Note that all of the above require RHEL7 or higher; if you need RHEL6 > support, you'll want to check out Shifter.) 
> > Here are some videos of talks that might help you get up-to-speed on > this subject: > > "LISA18 - Containers and Security on Planet X" > (https://youtu.be/F3qCvZMzUtE) - Why containers matter for HPC, what > makes HPC so different from the typical Docker/AppC use cases, and how > to choose the right solution for your site. > > "Charliecloud - Unprivileged Containers for HPC" > (https://youtu.be/ESsZgcaP-ZQ) - What containers actually are under > the hood, how they work, what they are good for, and how to get up and > running with Charliecloud in under 5 minutes. > > "Container Mythbusters" (https://youtu.be/FFyXdgWXD3A) - Dispelling > common misconceptions and debunking propaganda around containers, > container runtime security, and when/how you should (and should NOT) > use containers. > > Hope those help! > Michael > > -- > Michael E. Jennings > HPC Systems Team, Los Alamos National Laboratory > Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605 > >
Re: [slurm-users] Sharing a single machine between two groups; What's the best way define this in slurm config?
Probably your best bet is to use QoS's to accomplish this. Be advised that suspending jobs still leaves them in memory space.

-Paul Edmon-

On 9/18/19 9:16 PM, Benjamin Wong wrote:

> Hello,
>
> I plan to purchase a GPU machine with 8 GPUs which will be shared between group A and group B. Group A is an existing group with SLURM nodes. Group B has no SLURM nodes but will have access to half of the resources on one SLURM node. I'm trying to figure out how to get SLURM to implement the policies I want below:
>
> * If both groups are using the machine evenly, then I want the resources to be split evenly.
> * If only group A is using the resources, then they will consume all the resources and vice versa.
> * If group A is using all resources but group B begins requesting resources, then group A will suspend half of its work for group B to use resources. Vice versa applies.
>
> What's the best way to implement this? Should I have two halves of a machine in two different partitions?
>
> Looking forward to hints,
> Ben Wong
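To pin down what the three bullet points ask for, here is a small Python model of the target allocation. It is not Slurm configuration and says nothing about how QoS or preemption settings would achieve it; `share_gpus` and its parameters are names made up for this sketch. Each group is guaranteed half of the machine and may borrow whatever the other group leaves idle.

```python
from typing import Tuple


def share_gpus(total: int, want_a: int, want_b: int) -> Tuple[int, int]:
    """Return (GPUs for group A, GPUs for group B) under the stated policy."""
    if total < 0 or want_a < 0 or want_b < 0:
        raise ValueError("GPU counts must be non-negative")
    half_a = total // 2
    half_b = total - half_a
    # Guaranteed share: each group first gets up to its own half.
    got_a = min(want_a, half_a)
    got_b = min(want_b, half_b)
    # Borrowing: idle GPUs go to whichever group still wants more.
    spare = total - got_a - got_b
    extra_a = min(want_a - got_a, spare)
    got_a += extra_a
    spare -= extra_a
    got_b += min(want_b - got_b, spare)
    return got_a, got_b


if __name__ == "__main__":
    for demand in [(8, 0), (0, 8), (8, 8), (8, 2)]:
        print(demand, "->", share_gpus(8, *demand))
```

The third bullet is the step from the `(8, 0)` case to the `(8, 8)` case: group A drops from 8 GPUs to 4, and in Slurm terms that drop is where suspension of the borrowing group's jobs would have to happen, with the memory caveat Paul mentions.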
Re: [slurm-users] Heterogeneous HPC
MATLAB container at NVIDIA’s NGC: https://ngc.nvidia.com/catalog/containers/partners:matlab Should be compatible with Docker and Singularity, but read the fine print on licensing. > On Sep 19, 2019, at 8:22 AM, Thomas M. Payerle wrote: > > While I agree containers can be quite useful in HPC environments for > dealing with applications requiring > different library versions, there are limitations. In particular, the > kernel inside the container is the same > as running outside the container. Where this seems to be most > problematic is when trying to use containers > based on a much newer OS distribution than the distribution of the > containing system. I.e., one of our clusters > is still running RHEL6, and while containers based on Ubuntu 16, > Debian 8, or RHEL7 all appear to work properly, > containers based on Ubuntu 18 or Debian 9 will die with "Kernel too > old" errors. (Basically, the glibc in those > distros require a newer kernel than shipped with RHEL6). VMs should > not experience those issues, as the > kernel running in the VM need not be the same kernel as running in the > host system. > > But I have found containers helpful (we use Singularity), particularly > for applications. Not as useful for software libraries, > as those tend to not want to be "self-contained" and containers are > all about "self-contained". > > I am unaware of a container image for Matlab, but I suspect that is > more due to licensing/support than technical issues. > You could probably build a Matlab container based on some Mathworks > supported distribution and run on a distribution > not supported by Mathworks, but I doubt Mathworks would be willing to > provide support for that mode of operation. > > > > On Thu, Sep 19, 2019 at 6:55 AM Mahmood Naderan wrote: >> >> Thanks. Singularity seems to be interesting. I will try it. 
>> >> Regards, >> Mahmood >> >> >> >> >> On Thu, Sep 19, 2019 at 2:49 PM Christoph Brüning >> wrote: >>> >>> Dear Mahmood, >>> >>> Docker is somewhat tricky, because it needs a daemon running and there >>> is no fine grained control over who is allowed to start and stop >>> containers. Also getting the container on the node can be unpleasant >>> (docker hub? private registry? build docker containers on the node >>> before running them?). I would recommend against it! >>> >>> However, there are projects like Singularity or Charliecloud designed to >>> bring the "bring your own environment" idea to HPC. >>> >>> We have Singularity installed, and some of our users use it. It seems to >>> work reasonably well, as I have heard no complaint except that the >>> available version is somewhat outdated... >>> >>> Best, >>> Christoph >>> >>> >>> On 19/09/2019 10.08, Mahmood Naderan wrote: Hi The question is not directly related to Slurm, but is actually related to the people in this community. For heterogeneous environments, where different operating systems, application and library versions are needed for HPC users, I would like to know it using docker/containers is better than yielding virtual machines? Actually, it is lighter than VM, however, I haven't seen a docker image for Matlab for example. If that is possible, can Slurm be used to schedule containers? If someone has any experience using docker in HPC clusters, please let me know. Regards, Mahmood >>> >>> -- >>> Dr. Christoph Brüning >>> Universität Würzburg >>> Rechenzentrum >>> Am Hubland >>> D-97074 Würzburg >>> Tel.: +49 931 31-80499 >>> > > > -- > Tom Payerle > DIT-ACIGS/Mid-Atlantic Crossroadspaye...@umd.edu > 5825 University Research Park (301) 405-6135 > University of Maryland > College Park, MD 20740-3831
Re: [slurm-users] Heterogeneous HPC
While I agree containers can be quite useful in HPC environments for dealing with applications requiring different library versions, there are limitations. In particular, the kernel inside the container is the same as running outside the container. Where this seems to be most problematic is when trying to use containers based on a much newer OS distribution than the distribution of the containing system. For example, one of our clusters is still running RHEL6, and while containers based on Ubuntu 16, Debian 8, or RHEL7 all appear to work properly, containers based on Ubuntu 18 or Debian 9 will die with "Kernel too old" errors. (Basically, the glibc in those distros requires a newer kernel than shipped with RHEL6.) VMs should not experience those issues, as the kernel running in the VM need not be the same kernel as running in the host system.

But I have found containers helpful (we use Singularity), particularly for applications. Not as useful for software libraries, as those tend to not want to be "self-contained" and containers are all about "self-contained".

I am unaware of a container image for Matlab, but I suspect that is more due to licensing/support than technical issues. You could probably build a Matlab container based on some Mathworks-supported distribution and run on a distribution not supported by Mathworks, but I doubt Mathworks would be willing to provide support for that mode of operation.

On Thu, Sep 19, 2019 at 6:55 AM Mahmood Naderan wrote:
>
> Thanks. Singularity seems to be interesting. I will try it.
>
> Regards,
> Mahmood
>
> On Thu, Sep 19, 2019 at 2:49 PM Christoph Brüning wrote:
>>
>> Dear Mahmood,
>>
>> Docker is somewhat tricky, because it needs a daemon running and there
>> is no fine grained control over who is allowed to start and stop
>> containers. Also getting the container on the node can be unpleasant
>> (docker hub? private registry? build docker containers on the node
>> before running them?). I would recommend against it!
>>
>> However, there are projects like Singularity or Charliecloud designed to
>> bring the "bring your own environment" idea to HPC.
>>
>> We have Singularity installed, and some of our users use it. It seems to
>> work reasonably well, as I have heard no complaint except that the
>> available version is somewhat outdated...
>>
>> Best,
>> Christoph
>>
>> On 19/09/2019 10.08, Mahmood Naderan wrote:
>> > Hi
>> > The question is not directly related to Slurm, but is actually related
>> > to the people in this community.
>> >
>> > For heterogeneous environments, where different operating systems,
>> > application and library versions are needed for HPC users, I would like
>> > to know it using docker/containers is better than yielding virtual
>> > machines?
>> >
>> > Actually, it is lighter than VM, however, I haven't seen a docker image
>> > for Matlab for example. If that is possible, can Slurm be used to
>> > schedule containers?
>> > If someone has any experience using docker in HPC clusters, please let
>> > me know.
>> >
>> > Regards,
>> > Mahmood
>>
>> --
>> Dr. Christoph Brüning
>> Universität Würzburg
>> Rechenzentrum
>> Am Hubland
>> D-97074 Würzburg
>> Tel.: +49 931 31-80499

--
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads        paye...@umd.edu
5825 University Research Park            (301) 405-6135
University of Maryland
College Park, MD 20740-3831
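The "Kernel too old" failure comes down to a version comparison between the host kernel and the minimum kernel that the container's glibc was built for. A minimal Python sketch of that comparison follows; the function names are invented for this example, and the minimum version passed in is something you would have to look up for the specific image, so the value in the usage note is purely a placeholder.

```python
import platform
import re
from typing import Tuple

Version = Tuple[int, int, int]


def parse_kernel_release(release: str) -> Version:
    """Extract (major, minor, patch) from a string like '2.6.32-754.el6.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def host_kernel_is_new_enough(minimum: Version, release: str = "") -> bool:
    """Compare a kernel release (default: the running host's) against `minimum`."""
    release = release or platform.release()
    return parse_kernel_release(release) >= minimum
```

For instance, `host_kernel_is_new_enough((3, 2, 0), "2.6.32-754.el6.x86_64")` is false for a RHEL6-style release string, where `(3, 2, 0)` stands in for whatever minimum the image's glibc actually requires. A check like this could run in a job wrapper before launching an image, though as noted elsewhere in the thread the kernel is only one of several compatibility factors.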
[slurm-users] slurm config :: set up a workdir for each job
Hi!

Is there a method for setting up a work directory unique to each job from a system setting, and then cleaning it up afterwards? Can I somehow use the prologue and epilogue sections for this?

Thank you!
Adrian

--
Adrian Sevcenco, Ph.D.
Institute of Space Science - ISS, Romania
adrian.sevcenco at {cern.ch,spacescience.ro}
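One way to approach this, sketched in Python under several assumptions: the base path `/scratch/jobs` is a placeholder, the function names are invented for this example, and the script relies on Slurm exporting `SLURM_JOB_ID` and `SLURM_JOB_UID` into the prolog and epilog environment. How the two entry points get wired into the `Prolog` and `Epilog` settings in slurm.conf (for example through two small wrapper scripts) is left to the site.

```python
import os
import shutil
import sys
from pathlib import Path

BASE_DIR = Path("/scratch/jobs")  # hypothetical node-local scratch area


def job_workdir(job_id: str, base: Path = BASE_DIR) -> Path:
    """Map a numeric job ID to its private directory under `base`."""
    if not job_id.isdigit():
        raise ValueError(f"unexpected job id: {job_id!r}")
    return base / job_id


def create_workdir(job_id: str, uid: int, base: Path = BASE_DIR) -> Path:
    """Prolog step: create the per-job directory and hand it to the job owner."""
    workdir = job_workdir(job_id, base)
    workdir.mkdir(mode=0o700, parents=True, exist_ok=True)
    if os.geteuid() == 0:
        # Only root can give the directory to another user.
        os.chown(workdir, uid, -1)
    return workdir


def remove_workdir(job_id: str, base: Path = BASE_DIR) -> None:
    """Epilog step: delete the per-job directory and everything in it."""
    shutil.rmtree(job_workdir(job_id, base), ignore_errors=True)


if __name__ == "__main__":
    action = sys.argv[1] if len(sys.argv) > 1 else ""
    if action == "prolog":
        create_workdir(os.environ["SLURM_JOB_ID"], int(os.environ["SLURM_JOB_UID"]))
    elif action == "epilog":
        remove_workdir(os.environ["SLURM_JOB_ID"])
```

Because the directory name is derived only from the job ID, a job script can reconstruct the same path from `$SLURM_JOB_ID` without any extra plumbing.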
Re: [slurm-users] Heterogeneous HPC
Thanks. Singularity seems to be interesting. I will try it. Regards, Mahmood On Thu, Sep 19, 2019 at 2:49 PM Christoph Brüning < christoph.bruen...@uni-wuerzburg.de> wrote: > Dear Mahmood, > > Docker is somewhat tricky, because it needs a daemon running and there > is no fine grained control over who is allowed to start and stop > containers. Also getting the container on the node can be unpleasant > (docker hub? private registry? build docker containers on the node > before running them?). I would recommend against it! > > However, there are projects like Singularity or Charliecloud designed to > bring the "bring your own environment" idea to HPC. > > We have Singularity installed, and some of our users use it. It seems to > work reasonably well, as I have heard no complaint except that the > available version is somewhat outdated... > > Best, > Christoph > > > On 19/09/2019 10.08, Mahmood Naderan wrote: > > Hi > > The question is not directly related to Slurm, but is actually related > > to the people in this community. > > > > For heterogeneous environments, where different operating systems, > > application and library versions are needed for HPC users, I would like > > to know it using docker/containers is better than yielding virtual > machines? > > > > Actually, it is lighter than VM, however, I haven't seen a docker image > > for Matlab for example. If that is possible, can Slurm be used to > > schedule containers? > > If someone has any experience using docker in HPC clusters, please let > > me know. > > > > > > Regards, > > Mahmood > > > > > > -- > Dr. Christoph Brüning > Universität Würzburg > Rechenzentrum > Am Hubland > D-97074 Würzburg > Tel.: +49 931 31-80499 > >
Re: [slurm-users] Heterogeneous HPC
Dear Mahmood, Docker is somewhat tricky, because it needs a daemon running and there is no fine grained control over who is allowed to start and stop containers. Also getting the container on the node can be unpleasant (docker hub? private registry? build docker containers on the node before running them?). I would recommend against it! However, there are projects like Singularity or Charliecloud designed to bring the "bring your own environment" idea to HPC. We have Singularity installed, and some of our users use it. It seems to work reasonably well, as I have heard no complaint except that the available version is somewhat outdated... Best, Christoph On 19/09/2019 10.08, Mahmood Naderan wrote: Hi The question is not directly related to Slurm, but is actually related to the people in this community. For heterogeneous environments, where different operating systems, application and library versions are needed for HPC users, I would like to know it using docker/containers is better than yielding virtual machines? Actually, it is lighter than VM, however, I haven't seen a docker image for Matlab for example. If that is possible, can Slurm be used to schedule containers? If someone has any experience using docker in HPC clusters, please let me know. Regards, Mahmood -- Dr. Christoph Brüning Universität Würzburg Rechenzentrum Am Hubland D-97074 Würzburg Tel.: +49 931 31-80499
Re: [slurm-users] Heterogeneous HPC
Hallo Mahmood, in our current system (which does not run with Slurm) we have deployed the community edition of Singularity as a software module. https://sylabs.io/singularity/ I have no practical experience yet but from what I've read so far, Singularity is also supposed to work quite well with Slurm. Actually, from the scheduler's point of view, running a Singularity container image is very much like running any other application anyway. However, the provision of images with commercial software installed inside may also be subject to licensing terms that need to be resolved. Best regards Jürgen -- Jürgen Salk Scientific Software & Compute Services (SSCS) Kommunikations- und Informationszentrum (kiz) Universität Ulm Telefon: +49 (0)731 50-22478 Telefax: +49 (0)731 50-22471 * Mahmood Naderan [190919 12:38]: > Hi > The question is not directly related to Slurm, but is actually related to > the people in this community. > > For heterogeneous environments, where different operating systems, > application and library versions are needed for HPC users, I would like to > know it using docker/containers is better than yielding virtual machines? > > Actually, it is lighter than VM, however, I haven't seen a docker image for > Matlab for example. If that is possible, can Slurm be used to schedule > containers? > If someone has any experience using docker in HPC clusters, please let me > know. > > > Regards, > Mahmood -- GPG A997BA7A | 87FC DA31 5F00 C885 0DC3 E28F BD0D 4B33 A997 BA7A
[slurm-users] Heterogeneous HPC
Hi,

The question is not directly related to Slurm, but is aimed at the people in this community.

For heterogeneous environments, where HPC users need different operating systems, applications, and library versions, I would like to know whether using Docker/containers is better than providing virtual machines.

Containers are lighter than VMs; however, I haven't seen a Docker image for Matlab, for example. If that is possible, can Slurm be used to schedule containers? If anyone has experience using Docker in HPC clusters, please let me know.

Regards,
Mahmood