Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Michael Jennings
On Thursday, 19 September 2019, at 19:27:38 (-0400),
Fulcomer, Samuel wrote:

> I obviously haven't been keeping up with any security concerns over the use
> of Singularity. In a 2-3 sentence nutshell, what are they?

So before I do that, if you have a few minutes, I do think you'll find
it worth your time to go to https://youtu.be/H6VrjowOOF4?t=2361 (it'll
start about 39 minutes in) and watch at least those next 8 or so minutes.
I go into some detail about the security track records of multiple
container runtimes and provide factual data so that folks can make their
own risk assessments rather than just giving my personal opinion.  (The
video does cut off the right side of the slides, but the slide deck is
available at 
https://permalink.lanl.gov/object/tr?what=info:lanl-repo/lareport/LA-UR-19-22663
for anyone interested.)

If you really don't want to watch the video, though, I can provide a few
of the data points.

First off, if you have not read it before, you really should read
Matthias Gerstner's assessment after doing a code review and security
audit on Singularity 2.6.0 to see if it could be packaged for SuSE:
https://www.openwall.com/lists/oss-security/2018/12/12/2
The quotes I used on the slide for my talk came from comments he made in
the linked SuSE Bugzilla bug -- which, for unknown reasons, was
re-locked by SuSE after previously being unlocked once the bug report
was public! -- regarding whether or not, and under what constraints, to
include and support Singularity on SuSE.  Matthias is a widely respected
security expert in the OSS community, so I trust his assessment and
insight.  And his audit alone found 5 or 6 CVE-worthy vulnerabilities at
once.

Additionally, as I mentioned in the video, during the 3-year period
2016-2018, there were at least 17 different vulnerabilities found in
Singularity.  Also, of the 9 releases they did during 2018, 7 of those
were security releases to fix vulnerabilities (and frequently more than
1 at a time).  That's...not great.  Especially in an environment like
ours where saying "security is important" is an understatement of
nuclear proportions! ;-)

And finally, while we were hopeful that the rewrite in Go (version 3.0
and above) would correct the security failings in the code, there've
already been multiple serious vulnerabilities (all grouped together
under a single CVE identifier, CVE-2019-11328), at least one of which
was essentially a replica of one of the flaws fixed in 2.6.0 under
CVE-2018-12021!  And you don't need to take my word for it, either:
https://www.openwall.com/lists/oss-security/2019/05/16/1

It's hard to say if the above trend will continue...but not all sites
can afford to take those kinds of risks.

And while Shifter's security track record is spotless to date, I would
still summarize the overall lesson to be learned as, "Don't use
privileged container runtimes.  Use user namespaces.  That's what
they're there for."  And before anyone yells at me, yes I know
Singularity advertises user namespace support and non-setuid operation.
But it doesn't seem to be very widely used or adequately exercised, and
AFAICT the default mode of operation in both RPMs and build-from-src is
via setuid binaries.  So using a natively unprivileged runtime still
seems the less risky choice, in my personal assessment.
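
One way to see which mode a given install is in is to look for the setuid bit on the runtime's helper binaries and to check whether the kernel allows user namespaces at all. The following is only a minimal sketch: the function names are made up for illustration, the sysctl path is assumed to be exposed by the running kernel, and no particular runtime's install layout is assumed.

```python
import os
import stat


def is_setuid_mode(mode: int) -> bool:
    """Return True if a st_mode value has the setuid bit set."""
    return bool(mode & stat.S_ISUID)


def is_setuid_root(path: str) -> bool:
    """Return True if `path` is owned by root (uid 0) and has the setuid bit."""
    st = os.stat(path)
    return is_setuid_mode(st.st_mode) and st.st_uid == 0


def max_user_namespaces(proc_file: str = "/proc/sys/user/max_user_namespaces"):
    """Return the kernel's user-namespace limit, or None if it cannot be read.

    Assumption: the running kernel exposes this sysctl; a value of 0 means
    unprivileged user namespaces are effectively disabled.
    """
    try:
        with open(proc_file) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None
```

Pointing `is_setuid_root()` at each binary a runtime package installs (the paths are site-specific) shows whether that install relies on privilege; a `max_user_namespaces()` result of 0 or None suggests the unprivileged route is not available on that node.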

Yes, I know that was more than a "2-3 sentence nutshell," but hopefully
it was helpful anyway! :-)

Michael

-- 
Michael E. Jennings 
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605



Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Fulcomer, Samuel
Hey Michael,

I obviously haven't been keeping up with any security concerns over the use
of Singularity. In a 2-3 sentence nutshell, what are they?

I've been annoyed by NVIDIA's docker distribution for DGX-1 & friends.

We've been setting up an ersatz-secure Singularity environment for use of
mid-range DUA data like dbGaP.

Regards,
Sam



On Thu, Sep 19, 2019 at 4:38 PM Michael Jennings  wrote:

> On Friday, 20 September 2019, at 00:03:28 (+0430),
> Mahmood Naderan wrote:
>
> > For the replies. Matlab was an example. I would also like to create
> > to containers for OpenFoam with different versions. Then a user can
> > choose what he actually wants.
>
> All modern container runtimes support the OCI standard container
> format originally authored by Docker, Inc. and contributed to the Open
> Container Initiative (OCI) as the starting point for their standard.
> So your best bet would be to go to Docker Hub (hub.docker.com) and
> search for the applications you're interested in, or (in the case of
> commercial software) ask your vendor if they supply containers for
> their packages and under what terms.
>
> If you're comfortable with building as root, you can likely build your
> own containers without too much trouble, but in order to build
> containers without privilege, you'll need very recent Podman/Buildah
> (or current Charliecloud plus Skopeo and umoci, if your Dockerfile is
> supported by ch-grow).
>
> > I would also like to know, if the technologies you mentioned can be
> > deployed in multinode clusters. Currently, we use Rocks 7. Should I
> > install singularity (or others) on all nodes or just the frontend?
> > And then, can users use "srun" or "salloc" for interactively login
> > to a node and run the container or not?
>
> Most folks invoke the container runtime using srun, either in their
> job script or as part of an interactive session.  There are several
> examples in the Charliecloud docs, for example, here:
>
> https://hpc.github.io/charliecloud/tutorial.html#your-first-single-node-multi-process-jobs
>
> But yes, you will likely need the container runtime installed on every
> node.  Most large HPC centers use Slurm, so you should have no problem
> getting any or all of them to integrate well with your existing Slurm
> installation. :-)
>
> That said, I *do* recommend watching at least that last video before
> you make your final decision on runtime.  With containers, as with any
> technology, you're far more likely to get factual information from
> folks who aren't trying to sell something! ;-)
>
> Having personally deployed, tested, and evaluated over a dozen
> different container solutions -- including every major HPC container
> system as well as implementing a few of my own -- I can tell you with
> absolute certainty that there's no single right answer to "What
> container system should I use?"  There are several correct answers
> depending on your use case and security & UX requirements.
>
> Michael
>
> --
> Michael E. Jennings 
> HPC Systems Team, Los Alamos National Laboratory
> Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605
>
>


Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Michael Jennings
On Thursday, 19 September 2019, at 20:00:40 (+),
Goetz, Patrick G wrote:

> On 9/19/19 8:22 AM, Thomas M. Payerle wrote:
> > one of our clusters
> > is still running RHEL6, and while containers based on Ubuntu 16,
> > Debian 8, or RHEL7 all appear to work properly,
> > containers based on Ubuntu 18 or Debian 9 will die with "Kernel too
> > old" errors.
> 
> I think the idea generally is to have your container host be the newest 
> version, with containers providing a home for legacy (or at most 
> contemporary) software stacks.  Never heard of anyone trying to do it 
> the other way around, but appreciate this proof of concept that it's a 
> bad idea.

And contrary to popular belief, it's not just the kernel that comes
into play with container compatibility.  There's a surprisingly complex,
nuanced interplay between the kernel, glibc, libgcc, and
/lib64/ld-linux-x86-64.so.2 for almost any containerized application,
let alone the standard challenges of MPI libraries, HSN/GPU device
interfaces/drivers, etc.  With containers, many of the old problems are
new again!  You'd cringe if I told you about some of the nastier
container compatibility challenges we've run into/heard about.

There are general rules of thumb that will help container portability
(limit container size, link statically as much as possible, build
against the oldest possible stuff but run on the newest possible stuff,
et al.), but the biggest one is:  Don't expect containers to solve all
your problems; they don't.  Much of the time you're just exchanging the
prior set of problems for a new set. :-)

And don't forget to test with VMs as well, not just containers.
Depending on complexity, computation and I/O patterns, and other similar
factors, an appropriately paravirtualized VM could wind up being on par
with native/containerized code in many circumstances, or at least within
acceptable tolerances.  And VMs offer many advantages in terms of
safety, separation, and sanity vs. containerization.

Michael

-- 
Michael E. Jennings 
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605



Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Michael Jennings
On Friday, 20 September 2019, at 00:03:28 (+0430),
Mahmood Naderan wrote:

> For the replies. Matlab was an example. I would also like to create
> to containers for OpenFoam with different versions. Then a user can
> choose what he actually wants.

All modern container runtimes support the OCI standard container
format originally authored by Docker, Inc. and contributed to the Open
Container Initiative (OCI) as the starting point for their standard.
So your best bet would be to go to Docker Hub (hub.docker.com) and
search for the applications you're interested in, or (in the case of
commercial software) ask your vendor if they supply containers for
their packages and under what terms.

If you're comfortable with building as root, you can likely build your
own containers without too much trouble, but in order to build
containers without privilege, you'll need very recent Podman/Buildah
(or current Charliecloud plus Skopeo and umoci, if your Dockerfile is
supported by ch-grow).

> I would also like to know, if the technologies you mentioned can be
> deployed in multinode clusters. Currently, we use Rocks 7. Should I
> install singularity (or others) on all nodes or just the frontend?
> And then, can users use "srun" or "salloc" for interactively login
> to a node and run the container or not?

Most folks invoke the container runtime using srun, either in their
job script or as part of an interactive session.  There are several
examples in the Charliecloud docs, for example, here:
https://hpc.github.io/charliecloud/tutorial.html#your-first-single-node-multi-process-jobs

But yes, you will likely need the container runtime installed on every
node.  Most large HPC centers use Slurm, so you should have no problem
getting any or all of them to integrate well with your existing Slurm
installation. :-)
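
As a rough illustration of "invoke the container runtime using srun", the sketch below just assembles the command line that a job script or interactive session would run. The image directory and the command are hypothetical; the only things assumed about the tools are srun's `--ntasks` option and Charliecloud's `ch-run IMAGE -- COMMAND` form.

```python
import shlex


def build_srun_command(image_dir, command, ntasks=1):
    """Build an argv list that runs `command` inside a Charliecloud image via srun."""
    if ntasks < 1:
        raise ValueError("ntasks must be at least 1")
    # Everything after "--" is the command executed inside the container.
    return ["srun", f"--ntasks={ntasks}", "ch-run", image_dir, "--", *command]


# Hypothetical image directory and command, for illustration only.
argv = build_srun_command("/var/tmp/openfoam", ["cat", "/etc/os-release"], ntasks=4)
print(shlex.join(argv))
```

The image directory has to be readable on every node that srun lands on, which is the same constraint as having the runtime itself installed everywhere.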

That said, I *do* recommend watching at least that last video before
you make your final decision on runtime.  With containers, as with any
technology, you're far more likely to get factual information from
folks who aren't trying to sell something! ;-)

Having personally deployed, tested, and evaluated over a dozen
different container solutions -- including every major HPC container
system as well as implementing a few of my own -- I can tell you with
absolute certainty that there's no single right answer to "What
container system should I use?"  There are several correct answers
depending on your use case and security & UX requirements.

Michael

-- 
Michael E. Jennings 
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605



Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Goetz, Patrick G
On 9/19/19 8:22 AM, Thomas M. Payerle wrote:
> one of our clusters
> is still running RHEL6, and while containers based on Ubuntu 16,
> Debian 8, or RHEL7 all appear to work properly,
> containers based on Ubuntu 18 or Debian 9 will die with "Kernel too
> old" errors.

I think the idea generally is to have your container host be the newest 
version, with containers providing a home for legacy (or at most 
contemporary) software stacks.  Never heard of anyone trying to do it 
the other way around, but appreciate this proof of concept that it's a 
bad idea.



Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Renfro, Michael
Never used Rocks, but as far as Slurm or anything else is concerned, 
Singularity is just another program. It will need to be accessible from any 
compute nodes you want to use it on (whether that’s from OS-installed packages, 
from a shared NFS area, or whatever shouldn’t matter).

So your user will still just use srun, salloc, sbatch, or whatever to invoke 
Singularity. My local docs for this are at [1], and they use the normal 
Singularity RPM from EPEL.

[1] https://its.tntech.edu/display/MON/Using+Containers+in+Your+HPC+Account
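
To make the "just another program" point concrete, here is a small sketch that renders an sbatch script wrapping `singularity exec`. The image file name, job name, and command are placeholders, not taken from the docs linked above, and the script is deliberately minimal.

```python
import shlex


def render_batch_script(image, command, job_name="container-job", ntasks=1):
    """Return the text of an sbatch script that runs `command` inside `image`."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        "",
        # From Slurm's point of view this is an ordinary srun of an ordinary program.
        f"srun singularity exec {shlex.quote(image)} {shlex.join(command)}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical image file and command, for illustration only.
script = render_batch_script("openfoam.sif", ["simpleFoam"])
print(script)
```

The resulting text can be saved to a file and submitted with sbatch like any other job script.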

> On Sep 19, 2019, at 2:33 PM, Mahmood Naderan  wrote:
> 
> For the replies. Matlab was an example. I would also like to create to 
> containers for OpenFoam with different versions. Then a user can choose what 
> he actually wants.
> 
> I would also like to know, if the technologies you mentioned can be deployed 
> in multinode clusters. Currently, we use Rocks 7. Should I install 
> singularity (or others) on all nodes or just the frontend?
> And then, can users use "srun" or "salloc" for interactively login to a node 
> and run the container or not?
> 
> Regards,
> Mahmood
> 
> 
> 
> 
> On Thu, Sep 19, 2019 at 8:03 PM Michael Jennings  wrote:
> 
> Docker is the wrong choice for HPC, at least today.  But Podman, from
> Red Hat's CRI-O project, is a drop-in replacement for Docker which
> doesn't use the client-server model of Docker and therefore addresses
> many of the challenges with trying to run Docker for HPC user jobs.
> 
> There's also LANL's Charliecloud, which is a highly optimized
> container runtime that (unlike the other options in this space, save
> Podman) DOES NOT require any root privileges whatsoever, not even at
> install time.  For (hopefully obvious) security reasons, you are far
> safer using one of the unprivileged options.
> 
> Here at Los Alamos, we use both Charliecloud and Podman/Buildah along
> with the Skopeo and umoci tools.  While we do not permit Singularity
> on our systems for security reasons and don't run Shifter because it
> requires privilege, we have had Charliecloud deployed and actively
> used on both our Classified and Open Science systems for well over a
> year now, and we are in the process of getting Podman/Buildah and
> friends into the Secure systems as we speak.
> 
> (Note that all of the above require RHEL7 or higher; if you need RHEL6
> support, you'll want to check out Shifter.)
> 
> Here are some videos of talks that might help you get up-to-speed on
> this subject:
> 
> "LISA18 - Containers and Security on Planet X"
> (https://youtu.be/F3qCvZMzUtE) - Why containers matter for HPC, what
> makes HPC so different from the typical Docker/AppC use cases, and how
> to choose the right solution for your site.
> 
> "Charliecloud - Unprivileged Containers for HPC"
> (https://youtu.be/ESsZgcaP-ZQ) - What containers actually are under
> the hood, how they work, what they are good for, and how to get up and
> running with Charliecloud in under 5 minutes.
> 
> "Container Mythbusters" (https://youtu.be/FFyXdgWXD3A) - Dispelling
> common misconceptions and debunking propaganda around containers,
> container runtime security, and when/how you should (and should NOT)
> use containers.
> 
> Hope those help!
> Michael
> 
> -- 
> Michael E. Jennings 
> HPC Systems Team, Los Alamos National Laboratory
> Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605
> 



Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Mahmood Naderan
Thanks for the replies. Matlab was an example. I would also like to create
containers for OpenFoam with different versions. Then a user can choose
the one he actually wants.

I would also like to know whether the technologies you mentioned can be
deployed in multinode clusters. Currently, we use Rocks 7. Should I install
Singularity (or others) on all nodes or just the frontend?
And then, can users use "srun" or "salloc" to log in to a node
interactively and run the container or not?

Regards,
Mahmood




On Thu, Sep 19, 2019 at 8:03 PM Michael Jennings  wrote:

>
> Docker is the wrong choice for HPC, at least today.  But Podman, from
> Red Hat's CRI-O project, is a drop-in replacement for Docker which
> doesn't use the client-server model of Docker and therefore addresses
> many of the challenges with trying to run Docker for HPC user jobs.
>
> There's also LANL's Charliecloud, which is a highly optimized
> container runtime that (unlike the other options in this space, save
> Podman) DOES NOT require any root privileges whatsoever, not even at
> install time.  For (hopefully obvious) security reasons, you are far
> safer using one of the unprivileged options.
>
> Here at Los Alamos, we use both Charliecloud and Podman/Buildah along
> with the Skopeo and umoci tools.  While we do not permit Singularity
> on our systems for security reasons and don't run Shifter because it
> requires privilege, we have had Charliecloud deployed and actively
> used on both our Classified and Open Science systems for well over a
> year now, and we are in the process of getting Podman/Buildah and
> friends into the Secure systems as we speak.
>
> (Note that all of the above require RHEL7 or higher; if you need RHEL6
> support, you'll want to check out Shifter.)
>
> Here are some videos of talks that might help you get up-to-speed on
> this subject:
>
> "LISA18 - Containers and Security on Planet X"
> (https://youtu.be/F3qCvZMzUtE) - Why containers matter for HPC, what
> makes HPC so different from the typical Docker/AppC use cases, and how
> to choose the right solution for your site.
>
> "Charliecloud - Unprivileged Containers for HPC"
> (https://youtu.be/ESsZgcaP-ZQ) - What containers actually are under
> the hood, how they work, what they are good for, and how to get up and
> running with Charliecloud in under 5 minutes.
>
> "Container Mythbusters" (https://youtu.be/FFyXdgWXD3A) - Dispelling
> common misconceptions and debunking propaganda around containers,
> container runtime security, and when/how you should (and should NOT)
> use containers.
>
> Hope those help!
> Michael
>
> --
> Michael E. Jennings 
> HPC Systems Team, Los Alamos National Laboratory
> Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605
>
>


Re: [slurm-users] Sharing a single machine between two groups; What's the best way define this in slurm config?

2019-09-19 Thread Paul Edmon
Probably your best bet is to use QoS's to accomplish this.  Be advised 
that suspending jobs still leaves them in memory space.


-Paul Edmon-

On 9/18/19 9:16 PM, Benjamin Wong wrote:

Hello,

I plan to purchase a GPU machine with 8 GPUs which will be shared 
between group A and group B.  Group A is an existing group with SLURM 
nodes.  Group B has no SLURM nodes but will have access to half of the 
resources on one SLURM node.  I'm trying to figure out how to get 
SLURM to implement the policies I want below:


  * If both groups are using the machine evenly, then I want the
resources to be split evenly.
  * If only group A is using the resources, then they will consume all
the resources and vice versa.
  * If group A is using all resources but group B begins requesting
resources, then group A will suspend half of its work for group B
to use resources.  Vice versa applies.
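
The three rules above can be written down as a small allocation function, which may help when checking whether a given Slurm setup produces the intended split. This is only a sketch of the target policy, not of any Slurm configuration: the function name is made up, and demand is assumed to be expressed in whole GPUs.

```python
def split_gpus(total, demand_a, demand_b):
    """Return (gpus_a, gpus_b) under the even-share-with-borrowing policy."""
    if total < 0 or demand_a < 0 or demand_b < 0:
        raise ValueError("total and demands must be non-negative")
    half_a = total // 2
    half_b = total - half_a
    # Each group first gets what it asks for, up to its own half.
    a = min(demand_a, half_a)
    b = min(demand_b, half_b)
    # Whatever is still idle can be borrowed by a group with unmet demand.
    idle = total - a - b
    extra_a = min(demand_a - a, idle)
    a += extra_a
    idle -= extra_a
    b += min(demand_b - b, idle)
    return a, b


print(split_gpus(8, 8, 0))  # only group A is busy
print(split_gpus(8, 8, 8))  # both groups want the whole machine
print(split_gpus(8, 8, 2))  # group B needs only part of its half
```

When group B's demand rises from 0 to 8, the target for group A drops from 8 to 4; getting from one state to the other is where suspension or preemption comes in.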

What's the best way to implement this?  Should I have two halves of a 
machine in two different partitions?


Looking forward to hints,
Ben Wong


Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Renfro, Michael
MATLAB container at NVIDIA’s NGC: 
https://ngc.nvidia.com/catalog/containers/partners:matlab

Should be compatible with Docker and Singularity, but read the fine print on 
licensing.

> On Sep 19, 2019, at 8:22 AM, Thomas M. Payerle  wrote:
> 
> While I agree containers can be quite useful in HPC environments for
> dealing with applications requiring
> different library versions, there are limitations.  In particular, the
> kernel inside the container is the same
> as running outside the container.  Where this seems to be most
> problematic is when trying to use containers
> based on a much newer OS distribution than the distribution of the
> containing system.  I.e., one of our clusters
> is still running RHEL6, and while containers based on Ubuntu 16,
> Debian 8, or RHEL7 all appear to work properly,
> containers based on Ubuntu 18 or Debian 9 will die with "Kernel too
> old" errors.  (Basically, the glibc in those
> distros require a newer kernel than shipped with RHEL6).  VMs should
> not experience those issues, as the
> kernel running in the VM need not be the same kernel as running in the
> host system.
> 
> But I have found containers helpful (we use Singularity), particularly
> for applications.  Not as useful for software libraries,
> as those tend to not want to be "self-contained" and containers are
> all about "self-contained".
> 
> I am unaware of a container image for Matlab, but I suspect that is
> more due to licensing/support than technical issues.
> You could probably build a Matlab container based on some Mathworks
> supported distribution and run on a distribution
> not supported by Mathworks, but I doubt Mathworks would be willing to
> provide support for that mode of operation.
> 
> 
> 
> On Thu, Sep 19, 2019 at 6:55 AM Mahmood Naderan  wrote:
>> 
>> Thanks. Singularity seems to be interesting. I will try it.
>> 
>> Regards,
>> Mahmood
>> 
>> 
>> 
>> 
>> On Thu, Sep 19, 2019 at 2:49 PM Christoph Brüning 
>>  wrote:
>>> 
>>> Dear Mahmood,
>>> 
>>> Docker is somewhat tricky, because it needs a daemon running and there
>>> is no fine grained control over who is allowed to start and stop
>>> containers. Also getting the container on the node can be unpleasant
>>> (docker hub? private registry? build docker containers on the node
>>> before running them?). I would recommend against it!
>>> 
>>> However, there are projects like Singularity or Charliecloud designed to
>>> bring the "bring your own environment" idea to HPC.
>>> 
>>> We have Singularity installed, and some of our users use it. It seems to
>>> work reasonably well, as I have heard no complaint except that the
>>> available version is somewhat outdated...
>>> 
>>> Best,
>>> Christoph
>>> 
>>> 
>>> On 19/09/2019 10.08, Mahmood Naderan wrote:
>>>> Hi
>>>> The question is not directly related to Slurm, but is actually related
>>>> to the people in this community.
>>>>
>>>> For heterogeneous environments, where different operating systems,
>>>> application and library versions are needed for HPC users, I would like
>>>> to know it using docker/containers is better than yielding virtual
>>>> machines?
>>>>
>>>> Actually, it is lighter than VM, however, I haven't seen a docker image
>>>> for Matlab for example. If that is possible, can Slurm be used to
>>>> schedule containers?
>>>> If someone has any experience using docker in HPC clusters, please let
>>>> me know.
>>>>
>>>>
>>>> Regards,
>>>> Mahmood
>>>>
>>>>
>>> 
>>> --
>>> Dr. Christoph Brüning
>>> Universität Würzburg
>>> Rechenzentrum
>>> Am Hubland
>>> D-97074 Würzburg
>>> Tel.: +49 931 31-80499
>>> 
> 
> 
> --
> Tom Payerle
> DIT-ACIGS/Mid-Atlantic Crossroads   paye...@umd.edu
> 5825 University Research Park   (301) 405-6135
> University of Maryland
> College Park, MD 20740-3831



Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Thomas M. Payerle
While I agree containers can be quite useful in HPC environments for
dealing with applications requiring different library versions, there
are limitations.  In particular, the kernel inside the container is the
same as the one running outside the container.  Where this seems to be
most problematic is when trying to use containers based on a much newer
OS distribution than the distribution of the containing system.  I.e.,
one of our clusters is still running RHEL6, and while containers based
on Ubuntu 16, Debian 8, or RHEL7 all appear to work properly, containers
based on Ubuntu 18 or Debian 9 will die with "Kernel too old" errors.
(Basically, the glibc in those distros requires a newer kernel than the
one shipped with RHEL6.)  VMs should not experience those issues, as the
kernel running in the VM need not be the same kernel as running in the
host system.
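
The "Kernel too old" failure comes down to a version comparison between the host kernel and the minimum kernel that the container's glibc was built to accept. A minimal sketch of that comparison follows; the release strings are just examples of what EL6- and EL7-style hosts report, and the 3.2 minimum is an assumed value for illustration, since the real threshold depends on how each distribution built its glibc.

```python
import re


def parse_kernel_release(release):
    """Extract (major, minor, patch) from a string such as '2.6.32-754.el6.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def kernel_too_old(host_release, minimum):
    """Return True if the host kernel is older than the `minimum` version tuple."""
    return parse_kernel_release(host_release) < tuple(minimum)


# Assumed minimum of 3.2 for the container's glibc (illustrative only).
print(kernel_too_old("2.6.32-754.el6.x86_64", (3, 2, 0)))   # True
print(kernel_too_old("3.10.0-1062.el7.x86_64", (3, 2, 0)))  # False
```

On a live node, `platform.release()` from the standard library supplies the host release string to feed into `kernel_too_old()`.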

But I have found containers helpful (we use Singularity), particularly
for applications.  Not as useful for software libraries, as those tend
to not want to be "self-contained" and containers are all about
"self-contained".

I am unaware of a container image for Matlab, but I suspect that is more
due to licensing/support than technical issues.  You could probably
build a Matlab container based on some Mathworks supported distribution
and run on a distribution not supported by Mathworks, but I doubt
Mathworks would be willing to provide support for that mode of
operation.



On Thu, Sep 19, 2019 at 6:55 AM Mahmood Naderan  wrote:
>
> Thanks. Singularity seems to be interesting. I will try it.
>
> Regards,
> Mahmood
>
>
>
>
> On Thu, Sep 19, 2019 at 2:49 PM Christoph Brüning 
>  wrote:
>>
>> Dear Mahmood,
>>
>> Docker is somewhat tricky, because it needs a daemon running and there
>> is no fine grained control over who is allowed to start and stop
>> containers. Also getting the container on the node can be unpleasant
>> (docker hub? private registry? build docker containers on the node
>> before running them?). I would recommend against it!
>>
>> However, there are projects like Singularity or Charliecloud designed to
>> bring the "bring your own environment" idea to HPC.
>>
>> We have Singularity installed, and some of our users use it. It seems to
>> work reasonably well, as I have heard no complaint except that the
>> available version is somewhat outdated...
>>
>> Best,
>> Christoph
>>
>>
>> On 19/09/2019 10.08, Mahmood Naderan wrote:
>> > Hi
>> > The question is not directly related to Slurm, but is actually related
>> > to the people in this community.
>> >
>> > For heterogeneous environments, where different operating systems,
>> > application and library versions are needed for HPC users, I would like
>> > to know it using docker/containers is better than yielding virtual 
>> > machines?
>> >
>> > Actually, it is lighter than VM, however, I haven't seen a docker image
>> > for Matlab for example. If that is possible, can Slurm be used to
>> > schedule containers?
>> > If someone has any experience using docker in HPC clusters, please let
>> > me know.
>> >
>> >
>> > Regards,
>> > Mahmood
>> >
>> >
>>
>> --
>> Dr. Christoph Brüning
>> Universität Würzburg
>> Rechenzentrum
>> Am Hubland
>> D-97074 Würzburg
>> Tel.: +49 931 31-80499
>>


-- 
Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads   paye...@umd.edu
5825 University Research Park   (301) 405-6135
University of Maryland
College Park, MD 20740-3831



[slurm-users] slurm config :: set up a workdir for each job

2019-09-19 Thread Adrian Sevcenco
Hi! Is there a method for setting up a work directory unique to each
job from a system setting, and then cleaning it up afterwards?


Can I somehow use the prologue and epilogue sections?
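
For concreteness, the kind of helper such a prolog/epilog pair might call could look like the sketch below. It is only an illustration of the idea in the question: the base directory and the `job_<id>` naming are hypothetical, and it assumes the job ID is available to the scripts (Slurm exports `SLURM_JOB_ID` to them) and that the prolog runs with enough privilege to hand the directory to the job's user.

```python
import os
import shutil


def job_workdir(base, job_id):
    """Return the per-job directory path under `base` (naming is hypothetical)."""
    return os.path.join(base, f"job_{job_id}")


def create_workdir(base, job_id, owner_uid=None):
    """Create the per-job directory; meant to be called from a prolog."""
    path = job_workdir(base, job_id)
    os.makedirs(path, mode=0o700, exist_ok=True)
    if owner_uid is not None:
        # Hand the directory to the job's user; -1 leaves the group unchanged.
        os.chown(path, owner_uid, -1)
    return path


def remove_workdir(base, job_id):
    """Delete the per-job directory; meant to be called from an epilog."""
    shutil.rmtree(job_workdir(base, job_id), ignore_errors=True)


def job_id_from_env(environ=os.environ):
    """Return the Slurm job ID from the environment, or None outside a job."""
    return environ.get("SLURM_JOB_ID")
```

A prolog would call `create_workdir()` with the value from `job_id_from_env()`, and the matching epilog would call `remove_workdir()` with the same arguments.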

Thank you!
Adrian


--
--
Adrian Sevcenco, Ph.D.   |
Institute of Space Science - ISS, Romania|
adrian.sevcenco at {cern.ch,spacescience.ro} |
--



Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Mahmood Naderan
Thanks. Singularity seems to be interesting. I will try it.

Regards,
Mahmood




On Thu, Sep 19, 2019 at 2:49 PM Christoph Brüning <
christoph.bruen...@uni-wuerzburg.de> wrote:

> Dear Mahmood,
>
> Docker is somewhat tricky, because it needs a daemon running and there
> is no fine grained control over who is allowed to start and stop
> containers. Also getting the container on the node can be unpleasant
> (docker hub? private registry? build docker containers on the node
> before running them?). I would recommend against it!
>
> However, there are projects like Singularity or Charliecloud designed to
> bring the "bring your own environment" idea to HPC.
>
> We have Singularity installed, and some of our users use it. It seems to
> work reasonably well, as I have heard no complaint except that the
> available version is somewhat outdated...
>
> Best,
> Christoph
>
>
> On 19/09/2019 10.08, Mahmood Naderan wrote:
> > Hi
> > The question is not directly related to Slurm, but is actually related
> > to the people in this community.
> >
> > For heterogeneous environments, where different operating systems,
> > application and library versions are needed for HPC users, I would like
> > to know it using docker/containers is better than yielding virtual
> machines?
> >
> > Actually, it is lighter than VM, however, I haven't seen a docker image
> > for Matlab for example. If that is possible, can Slurm be used to
> > schedule containers?
> > If someone has any experience using docker in HPC clusters, please let
> > me know.
> >
> >
> > Regards,
> > Mahmood
> >
> >
>
> --
> Dr. Christoph Brüning
> Universität Würzburg
> Rechenzentrum
> Am Hubland
> D-97074 Würzburg
> Tel.: +49 931 31-80499
>
>


Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Christoph Brüning

Dear Mahmood,

Docker is somewhat tricky, because it needs a daemon running and there 
is no fine-grained control over who is allowed to start and stop 
containers. Also, getting the container onto the node can be unpleasant 
(Docker Hub? private registry? build Docker containers on the node 
before running them?). I would recommend against it!


However, there are projects like Singularity or Charliecloud designed to 
bring the "bring your own environment" idea to HPC.


We have Singularity installed, and some of our users use it. It seems to 
work reasonably well, as I have heard no complaint except that the 
available version is somewhat outdated...


Best,
Christoph


On 19/09/2019 10.08, Mahmood Naderan wrote:

Hi
The question is not directly related to Slurm, but is actually related 
to the people in this community.


For heterogeneous environments, where different operating systems, 
application and library versions are needed for HPC users, I would like 
to know it using docker/containers is better than yielding virtual machines?


Actually, it is lighter than VM, however, I haven't seen a docker image 
for Matlab for example. If that is possible, can Slurm be used to 
schedule containers?
If someone has any experience using docker in HPC clusters, please let 
me know.



Regards,
Mahmood




--
Dr. Christoph Brüning
Universität Würzburg
Rechenzentrum
Am Hubland
D-97074 Würzburg
Tel.: +49 931 31-80499



Re: [slurm-users] Heterogeneous HPC

2019-09-19 Thread Juergen Salk
Hello Mahmood,

In our current system (which does not run with Slurm) we have deployed 
the community edition of Singularity as a software module. 

 https://sylabs.io/singularity/

I have no practical experience yet but from what I've read so far,
Singularity is also supposed to work quite well with Slurm. Actually,
from the scheduler's point of view, running a Singularity container
image is very much like running any other application anyway.

However, the provision of images with commercial software installed
inside may also be subject to licensing terms that need to be
resolved.

Best regards
Jürgen

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471


* Mahmood Naderan  [190919 12:38]:
> Hi
> The question is not directly related to Slurm, but is actually related to
> the people in this community.
> 
> For heterogeneous environments, where different operating systems,
> application and library versions are needed for HPC users, I would like to
> know it using docker/containers is better than yielding virtual machines?
> 
> Actually, it is lighter than VM, however, I haven't seen a docker image for
> Matlab for example. If that is possible, can Slurm be used to schedule
> containers?
> If someone has any experience using docker in HPC clusters, please let me
> know.
> 
> 
> Regards,
> Mahmood

-- 
GPG A997BA7A | 87FC DA31 5F00 C885 0DC3  E28F BD0D 4B33 A997 BA7A



[slurm-users] Heterogeneous HPC

2019-09-19 Thread Mahmood Naderan
Hi
The question is not directly related to Slurm, but is actually related to
the people in this community.

For heterogeneous environments, where different operating systems,
application and library versions are needed for HPC users, I would like to
know if using Docker/containers is better than providing virtual machines.

Containers are lighter than VMs; however, I haven't seen a Docker image
for Matlab, for example. If that is possible, can Slurm be used to schedule
containers?
If someone has any experience using Docker in HPC clusters, please let me
know.


Regards,
Mahmood