Re: [easybuild] easybuild new user experience

2015-02-11 Thread Olav Smørholm
On Wed, Feb 11, 2015 at 10:30:30AM -0600, Jack Perdue wrote:
> On 02/11/2015 10:09 AM, Cook, Malcolm wrote:
> >Hi Fotis & Stuart,
> >
> >>>What do people do/recommend for multiple OS environments?  We are
> >  >> currently CentOS 6 but will eventually move to C7.  I'm thinking I
> >  >> will want a separate application tree for each OS (/projects/app-c6
> >  >> and /projects/app-c7).
> >  >>
> >  >> How do people deal with software with frequent updates (java) or
> >  >> security issues?  Do you rebuild and remove old packages?
> >  >
> >  >You may be able to handle both of the above needs,
> >  >by using the concept of buildsets, mentioned in p. over here:
> >  >https://archive.fosdem.org/2014/schedule/event/hpc_devroom_hpcbios/attachments/slides/491/export/events/attachments/hpc_devroom_hpcbios/slides/491/FOSDEM14_HPC_devroom_09_HPC_BIOS.pdf
> >  >
> >  >In principle, the idea is that you create self-contained directory areas
> >  >with complete build trees, including modules, at a given point in time.
> >  >I've been calling them /opt/apps/HPCBIOS.MMDD but any kind of tag will do.
> >  >
> >  >Then you might create symlinks like:
> >  >  /opt/apps/sandybridge -> /opt/apps/*.MMDD
> >  >
> >  >I used a dubious example name above but you get the idea.
> >
> >Fotis, I note your example name, "sandybridge", apparently encodes an Intel 
> >processor microarchitecture, NOT the name of a Linux distribution (such as 
> >c6 or c7 for releases of CentOS, as proposed).  I'm trying to understand 
> >the implications of possibly needing to support a heterogeneous environment 
> >having multiple CentOS versions (6.5 and 7.x) on multiple core types 
> >(sandybridge) and would appreciate any more clarity here.  Are you possibly 
> >suggesting that buildsets for each combination of microprocessor and OS 
> >version might be appropriate
> >(provoking visions of /opt/apps/{sandybridge,Nehalem}/centOS{6,7}/MMDD )
> >
> >??
> We certainly have need for such a thing.  Our cluster is mostly Ivy Bridge
> (with AVX support), including the login nodes where I do most of my builds.
> However, we have some systems (1-2TB) with older chipsets that don't
> have AVX.  So anything built with optarch=True (i.e. -xHost) on the login
> nodes won't work on the big mem nodes.
> 
> I could do the build on the big mem (older chipset) nodes but then won't
> get the benefit of AVX on the newer nodes.  So somehow providing a way
> to distinguish based on the chip set would be kind of nice.

You can do this with modules; you could have a look at how Lmod does this.
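
As a rough illustration of the module-based idea (independent of how Lmod itself handles it), a login script could pick a module tree from the CPU flags. The tree names (`avx`, `generic`), the `/opt/apps` base and both function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical sketch: choose an architecture-specific module tree.
# The tree names are illustrative, not an EasyBuild or Lmod convention.

# Print "avx" if the given cpuinfo file lists the avx flag, else "generic".
cpu_tree() {
    cpuinfo="${1:-/proc/cpuinfo}"
    if grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw 'avx'; then
        echo avx
    else
        echo generic
    fi
}

# Print the MODULEPATH entry for this node (the base directory is an assumption).
arch_modulepath() {
    base="${1:-/opt/apps}"
    echo "$base/$(cpu_tree "$2")/modules/all"
}

# Typical use from a login script:
#   MODULEPATH="$(arch_modulepath /opt/apps):$MODULEPATH"; export MODULEPATH
```

Each tree would then be populated by building on a node of the matching type.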


-- 
Olav
> ($.02)
> 
> jack
> 


Re: [easybuild] easybuild new user experience

2015-02-11 Thread Jack Perdue

On 02/11/2015 10:09 AM, Cook, Malcolm wrote:
> Hi Fotis & Stuart,
>
>  >> What do people do/recommend for multiple OS environments?  We are
>  >> currently CentOS 6 but will eventually move to C7.  I'm thinking I
>  >> will want a separate application tree for each OS (/projects/app-c6
>  >> and /projects/app-c7).
>  >>
>  >> How do people deal with software with frequent updates (java) or
>  >> security issues?  Do you rebuild and remove old packages?
>  >
>  >You may be able to handle both of the above needs,
>  >by using the concept of buildsets, mentioned in p. over here:
>  >https://archive.fosdem.org/2014/schedule/event/hpc_devroom_hpcbios/attachments/slides/491/export/events/attachments/hpc_devroom_hpcbios/slides/491/FOSDEM14_HPC_devroom_09_HPC_BIOS.pdf
>  >
>  >In principle, the idea is that you create self-contained directory areas
>  >with complete build trees, including modules, at a given point in time.
>  >I've been calling them /opt/apps/HPCBIOS.MMDD but any kind of tag will do.
>  >
>  >Then you might create symlinks like:
>  >  /opt/apps/sandybridge -> /opt/apps/*.MMDD
>  >
>  >I used a dubious example name above but you get the idea.
>
> Fotis, I note your example name, "sandybridge", apparently encodes an Intel
> processor microarchitecture, NOT the name of a Linux distribution (such as
> c6 or c7 for releases of CentOS, as proposed).  I'm trying to understand the
> implications of possibly needing to support a heterogeneous environment
> having multiple CentOS versions (6.5 and 7.x) on multiple core types
> (sandybridge) and would appreciate any more clarity here.  Are you possibly
> suggesting that buildsets for each combination of microprocessor and OS
> version might be appropriate
> (provoking visions of /opt/apps/{sandybridge,Nehalem}/centOS{6,7}/MMDD )
>
> ??

We certainly have need for such a thing.  Our cluster is mostly Ivy Bridge
(with AVX support), including the login nodes where I do most of my builds.
However, we have some systems (1-2TB) with older chipsets that don't
have AVX.  So anything built with optarch=True (i.e. -xHost) on the login
nodes won't work on the big mem nodes.

I could do the build on the big mem (older chipset) nodes but then won't
get the benefit of AVX on the newer nodes.  So somehow providing a way
to distinguish based on the chip set would be kind of nice.

($.02)

jack


RE: [easybuild] easybuild new user experience

2015-02-11 Thread Cook, Malcolm
Hi Fotis & Stuart,

>> What do people do/recommend for multiple OS environments?  We are
 >> currently CentOS 6 but will eventually move to C7.  I'm thinking I
 >> will want a separate application tree for each OS (/projects/app-c6
 >> and /projects/app-c7).
 >>
 >> How do people deal with software with frequent updates (java) or
 >> security issues?  Do you rebuild and remove old packages?
 >
 >You may be able to handle both of the above needs,
 >by using the concept of buildsets, mentioned in p. over here:
 >https://archive.fosdem.org/2014/schedule/event/hpc_devroom_hpcbios/attachments/slides/491/export/events/attachments/hpc_devroom_hpcbios/slides/491/FOSDEM14_HPC_devroom_09_HPC_BIOS.pdf
 >
 >In principle, the idea is that you create self-contained directory areas
 >with complete build trees, including modules, at a given point in time.
 >I've been calling them /opt/apps/HPCBIOS.MMDD but any kind of tag will do.
 >
 >Then you might create symlinks like:
 >  /opt/apps/sandybridge -> /opt/apps/*.MMDD
 >
 >I used a dubious example name above but you get the idea.

Fotis, I note your example name, "sandybridge", apparently encodes an Intel 
processor microarchitecture, NOT the name of a Linux distribution (such as c6 
or c7 for releases of CentOS, as proposed).  I'm trying to understand the 
implications of possibly needing to support a heterogeneous environment having 
multiple CentOS versions (6.5 and 7.x) on multiple core types (sandybridge) and 
would appreciate any more clarity here.  Are you possibly suggesting that 
buildsets for each combination of microprocessor and OS version might be 
appropriate
(provoking visions of /opt/apps/{sandybridge,Nehalem}/centOS{6,7}/MMDD )

??

 ~Malcolm


Re: [easybuild] easybuild new user experience

2015-02-11 Thread Fotis Georgatos

Hi Stuart,

On Feb 11, 2015, at 2:32 AM, Stuart Barkley  wrote:
> I've been looking at easybuild for a little while now and it appears
> to be a suitable framework for a good portion of our needs.

If you provide software for HPC you are on the right road.

> I think I eventually used "--dry-run --robot" and grepped the output
> to determine the full dependency lists and create individual 'eb
> --stop=fetch --force' commands for every package dependency.  I also
> used something like this to determine an internal build order so I
> didn't need to --robot 50 packages which would all churn and fail on
> the same dependency.

I think this is the best you can do to pre-download sources nowadays,
in case you don’t have continuous network access, as you described.
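
The procedure described above could be scripted roughly as follows. It uses only the options mentioned in the quoted text (`--dry-run`, `--robot`, `--stop=fetch`, `--force`); the function names are hypothetical, and the parsing assumes each easyconfig shows up in the dry-run output as a whitespace-separated path ending in `.eb`:

```shell
#!/bin/sh
# Sketch: fetch sources for an easyconfig and all of its dependencies,
# continuing past failed downloads. Assumes `eb` is on the PATH.

# Extract easyconfig paths from dry-run output on stdin,
# keeping the original order and dropping repeats.
list_easyconfigs() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /\.eb$/ && !seen[$i]++) print $i }'
}

# Run only the fetch step for each easyconfig; collect failures instead of stopping.
fetch_all() {
    failed=""
    for ec in $(eb "$1" --dry-run --robot | list_easyconfigs); do
        eb "$ec" --stop=fetch --force || failed="$failed $ec"
    done
    [ -z "$failed" ] || { echo "fetch failed for:$failed" >&2; return 1; }
}

# Example: fetch_all HPL-2.0-goolf-1.4.10.eb
```

Because the order of the dry-run output is preserved, the same list could also serve as a rough build order.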

> What do people do/recommend for multiple OS environments?  We are
> currently CentOS 6 but will eventually move to C7.  I'm thinking I
> will want a separate application tree for each OS (/projects/app-c6
> and /projects/app-c7).
> 
> How do people deal with software with frequent updates (java) or
> security issues?  Do you rebuild and remove old packages?

You may be able to handle both of the above needs, 
by using the concept of buildsets, mentioned in p. over here:
https://archive.fosdem.org/2014/schedule/event/hpc_devroom_hpcbios/attachments/slides/491/export/events/attachments/hpc_devroom_hpcbios/slides/491/FOSDEM14_HPC_devroom_09_HPC_BIOS.pdf

In principle, the idea is that you create self-contained directory areas
with complete build trees, including modules, at a given point in time.
I’ve been calling them /opt/apps/HPCBIOS.MMDD but any kind of tag will do.

Then you might create symlinks like:
/opt/apps/sandybridge -> /opt/apps/*.MMDD

I used a dubious example name above but you get the idea.

The point is that you change your symlinks during maintenance windows,
so that you don’t break your running jobs.
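
A minimal sketch of such a symlink switch, assuming GNU coreutils (`mv -T`); the function name and the example tag `HPCBIOS.0211` are hypothetical:

```shell
#!/bin/sh
# Sketch: repoint a symlink at a dated buildset directory.
# The old buildset directory is left untouched, so rolling back is cheap.

# switch_buildset LINK TARGET: make LINK point at TARGET, replacing any old link.
switch_buildset() {
    link="$1"; target="$2"
    [ -d "$target" ] || { echo "no such buildset: $target" >&2; return 1; }
    # Create the new link under a temporary name, then rename it into place,
    # so there is no moment at which LINK does not exist.
    ln -s "$target" "$link.new.$$" && mv -T "$link.new.$$" "$link"
}

# Example (during a maintenance window):
#   switch_buildset /opt/apps/sandybridge /opt/apps/HPCBIOS.0211
# Rolling back is the same call with the previous buildset as TARGET.
```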

> Do people recommend updating toolchains and the dependencies (e.g.
> eliminate multiple versions of python 2.7)?  Since this will be our
> initial build I don't really need python 2.7.3, 2.7.5 or 2.7.6.
> Should I just use the --try options to build for 2.7.8?

If you do life sciences/bioinformatics you’ll likely need Python/2.7.3;
you can always try to upgrade to something newer, but
I’d advise you to do this as a second stage, or you’ll be distracted
by the compatibility issues that quickly arise in these situations.

> information about the 1.7 toolchain, I need to review that email
> thread.

I’m the perpetrator; it’s basically a fresh GCC/4.8.x toolchain,
which should serve for a little while. If I were you, I would do the
above step and then try something like:

eb HPCBIOS_LifeSciences*eb --try-toolchain=goolf,1.7.20 -r

That should provide for a quite interesting effect :)

Feel free to branch out from there with other improvements; there is plenty you can do.

> Is there any planned support for beta/test packages?  At the present
> time I plan to have two separate easybuild installations and after
> building/testing an application in the beta installation I'll rebuild
> it in the production installation.  (With C6 and C7 installations this
> might multiply to 4 easybuild installations.)

I am afraid that yes, it’s the only way, if you want to play it safe.

A more systematic way to handle it is to create “future” buildsets
which you test extensively before a maintenance window; once
you arrive at the crossing point, you move symlinks around.
You can easily roll-back and roll-forward on the cheap, in this way.

> Is there any planned support (or something I'm missing) for allowing
> someone else to use an existing easybuild installation (including
> toolchains and other packages) to build a test/prototype package?  I
> would like to have other staff be able to build local test packages
> off of the production toolchains and be able to give me a set of
> proposed/working .eb files for final build.  This is similar to the
> above beta question, except I don't expect others to be able to
> maintain their own easybuild installation (and don't want them writing
> into the production installation).

Try MODULEPATH=$MODULES_USERAREA:$MODULES_SYSTEMAREA
You may also add $MODULES_GROUPAREA in the mix, you get the idea.
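
Spelled out a little, the layering could look like the sketch below. The `MODULES_*AREA` variable names come from the line above; the default locations and the helper function are assumptions:

```shell
#!/bin/sh
# Sketch: put user and group module areas in front of the system area.

# Join the non-empty arguments with ':' (first argument has highest priority).
join_modulepath() {
    result=""
    for dir in "$@"; do
        [ -n "$dir" ] || continue
        result="${result:+$result:}$dir"
    done
    echo "$result"
}

# Default locations are assumptions; override them in the environment.
MODULES_USERAREA="${MODULES_USERAREA:-$HOME/.local/easybuild/modules/all}"
MODULES_GROUPAREA="${MODULES_GROUPAREA:-}"   # optional, empty by default
MODULES_SYSTEMAREA="${MODULES_SYSTEMAREA:-/opt/apps/modules/all}"

MODULEPATH="$(join_modulepath "$MODULES_USERAREA" "$MODULES_GROUPAREA" "$MODULES_SYSTEMAREA")"
export MODULEPATH
```

With this ordering, a module of the same name in the user area shadows the system one, while the production toolchains remain visible.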

Many HPC users can actually maintain their own easyconfigs quite well.
They have much bigger problems to solve most of the time :)
If they are Python-averse though, just invite them to write shell scripts;
it’s then trivial to convert those to the EB way of doing things.

> Thanks for the work on easybuild.  This looks like it will address a
> long standing need for building stable versions of software for our
> users.

The best thanks is sharing easyconfigs ;-)

enjoy,
Fotis


-- 
echo "sysadmin know better bash than english" | sed s/min/mins/ \
  | sed 's/better bash/bash better/' # signal detected in a CERN forum








Re: [easybuild] easybuild new user experience

2015-02-11 Thread Ward Poelmans
On Wed, Feb 11, 2015 at 2:32 AM, Stuart Barkley  wrote:
> I've been looking at easybuild for a little while now and it appears
> to be a suitable framework for a good portion of our needs.

Welcome to EB!

> My cluster does not have direct access to the Internet so I can't use
> any of the automated download procedures.  I actually consider this a
> feature in that I eventually know exactly what software is running and
> that nothing was downloaded and installed that wasn't intended.

What is stopping your users from copying other software to your
cluster and running it? Or using an SSH port forward/SOCKS proxy to get
Internet access on your cluster?

> Dependencies/source package downloads:
>
> I think I eventually used "--dry-run --robot" and grepped the output
> to determine the full dependency lists and create individual 'eb
> --stop=fetch --force' commands for every package dependency.  I also
> used something like this to determine an internal build order so I
> didn't need to --robot 50 packages which would all churn and fail on
> the same dependency.

We could create a `--fetch-only` option that doesn't stop on a failed download.


> Some questions:
>
> How do people deal with software with frequent updates (java) or
> security issues?  Do you rebuild and remove old packages?

For security-sensitive packages, we use the OS-provided ones
(OpenSSL, glibc, ...) and let the OS update them.



> Is there any planned support (or something I'm missing) for allowing
> someone else to use an existing easybuild installation (including
> toolchains and other packages) to build a test/prototype package?  I
> would like to have other staff be able to build local test packages
> off of the production toolchains and be able to give me a set of
> proposed/working .eb files for final build.  This is similar to the
> above beta question, except I don't expect others to be able to
> maintain their own easybuild installation (and don't want them writing
> into the production installation).

You can use a different `--installpath=` for the test packages. If the
production toolchains are in the module path, they will be used.
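
A small sketch of that setup, printing the command for review instead of running it. Both paths, the function name and the easyconfig name `foo-1.0.eb` are hypothetical:

```shell
#!/bin/sh
# Sketch: build a test package into a private tree while reusing the
# production toolchain modules.

PROD_MODULES="${PROD_MODULES:-/opt/apps/modules/all}"   # assumed production module tree
TEST_PREFIX="${TEST_PREFIX:-$HOME/eb-test}"             # assumed private install tree

# Print the command that would be run, so it can be checked before use.
test_build_cmd() {
    echo "MODULEPATH=$PROD_MODULES:\$MODULEPATH eb --installpath=$TEST_PREFIX $1"
}

# Example: test_build_cmd foo-1.0.eb
```

The resulting test modules land under the private tree, so that tree has to be added to the module path before they can be loaded.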


> Wish list (maybe for hack-a-thon):
>
> Don't automatically create $HOME/.local/easybuild (or any other top
> level directory).  Too many times I forgot to specify my configuration
> file name or fumble fingered something and ended up with a mess of
> files where I didn't intend.  Requiring the top level directory to
> already exist prevents a misconfigured easybuild from writing a bunch
> of stuff for later cleanup.  The error message should be clear about
> the directory name so that a simple copy/paste can be used to create
> the missing directory when it is truly missing.

I don't agree here. Almost all Linux programs will create their own
directory under $HOME without asking. It's the proper thing to do. If
you have trouble remembering to specify certain options, add them to
the config.cfg file, or export them as environment variables if they
change from location to location.
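
As a sketch of the environment-variable route, combined with the "fail if the directory is missing" behaviour asked for above: `EASYBUILD_PREFIX` is the environment form of the `--prefix` option, `/projects/app-c6` is the path from the original question, and the wrapper name `eb_checked` is hypothetical:

```shell
#!/bin/sh
# Sketch: pin the EasyBuild prefix in the environment and refuse to run
# when that top-level directory does not exist yet.

EASYBUILD_PREFIX="${EASYBUILD_PREFIX:-/projects/app-c6}"
export EASYBUILD_PREFIX

# eb_checked ARGS...: run eb only if $EASYBUILD_PREFIX already exists.
eb_checked() {
    if [ ! -d "$EASYBUILD_PREFIX" ]; then
        # Print a message that can be copied and pasted to create the directory.
        echo "missing directory; create it with: mkdir -p $EASYBUILD_PREFIX" >&2
        return 1
    fi
    eb "$@"
}
```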


> Process for removing a package.  It is probably fairly simple, mostly
> deleting the installation tree and the modules files.  Dependencies
> might be tricky (indicating all the dependencies and refuse to remove
> would be fine).

This is on the wish list for quite some time:
https://github.com/hpcugent/easybuild-framework/issues/1000

Ward


Re: [easybuild] easybuild new user experience

2015-02-11 Thread Jack Perdue

Howdy Stuart,

I'll offer an unofficial welcome to EasyBuild.  I can relate
to quite a bit of the below.  I'm not sure I have any good
answers yet.  But WELCOME!  Your comments will not
go unnoticed.

For example,  I just want to second one of your points
(while I ponder if I can better answer your other questions/needs
than the authors).

On 02/10/2015 07:32 PM, Stuart Barkley wrote:
> Process for removing a package.  It is probably fairly simple, mostly
> deleting the installation tree and the modules files.  Dependencies
> might be tricky (indicating all the dependencies and refuse to remove
> would be fine).


This would be a dream.

  eb --remove zlib/x.y.z # remove that zlib and everything that depended upon it

  and/or

  eb --remove GCC/4.9.0  # remove that toolchain and all modules, and install trees, associated with it
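
Until something like that exists, a dry-run helper along these lines could at least list what would have to go. It is only a sketch: it assumes the usual layout under the prefix (`software/NAME/VERSION` and `modules/all/NAME/VERSION`), treats any module file mentioning `NAME/VERSION` as a dependent, and deletes nothing. The function names are hypothetical:

```shell
#!/bin/sh
# Sketch: show what removing NAME/VERSION would involve, without deleting anything.

# Print module files under PREFIX that mention NAME/VERSION (likely dependents),
# excluding the module's own file.
dependents() {
    prefix="$1"; mod="$2"
    grep -rlF -- "$mod" "$prefix/modules/all" 2>/dev/null \
        | grep -vxF -- "$prefix/modules/all/$mod" || true   # no match is not an error
}

# Print removal candidates, or refuse if other modules still depend on it.
removal_plan() {
    prefix="$1"; mod="$2"
    deps="$(dependents "$prefix" "$mod")"
    if [ -n "$deps" ]; then
        echo "refusing: still required by:" >&2
        echo "$deps" >&2
        return 1
    fi
    echo "$prefix/software/$mod"
    echo "$prefix/modules/all/$mod"
}

# Example: removal_plan "$HOME/.local/easybuild" zlib/x.y.z
```

Any per-category module symlinks pointing at the removed module file would need cleaning up as well.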


I'm curious if Lmod already has a dependency tracking thingie
that I can use to script, but if it could be automated into EasyBuild,
that would be really sweet (I do a lot of prototyping, especially since
working on the Power7).

jack


[easybuild] easybuild new user experience

2015-02-11 Thread Stuart Barkley
I've been looking at easybuild for a little while now and it appears
to be a suitable framework for a good portion of our needs.

Easybuild already has support for a lot of the packages we are
interested in running.  I was also able to find answers to several of
my early problems in the mailing list archives so have been silent up
till now.  The easybuild mailing list even had an important note about
a GPFS bug which I was seeing with easybuild which had also just
cropped up for one of our users in a different context.

At this point I've successfully built a large number of packages of
interest to our user base (100+).  I have yet to actually run any of
them except the build toolchains; that will be the next, much more
careful and deliberate, set of steps.

It has been a month or so since I last worked with easybuild, but I
wanted to provide some feedback before my memory of the new user
experience fades.  Maybe some of these issues can be dealt with in the
hack-a-thon (I don't have any patches to supply at this time, sorry).

Initial bootstrapping:

My cluster does not have direct access to the Internet so I can't use
any of the automated download procedures.  I actually consider this a
feature in that I eventually know exactly what software is running and
that nothing was downloaded and installed that wasn't intended.

The initial bootstrapping process was a little awkward but I was
eventually able to succeed.  I used bootstrap_eb.py in a VM to get an
initial installation to copy over to the cluster.  Another part of my
bootstrapping involved alternately using easybuild to build 1.15.2 and
1.16.1.  Easybuild is not able to reinstall itself (the install fails
when it deletes the currently running version) but each version can
build and install the other version.

Dependencies/source package downloads:

I continue to use the VM installation to download the necessary source
packages for things.  Getting all of the dependencies for the large
number of packages I have built was a little awkward.

Fuzzy memory, but I think I had a number of times where I was
downloading things which are no longer there or require special
downloads and each time I fixed one problem I would get just a little
farther until the next problem download.  It would be nice if
easybuild would continue processing after running into errors with one
package.  Now that I have my local cache of the various distribution
files this is less of an issue since I will just continue to
accumulate files in $eb/sources.

I think I eventually used "--dry-run --robot" and grepped the output
to determine the full dependency lists and create individual 'eb
--stop=fetch --force' commands for every package dependency.  I also
used something like this to determine an internal build order so I
didn't need to --robot 50 packages which would all churn and fail on
the same dependency.

Some questions:

What do people do/recommend for multiple OS environments?  We are
currently CentOS 6 but will eventually move to C7.  I'm thinking I
will want a separate application tree for each OS (/projects/app-c6
and /projects/app-c7).

How do people deal with software with frequent updates (java) or
security issues?  Do you rebuild and remove old packages?

Do people recommend updating toolchains and the dependencies (e.g.
eliminate multiple versions of python 2.7)?  Since this will be our
initial build I don't really need python 2.7.3, 2.7.5 or 2.7.6.
Should I just use the --try options to build for 2.7.8?

What about toolchains?  I patch OpenMPI for grid engine (information
from the mailing list, but not tested).  Should I build this as a
custom toolchain instead of calling it goolf-1.4.10?  Should I
consider trying goolf-1.5.X or 1.6 or is it better to stick with the
1.4 toolchains which most eb scripts use?  I also just now see
information about the 1.7 toolchain, I need to review that email
thread.

Is there any planned support for beta/test packages?  At the present
time I plan to have two separate easybuild installations and after
building/testing an application in the beta installation I'll rebuild
it in the production installation.  (With C6 and C7 installations this
might multiply to 4 easybuild installations.)

Is there any planned support (or something I'm missing) for allowing
someone else to use an existing easybuild installation (including
toolchains and other packages) to build a test/prototype package?  I
would like to have other staff be able to build local test packages
off of the production toolchains and be able to give me a set of
proposed/working .eb files for final build.  This is similar to the
above beta question, except I don't expect others to be able to
maintain their own easybuild installation (and don't want them writing
into the production installation).

Wish list (maybe for hack-a-thon):

Don't automatically create $HOME/.local/easybuild (or any other top
level directory).  Too many times I forgot to specify my configuration
file name or fumbl