Re: [easybuild] Slurm/MPI error on importing tensorflow
Dear Loris,

Isn't it the opposite? Doesn't OpenMPI have to be built properly with Slurm support? We had several issues with OpenMPI when it was compiled with dlopen-related options and similar. I'd have to check the exact configuration flags.

Best,
Andreas

> On 29.01.2020 at 14:40, Loris Bennett wrote:
>
> Hi,
>
> Thinking about the problem, if it were a question of just rebuilding
> Slurm with a different version of OpenMPI, then presumably other
> MPI-programs would have issues with Slurm, but we haven't seen this.
>
> So I am still mystified.
>
> Cheers,
>
> Loris
>
> Loris Bennett writes:
>
>> Hi Kenneth,
>>
>> I have tried two different things:
>>
>> 1.
>>
>> Starting an interactive job as user 'loris' via Slurm on a GPU node,
>> loading the TensorFlow module, starting Python and then importing the
>> python module 'tensorflow'. This triggers the original error below.
>>
>> 2.
>>
>> Logging directly into the same GPU node as above as 'root', loading
>> the TensorFlow module, starting Python and then importing the python
>> module 'tensorflow'. This triggers the following warning:
>>
>> A process has executed an operation involving a call to the
>> "fork()" system call to create a child process. Open MPI is currently
>> operating in a condition that could result in memory corruption or
>> other system errors; your job may hang, crash, or produce silent
>> data corruption. The use of fork() (or system() or other calls that
>> create child processes) is strongly discouraged.
>>
>> The process that invoked fork was:
>>
>>   Local host: [[17982,1],0] (PID 47441)
>>
>> If you are *absolutely sure* that your application will successfully
>> and correctly survive a call to fork(), you may disable this warning
>> by setting the mpi_warn_on_fork MCA parameter to 0.
>>
>> I can then, however, successfully start a TensorFlow session:
>>
>> >>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
>> 2020-01-28 15:32:57.120084: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with
>> properties:
>> name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
>> pciBusID: :5e:00.0
>> totalMemory: 10.92GiB freeMemory: 10.75GiB
>> 2020-01-28 15:32:57.225946: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with
>> properties:
>> name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
>> pciBusID: :d8:00.0
>> totalMemory: 10.92GiB freeMemory: 10.76GiB
>> 2020-01-28 15:32:57.226648: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu
>> devices: 0, 1
>> 2020-01-28 15:32:58.784466: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect
>> StreamExecutor with strength 1 edge matrix:
>> 2020-01-28 15:32:58.784506: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1
>> 2020-01-28 15:32:58.784513: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N Y
>> 2020-01-28 15:32:58.784517: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: Y N
>> 2020-01-28 15:32:58.784649: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow
>> device (/job:localhost/replica:0/task:0/device:GPU:0 with 10386 MB memory)
>> -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id:
>> :5e:00.0, compute capability: 6.1)
>> 2020-01-28 15:32:58.785041: I
>> tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow
>> device (/job:localhost/replica:0/task:0/device:GPU:1 with 10398 MB memory)
>> -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id:
>> :d8:00.0, compute capability: 6.1)
>> 2020-01-28 15:32:58.785979: I
>> tensorflow/core/common_runtime/process_util.cc:71] Creating new thread pool
>> with default inter op setting: 2. Tune using inter_op_parallelism_threads
>> for best performance.
>> Device mapping:
>> /job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce
>> GTX 1080 Ti, pci bus id: :5e:00.0, compute capability: 6.1
>> /job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: GeForce
>> GTX 1080 Ti, pci bus id: :d8:00.0, compute capability: 6.1
>> 2020-01-28 15:32:58.786118: I
>> tensorflow/core/common_runtime/direct_session.cc:317] Device mapping:
>> /job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce
>> GTX 1080 Ti, pci bus id: :5e:00.0, compute capability: 6.1
>> /job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: GeForce
>> GTX 1080 Ti, pci bus id: :d8:00.0, compute capability: 6.1
>>
>> So it looks as if the problem is a mismatch between the version of
>> OpenMPI used to build Slurm (Version 1.10.7 from CentOS 7.7) and the
>> version loaded by our TensorFlow module (OpenMPI/3.1.3). None of the
>> values for the '--mpi' option for 'srun' make any difference.
>>
>> Perhaps Slurm needs to be rebuilt with OpenMPI 3.1.3, but my
Re: [easybuild] CP2K psmp?
Hi Loris,

I don't know for sure, but I would guess that this is linked to the toolchain opts, where you usually specify whether MPI is used or not, which in turn would select the versions of CP2K with or without MPI support.

Best regards,
Andreas

> On 17.06.2019 at 16:23, Loris Bennett wrote:
>
> Hi,
>
> A user of mine has asked about the psmp binary of CP2K. The easyconfig
> CP2K-6.1-intel-2018a.eb, which I used, only builds the popt version. It
> seems there has been interest in psmp for some past versions:
>
> * $CFGS1/c/CP2K/CP2K-3.0-intel-2016a.eb
> * $CFGS1/c/CP2K/CP2K-3.0-intel-2016b-psmp.eb
> * $CFGS1/c/CP2K/CP2K-3.0-intel-2016b.eb
> * $CFGS1/c/CP2K/CP2K-3.0-intel-2017b.eb
> * $CFGS1/c/CP2K/CP2K-3.0-intel-2018a.eb
> * $CFGS1/c/CP2K/CP2K-4.1-foss-2016b-psmp.eb
> * $CFGS1/c/CP2K/CP2K-4.1-intel-2016b.eb
> * $CFGS1/c/CP2K/CP2K-5.1-foss-2018a.eb
> * $CFGS1/c/CP2K/CP2K-5.1-intel-2017b.eb
> * $CFGS1/c/CP2K/CP2K-5.1-intel-2018a.eb
> * $CFGS1/c/CP2K/CP2K-6.1-foss-2019a.eb
> * $CFGS1/c/CP2K/CP2K-6.1-intel-2018a.eb
>
> Is there any particular reason why the buildopts don't just specify, say,
>
> VERSION="sopt popt ssmp psmp"
>
> ?
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
AW: AW: [easybuild] foss/2018a pmi slurm
Hi Kenneth,

at least now the problem is also documented here.

Best,
Andreas

From: easybuild-requ...@lists.ugent.be <easybuild-requ...@lists.ugent.be> on behalf of Kenneth Hoste <kenneth.ho...@ugent.be>
Sent: Tuesday, 8 May 2018 19:59:08
To: easybuild@lists.ugent.be
Subject: Re: AW: [easybuild] foss/2018a pmi slurm

Hi Yann & Andreas,

We removed --disable-dlopen from OpenMPI easyconfigs in EasyBuild v3.6.0, although it was for reasons other than SLURM compatibility (it was due to performance issues).

See https://github.com/easybuilders/easybuild-easyconfigs/pull/6060 for more info.

regards,

Kenneth

On 08/05/2018 19:40, Henkel, Andreas wrote:
> Dear Yann,
>
> I recently opened an issue on GitHub for OpenMPI since I saw a PMI2_init
> failure, too. For now, it boiled down to the option --disable-dlopen, which
> is in the easyconfig, I think; --disable-dlopen as well as
> --enable-static imply --disable-mca-dso. Anyway, if you remove
> --disable-dlopen from the config-opts (and/or enable-static) and rebuild
> OpenMPI, it should work.
>
> (https://github.com/open-mpi/ompi/issues/4338#issuecomment-384578916)
>
> Best,
>
> Andreas
>
> *From:* easybuild-requ...@lists.ugent.be
> <easybuild-requ...@lists.ugent.be> on behalf of Yann Sagon
> <yann.sa...@unige.ch>
> *Sent:* Tuesday, 8 May 2018 17:40:12
> *To:* easybuild@lists.ugent.be
> *Subject:* [easybuild] foss/2018a pmi slurm
>
> Dear list,
>
> I installed foss/2018a without slurm specific flags (because I forgot)
> and then I recompiled only openmpi with the following flags:
> --with-slurm --with-pmi
>
> I think there is no need to recompile something else, but I may be
> mistaken. When I try to submit a job using srun, I have the error about
> pmi etc.
>
> PMI2_Init failed to intialize. Return code: 1
>
> In slurm.conf, I have the following directive: MpiDefault=pmi2
>
> I'm using slurm 17.11.5
>
> According to ldd $(which mpirun)
>
> mpirun is using
> [...]
> libpmi.so.0 => /usr/lib64/libpmi.so.0 (0x2b53dec26000)
> libpmi2.so.0 => /usr/lib64/libpmi2.so.0 (0x2b53dee2b000)
> libmunge.so.2 => /usr/lib64/libmunge.so.2 (0x2b53df044000)
> libslurm.so.32 => /usr/lib64/libslurm.so.32 (0x2b53e0683000)
> [...]
>
> rpm -qf /usr/lib64/libslurm.so.32
> slurm-17.11.5-1.el6.x86_64
>
> rpm -qf /usr/lib64/libpmi.so.0
> slurm-17.11.5-1.el6.x86_64
>
> rpm -qf /usr/lib64/libpmi2.so.0
> slurm-17.11.5-1.el6.x86_64
>
> rpm -qf /usr/lib64/libmunge.so.2
> munge-libs-0.5.10-1.el6.x86_64
>
> Is there someone who can confirm that foss/2018a is compatible with
> slurm 17.11.5?
>
> Best
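The workaround discussed in this thread (dropping --disable-dlopen and/or --enable-static from the OpenMPI configure options and rebuilding) can be sketched as a small helper. This is only an illustration, assuming the options are held in one space-separated string; the function name strip_flags and the example option string are hypothetical and not taken from any actual easyconfig.

```python
import shlex


def strip_flags(configopts, unwanted):
    """Return configopts with every flag listed in `unwanted` removed."""
    kept = [opt for opt in shlex.split(configopts) if opt not in unwanted]
    return " ".join(kept)


# Hypothetical configure options, for illustration only.
example = "--with-slurm --with-pmi --disable-dlopen --enable-static"
cleaned = strip_flags(example, {"--disable-dlopen", "--enable-static"})
print(cleaned)
```

After editing the options this way, OpenMPI still has to be rebuilt for the change to take effect, as noted in the message above.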
AW: [easybuild] foss/2018a pmi slurm
Dear Yann,

I recently opened an issue on GitHub for OpenMPI since I saw a PMI2_init failure, too. For now, it boiled down to the option --disable-dlopen, which is in the easyconfig, I think; --disable-dlopen as well as --enable-static imply --disable-mca-dso. Anyway, if you remove --disable-dlopen from the config-opts (and/or enable-static) and rebuild OpenMPI, it should work.

(https://github.com/open-mpi/ompi/issues/4338#issuecomment-384578916)

Best,

Andreas

From: easybuild-requ...@lists.ugent.be on behalf of Yann Sagon
Sent: Tuesday, 8 May 2018 17:40:12
To: easybuild@lists.ugent.be
Subject: [easybuild] foss/2018a pmi slurm

Dear list,

I installed foss/2018a without slurm specific flags (because I forgot) and then I recompiled only openmpi with the following flags:

--with-slurm --with-pmi

I think there is no need to recompile something else, but I may be mistaken. When I try to submit a job using srun, I have the error about pmi etc.

PMI2_Init failed to intialize. Return code: 1

In slurm.conf, I have the following directive: MpiDefault=pmi2

I'm using slurm 17.11.5

According to ldd $(which mpirun)

mpirun is using
[...]
libpmi.so.0 => /usr/lib64/libpmi.so.0 (0x2b53dec26000)
libpmi2.so.0 => /usr/lib64/libpmi2.so.0 (0x2b53dee2b000)
libmunge.so.2 => /usr/lib64/libmunge.so.2 (0x2b53df044000)
libslurm.so.32 => /usr/lib64/libslurm.so.32 (0x2b53e0683000)
[...]

rpm -qf /usr/lib64/libslurm.so.32
slurm-17.11.5-1.el6.x86_64

rpm -qf /usr/lib64/libpmi.so.0
slurm-17.11.5-1.el6.x86_64

rpm -qf /usr/lib64/libpmi2.so.0
slurm-17.11.5-1.el6.x86_64

rpm -qf /usr/lib64/libmunge.so.2
munge-libs-0.5.10-1.el6.x86_64

Is there someone who can confirm that foss/2018a is compatible with slurm 17.11.5?

Best
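The check Yann did by eye on the ldd output (which PMI libraries mpirun resolves, and from where) could be scripted roughly as follows. This is a minimal sketch that only parses text in the format quoted above; the function name parse_ldd is hypothetical, and in practice the text would come from running ldd on the binary in question.

```python
def parse_ldd(output):
    """Map each library name to its resolved path from ldd-style output.

    Libraries reported as "not found" are mapped to None.
    """
    libs = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # skip lines such as "[...]" that carry no mapping
        name, _, rest = line.partition("=>")
        rest = rest.strip()
        if not rest or rest.startswith("not found"):
            libs[name.strip()] = None
        else:
            libs[name.strip()] = rest.split()[0]
    return libs


# Lines copied from the ldd output quoted above.
sample = """\
libpmi.so.0 => /usr/lib64/libpmi.so.0 (0x2b53dec26000)
libpmi2.so.0 => /usr/lib64/libpmi2.so.0 (0x2b53dee2b000)
libmunge.so.2 => /usr/lib64/libmunge.so.2 (0x2b53df044000)
libslurm.so.32 => /usr/lib64/libslurm.so.32 (0x2b53e0683000)
"""
libs = parse_ldd(sample)
pmi_libs = sorted(name for name in libs if name.startswith("libpmi"))
print(pmi_libs)
```

The resolved paths can then be passed to rpm -qf, as in the message, to see which package provides each library.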
[easybuild] Intel toolchain versionsuffix vs. toolchain
Hi all,

I need some more input. I'm just rebuilding the Intel software and was wondering why GCC 6.3.0 is put in the versionsuffix or in a local variable gccver instead of being specified as the toolchain. Is there a systematic reason that I currently just don't see? Or is it for "historic" reasons?

Looking forward to your answers.

Best,
Andreas
Re: [easybuild] RE: Intel Compiler / Mpi
Hi Jack,

Well, that's exactly what I see. With binutils 2.25 of CentOS 7 it works. With GCC-6.3.0-2.27 it doesn't. But if I use it and switch tools/binutils to 2.28, it works again. In some issues I found something about mrelax-relocations=no resolving similar linking issues with 2.26 and 2.27; I haven't checked it though.

In my case it only happens with mpicc and mpicxx (using the GNU compiler with Intel MPI). Using mpiicpc on the same source is fine.

Anyway, I see two choices for solving it: either I use the system binutils or I go for 2.28.

Andreas

> On 19.05.2017 at 18:33, Jack Perdue <j-per...@tamu.edu> wrote:
>
> BTW/FWIW,
>
> We've been using what should essentially
> be named GCCcore/6.3.0-system on
> RH6/7 without the issues we ran into with
> binutil-2.26+ and kin. On RH6 that's
> 2.20.51.0.2. On RH7 that's 2.25.1-22.
> I use the same GCCcore .eb on both with
> osdependencies including binutils-devel.
>
> jack
>
>> On 05/19/2017 09:50 AM, Jack Perdue wrote:
>> My question is, "is the problem in the underlying
>> binutils that built GCCcore or is the problem in the
>> binutils that is bundled with that GCCcore in the
>> GCC module?"
>>
>> I've made my pleas in the past to tag the GCCcore
>> with the binutils that was used to build it so that
>> one could build different GCCcore's based on different
>> versions. Thus far they've fallen on listening, but
>> not necessarily agreeing, ears. Was disappointed
>> to see GCCcore-7.1.0.eb released without that and will
>> be renaming it locally GCCcore-7.1.0-2.28 so that when
>> issues like this arise they can be more easily studied.
>> Of course, that will make pull-requests using the neato
>> command line options a bit trickier.
>>
>> jack
>>
>>> On 05/19/2017 09:29 AM, Åke Sandgren wrote:
>>> Ok, so really looks like a bug in ld then...
>>> Annoying!!
>>>
>>>> On 05/19/2017 03:34 PM, Henkel, Andreas wrote:
>>>> I tried a few things and can reproduce this with binutils 2.27 no matter
>>>> if the intel tools were built with or without eb.
>>>> With binutils 2.25 and 2.28 it works without complaining about
>>>> pthread_sigmask. It seems to be specific to the behavior of ld in 2.27
>>>> (and possibly 2.26).
>>>>
>>>>> On 18.05.2017 at 09:35, Åke Sandgren <ake.sandg...@hpc2n.umu.se> wrote:
>>>>>
>>>>> Just took a deeper look at this.
>>>>>
>>>>> Evil stuff is happening.
>>>>>
>>>>> Take a onefile simple hello-world type mpi program (h.c)
>>>>>
>>>>> check mpicxx -o hw h.c -show and mpiicpc -o hw h.c -show outputs
>>>>>
>>>>> Now take the mpicxx output and add
>>>>> -Wl,-t -Wl,-y,pthread_sigmask 2>&1 | less
>>>>>
>>>>> Look for sigmask and then compare the same for the mpiicpc command.
>>>>>
>>>>> The mpicxx command does:
>>>>>
>>>>> /hpc2n/eb/software/Compiler/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/intel64/lib/release_mt/libmpi.so:
>>>>> reference to pthread_sigmask
>>>>> /usr/lib/x86_64-linux-gnu/libdl.so
>>>>> /usr/lib/x86_64-linux-gnu/librt.so
>>>>> /usr/lib/x86_64-linux-gnu/librt.so: reference to pthread_sigmask
>>>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>>>> /lib/x86_64-linux-gnu/libpthread.so.0: definition of pthread_sigmask
>>>>>
>>>>> The mpiicpc does:
>>>>>
>>>>> /hpc2n/eb/software/Compiler/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/intel64/lib/release_mt/libmpi.so:
>>>>> reference to pthread_sigmask
>>>>> -ldl (/usr/lib/x86_64-linux-gnu//libdl.so)
>>>>> -lrt (/usr/lib/x86_64-linux-gnu//librt.so)
>>>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>>>> /lib/x86_64-linux-gnu/libpthread.so.0: definition of pthread_sigmask
>>>>>
>>>>> Notice the difference for the "rt" lib.
>>>>>
>>>>> Then take the g++ command and add a -lpthread just after -lmpi, making
>>>>> the command:
>>>>>
>>>>> ... -lmpifort -lmpi -lpthread -lmpigi -ldl -lrt -lpthread ...
>>>>>
>>>>> Now the thing works.
>>>>> I'd say that this is a bug in ld.
>>>>>
>>>>> So, if users want to use gcc with intel mpi they have to explicitly add
>>>>> a -lpthread to the link line. That way ld picks up the pthread defs
>>>>> before -lmpi gets looked at.
>>>>>
>>>>>> On 05/18/2017 08:42 AM, Henkel, Andreas wrote:
>>>>>> @Ake
>>>>>> I tried with mpiicpc and it worked.
>>>>>> I get the undefined reference only when using mpicxx. Using
>>>>>> mpi/mvapich2/2.2-GCC-6.3.0-2.27 and mpi/openmpi/1.10.3-GCC-5.4.0-2.26
>>>>>> mpicxx works.
>>>>>> This leads me back to the initial question: for people using gcc with
>>>>>> intelmpi (mpicxx, mpicc), should it work without adding -lpthread or is
>>>>>> it intended because there's another default? In the easybuild log I saw
>>>>>> that environment.py sets -lpthread. But if that linking was successful
>>>>>> it should just work, shouldn't it?
>>>>>
>>>>> --
>>>>> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
>>>>> Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90-580 14
>>>>> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
Re: [easybuild] RE: Intel Compiler / Mpi
I tried a few things and can reproduce this with binutils 2.27, no matter whether the Intel tools were built with or without eb.
With binutils 2.25 and 2.28 it works without complaining about pthread_sigmask. It seems to be specific to the behavior of ld in 2.27 (and possibly 2.26).

> On 18.05.2017 at 09:35, Åke Sandgren <ake.sandg...@hpc2n.umu.se> wrote:
>
> Just took a deeper look at this.
>
> Evil stuff is happening.
>
> Take a onefile simple hello-world type mpi program (h.c)
>
> check mpicxx -o hw h.c -show and mpiicpc -o hw h.c -show outputs
>
> Now take the mpicxx output and add
> -Wl,-t -Wl,-y,pthread_sigmask 2>&1 | less
>
> Look for sigmask and then compare the same for the mpiicpc command.
>
> The mpicxx command does:
>
> /hpc2n/eb/software/Compiler/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/intel64/lib/release_mt/libmpi.so:
> reference to pthread_sigmask
> /usr/lib/x86_64-linux-gnu/libdl.so
> /usr/lib/x86_64-linux-gnu/librt.so
> /usr/lib/x86_64-linux-gnu/librt.so: reference to pthread_sigmask
> /lib/x86_64-linux-gnu/libpthread.so.0
> /lib/x86_64-linux-gnu/libpthread.so.0: definition of pthread_sigmask
>
> The mpiicpc does:
>
> /hpc2n/eb/software/Compiler/intel/2017.1.132-GCC-6.3.0-2.27/impi/2017.1.132/intel64/lib/release_mt/libmpi.so:
> reference to pthread_sigmask
> -ldl (/usr/lib/x86_64-linux-gnu//libdl.so)
> -lrt (/usr/lib/x86_64-linux-gnu//librt.so)
> /lib/x86_64-linux-gnu/libpthread.so.0
> /lib/x86_64-linux-gnu/libpthread.so.0: definition of pthread_sigmask
>
> Notice the difference for the "rt" lib.
>
> Then take the g++ command and add a -lpthread just after -lmpi, making
> the command:
>
> ... -lmpifort -lmpi -lpthread -lmpigi -ldl -lrt -lpthread ...
>
> Now the thing works.
> I'd say that this is a bug in ld.
>
> So, if users want to use gcc with intel mpi they have to explicitly add
> a -lpthread to the link line. That way ld picks up the pthread defs
> before -lmpi gets looked at.
>
>> On 05/18/2017 08:42 AM, Henkel, Andreas wrote:
>> @Ake
>> I tried with mpiicpc and it worked.
>> I get the undefined reference only when using mpicxx. Using
>> mpi/mvapich2/2.2-GCC-6.3.0-2.27 and mpi/openmpi/1.10.3-GCC-5.4.0-2.26 mpicxx
>> works.
>> This leads me back to the initial question: for people using gcc with
>> intelmpi (mpicxx, mpicc), should it work without adding -lpthread or is it
>> intended because there's another default? In the easybuild log I saw that
>> environment.py sets -lpthread. But if that linking was successful it should
>> just work, shouldn't it?
>
> --
> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
> Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90-580 14
> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
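Åke's workaround, placing an extra -lpthread directly after -lmpi on the link line, can be illustrated with a short sketch. It only manipulates a list of linker arguments; the helper name add_pthread_after_mpi is hypothetical, and the argument list is the abbreviated one from the message above with the added -lpthread taken out again.

```python
def add_pthread_after_mpi(args):
    """Return a copy of args with -lpthread inserted right after -lmpi."""
    result = []
    for arg in args:
        result.append(arg)
        if arg == "-lmpi":
            result.append("-lpthread")
    return result


# Abbreviated library list as produced by the wrapper, before the fix.
original = ["-lmpifort", "-lmpi", "-lmpigi", "-ldl", "-lrt", "-lpthread"]
fixed = add_pthread_after_mpi(original)
print(" ".join(fixed))
```

The resulting order matches the working command quoted in the thread; whether the extra flag is needed at all appears to depend on the binutils version in use.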
[easybuild] RE: Intel Compiler / Mpi
@Ake

I tried with mpiicpc and it worked.
I get the undefined reference only when using mpicxx. Using mpi/mvapich2/2.2-GCC-6.3.0-2.27 and mpi/openmpi/1.10.3-GCC-5.4.0-2.26, mpicxx works.

This leads me back to the initial question: for people using gcc with Intel MPI (mpicxx, mpicc), should it work without adding -lpthread, or is it intended because there's another default? In the EasyBuild log I saw that environment.py sets -lpthread. But if that linking was successful it should just work, shouldn't it?

Best,
Andreas

-----Original Message-----
From: easybuild-requ...@lists.ugent.be [mailto:easybuild-requ...@lists.ugent.be] On Behalf Of Henkel, Andreas
Sent: Thursday, May 18, 2017 7:52 AM
To: easybuild@lists.ugent.be
Subject: [easybuild] Intel Compiler / Mpi

Hi,

I have encountered a problem with the Intel compiler. With our old, manually built Intel tools I can compile a simple MPI program using

mpicxx source.c

Now, if I try the same using the Intel tools built with EasyBuild, I get complaints about undefined references to pthreads. Of course adding -lpthread solves it; I just wonder whether this behavior is intended or whether I made a mistake while installing it. Except for changing the source file name I didn't change the easyconfigs. I used eb 3.1.1 when I installed version 2017.2.174.

Best,
Andreas
[easybuild] Intel Compiler / Mpi
Hi,

I have encountered a problem with the Intel compiler. With our old, manually built Intel tools I can compile a simple MPI program using

mpicxx source.c

Now, if I try the same using the Intel tools built with EasyBuild, I get complaints about undefined references to pthreads. Of course adding -lpthread solves it; I just wonder whether this behavior is intended or whether I made a mistake while installing it. Except for changing the source file name I didn't change the easyconfigs. I used eb 3.1.1 when I installed version 2017.2.174.

Best,
Andreas
AW: [easybuild] Software/easyconfigs update avail check
Hi Kenneth,

in our current solution we use pmaintcheck [1], which is lightweight but relies on a config file in its current version. We have a small wrapper that supplies Nagios with status messages.

Anitya looks interesting but much more complicated. I admit that a message aggregator is nice, since you'd have one place to see all the available updates, both system and eb. But as far as I can see it needs a running webserver, or at least an HTTP endpoint. Although I like web services that allow easy tracking, I also know more conservative sysadmins who dislike them and would vote for making Anitya optional.

Hence, for now I would rather go for the easy solution using pmaintcheck (adding option parsing to it) than for the complete Anitya, because it doesn't need much of a setup and I don't have too much time at the moment.

Best,
Andreas

[1] https://github.com/mrpdaemon/pmaintcheck

From: easybuild-requ...@lists.ugent.be <easybuild-requ...@lists.ugent.be> on behalf of Kenneth Hoste <kenneth.ho...@ugent.be>
Sent: Monday, 8 May 2017 09:59:36
To: easybuild@lists.ugent.be
Subject: Re: [easybuild] Software/easyconfigs update avail check

Hi Andreas,

On 05/05/2017 08:16, Henkel, Andreas wrote:
> Hi,
>
> @Jens: Thank you very much for your detailed reply. I really appreciate it.
>
> Personally, I don't have a problem with all the easybuild packages since they are not opening up for root. I rather appreciate that users could build up their own eb on top of our cluster-wide installation. And the time saved on installation is great (except if you make a bad choice and choose foss-2017a as the default toolchain, but now I know much more about eb because of all the modifications I had to make ;-) )
>
> There was another interesting point raised about update cycles: what if something is misconfigured in the easyconfigs, for blas for example, which would lead to wrong results for the users using that lib? I know that this is not directly related to easybuild but wanted to ask anyway...
>
> About the update checker, @Damian pointed to a related pull request (thank you!). Do I get it right that there is no such thing as checking for newer versions of installed software? The reason I'm asking is that I'm thinking about diving into this. I'm thinking of something similar to apt update, which simply lists available upgrades. Actually, we have some Python code that does this for our installed modules and sends its results to nagios/icinga. This would be more of an update notifier and wouldn't take any action. If there isn't something like that yet, I can have a look at how to adapt our stuff to eb. Any pointer to a good place in the framework is appreciated.

We have plans to leverage Anitya [1] to let EasyBuild query for available versions of a particular software package. The idea would be that you could easily install the latest version of a particular software package, where EasyBuild figures out what the latest version is for you.

For now, this is just an idea though. If you're up for diving into this, we can discuss this further (e.g. via a conf call).

regards,

Kenneth

[1] https://github.com/release-monitoring/anitya
Re: [easybuild] Software/easyconfigs update avail check
Hi,

@Jens: Thank you very much for your detailed reply. I really appreciate it.

Personally, I don't have a problem with all the EasyBuild packages since they are not opening up for root. I rather appreciate that users could build up their own eb on top of our cluster-wide installation. And the time saved on installation is great (except if you make a bad choice and choose foss-2017a as the default toolchain, but now I know much more about eb because of all the modifications I had to make ;-) )

There was another interesting point raised about update cycles: what if something is misconfigured in the easyconfigs, for blas for example, which would lead to wrong results for the users using that lib? I know that this is not directly related to EasyBuild but wanted to ask anyway...

About the update checker, @Damian pointed to a related pull request (thank you!). Do I get it right that there is no such thing as checking for newer versions of installed software? The reason I'm asking is that I'm thinking about diving into this. I'm thinking of something similar to apt update, which simply lists available upgrades. Actually, we have some Python code that does this for our installed modules and sends its results to nagios/icinga. This would be more of an update notifier and wouldn't take any action. If there isn't something like that yet, I can have a look at how to adapt our stuff to eb. Any pointer to a good place in the framework is appreciated.

Best,
Andreas
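To make the idea of an "update notifier" concrete, here is a minimal sketch of the comparison step only. It is not the code referred to in the message: the function names and the example data are hypothetical, versions are assumed to be purely numeric and dot-separated, and how the installed and latest versions are collected is left open.

```python
def version_key(version):
    """Split a dotted version such as '2.7.12' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def available_upgrades(installed, latest):
    """Return {name: (installed, latest)} for packages with a newer release."""
    upgrades = {}
    for name, current in installed.items():
        newest = latest.get(name)
        if newest is not None and version_key(newest) > version_key(current):
            upgrades[name] = (current, newest)
    return upgrades


def nagios_status(upgrades):
    """Build a Nagios-style (exit code, message) pair: 0 = OK, 1 = WARNING."""
    if not upgrades:
        return 0, "OK - all packages up to date"
    names = ", ".join(sorted(upgrades))
    return 1, "WARNING - updates available: " + names


# Hypothetical data, for illustration only.
installed = {"zlib": "1.2.8", "binutils": "2.27"}
latest = {"zlib": "1.2.11", "binutils": "2.27"}
code, message = nagios_status(available_upgrades(installed, latest))
print(code, message)
```

Comparing integer tuples rather than strings avoids ranking 1.2.8 above 1.2.11; real module versions with suffixes would need a more robust parser. The notifier only reports and, as described above, takes no action.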
[easybuild] Software/easyconfigs update avail check
Hi,

Recently we started using eb for our new cluster. Yesterday, in our group meeting, a question was raised concerning updates and security patches for installed software: is there something similar to apt update/upgrade or yum update? Or is there a routine to check for newer releases of installed easyconfigs?

Best,
Andreas Henkel
Re: [easybuild] mpi4py on OpenMPI 2.x (foss/2017a)
I could offer 2.7.12-foss-2017a.eb, which I built a week ago, with updated package versions.

Andreas

> On 11.04.2017 at 15:52, Åke Sandgren wrote:
>
> Or I could give you Python-2.7.12-intel-2017a.eb or
> Python-2.7.12-foss-2017a.eb that we already have installed.
>
>> On 04/11/2017 03:46 PM, Ward Poelmans wrote:
>> Hi Joachim,
>>
>>> On 11-04-17 15:39, Joachim Hein wrote:
>>> Hi,
>>>
>>> I have a user with MPI4PY issues. MPI4PY out of the box wants to have a
>>> multi threaded MPI library, which is not supported in OpenMPI 1.x on IB.
>>> So building on top of foss/2016b and foss/2016a is not an option.
>>> foss/2017a is an option, but there are currently no Python packages for
>>> this. Could I get a quick reply on status/issues around Python for
>>> foss/2017a and intel/2017a. Is that something wrong in principle, is there
>>> something in the making or should I just have a go at moving e.g. the
>>> Python config for foss/2016b to foss/2017a? I can engage here, but like to
>>> avoid duplication of effort, if someone else is already on the case.
>>
>> There is a 'Python-2.7.13-intel-2017a.eb' easyconfig in develop. Do
>> --try-toolchain on that for foss/2017a?
>>
>> Ward
>
> --
> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
> Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90-580 14
> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
RE: [easybuild] Common config: to hide or not to hide
Dear Kenneth,

thank you very much for your explanations! You're right, I overlooked the option of using environment variables. I think my issue is the transition from OS-dependent manual installs to EasyBuild: since EasyBuild tries to be self-contained, I see more modules that were "invisible" before because they were OS libraries. Probably it's just me getting used to it. So I'll stick with having no hidden modules for now and see if our users adapt.

There's another, somewhat related question I have: do you also have many redundant modules? I mean, I installed GCC 6.3.0, which needed zlib. Later I tried to install some other software with try-toolchain and --use-existing, but now I see redundant modules, namely

zlib-ver-GCCcore-6.3.0
zlib-ver-foss2017a

although foss2017a uses 6.3.0 in my case. My limited experience with try-toolchain is that it somehow ignores existing software/modules. Is that correct, or did I do something wrong?

Best regards,
Andreas

-----Original Message-----
From: easybuild-requ...@lists.ugent.be [mailto:easybuild-requ...@lists.ugent.be] On Behalf Of Kenneth Hoste
Sent: Monday, April 3, 2017 9:14 AM
To: easybuild@lists.ugent.be
Subject: Re: [easybuild] Common config: to hide or not to hide

On 03/04/2017 08:24, Henkel, Andreas wrote:
> Dear Easybuilder,
>
> I'm fairly new to easybuild and have just built several packages using the --try-* options or even adapted easyconfigs. It's a very handy way of installing software, thanks for that!
> Beyond that I had a look at the different module naming schemes and ended up using the categorized nms.
>
> I'm a bit surprised about the number of modules I end up with when installing, for example, GCC or Netcdf. We did manual installs before and used system libs a lot. Now every lib ends up being a module, and my first impression is that this pollutes the module list. I saw the options hide-deps and filter-deps. Using those would remove some of those libs, of course, but as far as I can see it would also take away some of the "automagic", since I'd have to specify the options for every install.

You don't have to specify --hide-deps or --filter-deps for every installation; you can configure EasyBuild via environment variables or configuration files as well, see http://easybuild.readthedocs.io/en/latest/Configuration.html .

It is true that you have to gradually build up the list of modules you want to hide though, since that's a very site-specific aspect.

Note that --filter-deps means something very different from --hide-deps. With --filter-deps, you're telling EasyBuild not to install something, and to assume that that particular library/tool is already provided by the OS (which also means you're at the mercy of the OS a little bit more).

> I would like to ask for some hints from experienced easybuild users. How are you handling the dependent modules? Are they usually hidden modules, or do the users see all the dependencies in their module environment?

Some sites (e.g. JSC) indeed go to quite a bit of effort to only expose those modules to users that they are actually interested in, and --hide-deps is a useful way of doing that.
This is being discussed in the HUST-16 workshop paper available at http://hpcugent.github.io/easybuild/files/eb-jsc-hust16.pdf .

At HPC-UGent, we typically don't hide modules (unless it's a temporary installation that we want some users to evaluate first), and we're not getting too many complaints from our users about that, to be honest. That does mean our default 'module avail' view is huge though, but it's less effort on our part than figuring out which modules would be useful to users and which ones wouldn't be...

regards,

Kenneth
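As a small illustration of Kenneth's point about environment variables: EasyBuild derives the variable name from the long option name by adding an EASYBUILD_ prefix, upper-casing it and replacing dashes with underscores. The sketch below only reproduces that naming rule; the helper name eb_env_var is hypothetical and not part of EasyBuild itself.

```python
def eb_env_var(option):
    """Translate a long EasyBuild option name into its environment variable."""
    name = option.lstrip("-").replace("-", "_").upper()
    return "EASYBUILD_" + name


print(eb_env_var("--hide-deps"))
print(eb_env_var("--filter-deps"))
```

Setting such a variable once (for example in a site-wide profile script) avoids repeating the option on every eb command line; the configuration documentation linked above describes the accepted values.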
[easybuild] Common config: to hide or not to hide
Dear Easybuilder,

I'm fairly new to EasyBuild and have just built several packages using the --try-* options or even adapted easyconfigs. It's a very handy way of installing software, thanks for that!
Beyond that I had a look at the different module naming schemes and ended up using the categorized nms.

I'm a bit surprised about the number of modules I end up with when installing, for example, GCC or Netcdf. We did manual installs before and used system libs a lot. Now every lib ends up being a module, and my first impression is that this pollutes the module list. I saw the options hide-deps and filter-deps. Using those would remove some of those libs, of course, but as far as I can see it would also take away some of the "automagic", since I'd have to specify the options for every install.

I would like to ask for some hints from experienced EasyBuild users. How are you handling the dependent modules? Are they usually hidden modules, or do the users see all the dependencies in their module environment?

Best regards,
Andreas