ConnectX-3 support was dropped in MOFED 5.1, but MOFED 4.9 LTS still works for those adapters.
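
The compiler's suggestion of `IBV_EXP_ACCESS_ON_DEMAND` in the error quoted further down hints that the build is picking up the experimental-verbs headers of an older MOFED 4.x install, which would explain why UCX cannot find the upstream `IBV_ACCESS_ON_DEMAND` name. A minimal sketch of how one might check which flavour of verbs headers a node exposes follows. The function name `verbs_flavour` and the two header paths are assumptions for illustration only; they are not part of EasyBuild or UCX, and the headers may live elsewhere on your system.

```shell
#!/bin/sh
# Hypothetical helper: report which on-demand-paging flag the given
# verbs header files declare.
#   upstream      -> IBV_ACCESS_ON_DEMAND is declared
#   experimental  -> only IBV_EXP_ACCESS_ON_DEMAND is declared
#   none          -> neither flag was found (or the files do not exist)
verbs_flavour() {
    # -q: no output, -s: stay quiet about missing files, -w: whole-word match
    if grep -qsw 'IBV_ACCESS_ON_DEMAND' "$@"; then
        echo upstream
    elif grep -qsw 'IBV_EXP_ACCESS_ON_DEMAND' "$@"; then
        echo experimental
    else
        echo none
    fi
}

# Assumed header locations; adjust for your installation.
verbs_flavour /usr/include/infiniband/verbs.h \
              /usr/include/infiniband/verbs_exp.h
```

If this reports `experimental`, the headers only offer the `IBV_EXP_` name that the compiler suggested, so UCX 1.14.1 would be expected to fail in the way shown below. On nodes with MOFED installed, `ofed_info -s` should print the installed stack version, which answers the earlier question about which MOFED is in use.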
________________________________________
From: easybuild-requ...@lists.ugent.be <easybuild-requ...@lists.ugent.be> on behalf of Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk>
Sent: Friday, March 15, 2024 13:21
To: easybuild@lists.ugent.be
Subject: Re: [easybuild] Failure in UCX-1.14.1-GCCcore-12.3.0.eb when installing foss-2023a.eb

Hi Joaquim,

I think that old ConnectX-3 are no longer supported with recent Linux kernels (can anyone confirm this?). You may have to use an ancient (insecure) OS version, or stop using the ConnectX-3 adapters altogether :-(

/Ole

On 3/15/24 12:45, Joaquim Jornet Somoza wrote:
> Dear Ake,
>
> Indeed we have Mellanox IB,
>
> [:~]$ lspci | grep -i mellanox
> 02:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
>
> [:~]$ ibstatus
> Infiniband device 'mlx4_0' port 1 status:
>         default gid: fe80:0000:0000:0000:7079:9003:0007:f538
>         base lid: 0x76
>         sm lid: 0x1
>         state: 4: ACTIVE
>         phys state: 5: LinkUp
>         rate: 56 Gb/sec (4X FDR)
>         link_layer: InfiniBand
>
> I had no problem installing foss-2023a.eb in another cluster with:
>
> $ lspci | grep -i mellanox
> 86:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
>
> Thank you for any hint!
>
> quim
>
> Message from Åke Sandgren <ake.sandg...@umu.se> on Fri, 15 March 2024 at 12:18:
>
> lspci | grep -i mellanox
> will show if you have any mellanox devices on the system, some of
> these could be used as normal ethernet devices though
>
> ibstatus
> will show if any of those are running in Infiniband mode
>
> If ibstatus doesn't exist then you probably don't have infiniband, or
> at least lack the packages for using it.
>
> ________________________________________
> From: easybuild-requ...@lists.ugent.be <easybuild-requ...@lists.ugent.be> on behalf of Joaquim Jornet Somoza <j.jornet.som...@gmail.com>
> Sent: Friday, March 15, 2024 11:44
> To: easybuild@lists.ugent.be
> Subject: Re: [easybuild] Failure in UCX-1.14.1-GCCcore-12.3.0.eb when installing foss-2023a.eb
>
> Dear Ake,
>
> How can I check this?
>
> Thank you!
>
> On Fri, 15 Mar 2024, 7:58, Åke Sandgren <ake.sandg...@umu.se> wrote:
> No, there is no bug there.
>
> Which MOFED stack version are you using?
> Or does your system lack Infiniband?
>
> ________________________________________
> From: easybuild-requ...@lists.ugent.be <easybuild-requ...@lists.ugent.be> on behalf of Joaquim Jornet Somoza <j.jornet.som...@gmail.com>
> Sent: Thursday, March 14, 2024 16:02
> To: easybuild@lists.ugent.be
> Subject: [easybuild] Failure in UCX-1.14.1-GCCcore-12.3.0.eb when installing foss-2023a.eb
>
> Dear easybuilders,
>
> I am trying to install foss-2023a.eb on a RH7.7 servers, but when
> installing UCX-1.14.1-GCCcore-12.3.0.eb, the installation fails with
> the following error:
> ...
> libtool: compile: gcc -DHAVE_CONFIG_H -I. -I../../..
> "-DCPU_FLAGS=|avx"
> -I/dev/shm/easybuild/UCX/1.14.1/GCCcore-12.3.0/ucx-1.14.1/src
> -I/dev/shm/easybuild/UCX/1.14.1/GCCcore-12.3.0/ucx-1.14.1
> -I/dev/shm/easybuild/UCX/1.14.1/GCCcore-12.3.0/ucx-1.14.1/src
> -I/software/easybuild/x86_64/software/numactl/2.0.16-GCCcore-12.3.0/include
> -I/software/easybuild/x86_64/software/zlib/1.2.13-GCCcore-12.3.0/include
> -I/software/easybuild/x86_64/software/pkgconf/1.9.5-GCCcore-12.3.0/include
> -I/software/easybuild/x86_64/software/binutils/2.40-GCCcore-12.3.0/include
> -O3 -g -Wall -Werror -mavx -funwind-tables -Wno-missing-field-initializers
> -Wno-unused-parameter -Wno-unused-label -Wno-long-long -Wno-endif-labels
> -Wno-sign-compare -Wno-multichar -Wno-deprecated-declarations -Winvalid-pch
> -Wno-pointer-sign -Werror-implicit-function-declaration
> -Wno-format-zero-length -Wnested-externs -Wshadow
> -Werror=declaration-after-statement -O2 -ftree-vectorize -march=native
> -fno-math-errno -fPIC -MT rc/verbs/libuct_ib_la-rc_verbs_ep.lo -MD -MP -MF
> rc/verbs/.deps/libuct_ib_la-rc_verbs_ep.Tpo -c rc/verbs/rc_verbs_ep.c -o
> rc/verbs/libuct_ib_la-rc_verbs_ep.o >/dev/null 2>&1
> base/ib_md.c: In function 'uct_ib_md_access_flags':
> base/ib_md.c:638:25: error: 'IBV_ACCESS_ON_DEMAND' undeclared (first use in this function); did you mean 'IBV_EXP_ACCESS_ON_DEMAND'?
>   638 |         access_flags |= IBV_ACCESS_ON_DEMAND;
>       |                         ^~~~~~~~~~~~~~~~~~~~
>       |                         IBV_EXP_ACCESS_ON_DEMAND
> base/ib_md.c:638:25: note: each undeclared identifier is reported only once for each function it appears in
> base/ib_md.c: In function 'uct_ib_mem_reg_internal':
> base/ib_md.c:751:24: error: 'IBV_ACCESS_ON_DEMAND' undeclared (first use in this function); did you mean 'IBV_EXP_ACCESS_ON_DEMAND'?
>   751 |     if (access_flags & IBV_ACCESS_ON_DEMAND) {
>       |                        ^~~~~~~~~~~~~~~~~~~~
>       |                        IBV_EXP_ACCESS_ON_DEMAND
> base/ib_md.c: In function 'uct_ib_md_global_odp_init':
> base/ib_md.c:1449:54: error: 'IBV_ACCESS_ON_DEMAND' undeclared (first use in this function); did you mean 'IBV_EXP_ACCESS_ON_DEMAND'?
>  1449 |                            UCT_IB_MEM_ACCESS_FLAGS | IBV_ACCESS_ON_DEMAND,
>       |                                                      ^~~~~~~~~~~~~~~~~~~~
>       |                                                      IBV_EXP_ACCESS_ON_DEMAND
>
> Any hint on how to fix it? Is there a bug with IBV_ACCESS_ON_DEMAND
> variable?
>
> --
> ----------------------------------------------------------------------------------------------------------------------------------------
> *Dr. Joaquim Jornet Somoza*
> *Técnico Superior de Cálculo Científico*
> Servicios Generales a la Investigación (*SGIker*)
> Universidad del País Vasco (*UPV/EHU*)
> email: j.jornet.som...@gmail.com
> Edificio Joxe Maria Korta (Campus Gipuzkoa)
> Av. Tolosa 72, 4a planta
> 20018 Donostia-San Sebastián,
> Gipuzkoa, Spain
>
> /External Collaborator./
> Nano-Bio Spectroscopy group
> Departamento de Física de Materiales
> Universidad del País Vasco (UPV/EHU)
> Donostia-San Sebastián, Gipuzkoa, Spain
>
> The Max Planck Institute for the Structure and Dynamics of Matter (MPSD)
> Bldg. 99 (CFEL)
> Luruper Chaussee 149
> 22761 Hamburg, Germany

--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark,