[OMPI devel] Build fails for Git versions (master and v4.0.x)
Hello!

After I ran into problems with a self-compiled OpenMPI 4.0.1 and CP2K ('make test' fails for the latter, and a couple of input files are dysfunctional with the MPI version), I thought it might help to give the Git version of OpenMPI a try. However, I can build neither 'v4.0.x' (673ddae) nor 'master' (7b7ad5e). Both fail during the linking of 'libopen-pal.so'. Is this expected?

The error for 'master' is ('v4.0.x' shows a different line number in the Makefile):

> make[2]: Entering directory '/dev/shm/Setup/build/openmpi-git/opal/tools/wrappers'
>   CC       opal_wrapper.o
>   CCLD     opal_wrapper
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_crs_none_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_reachable_netlink_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_pstat_linux_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_shmem_posix_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_tcp_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_patcher_overwrite_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_uct_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_allocator_bucket_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_shmem_sysv_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_pmix_isolated_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_vader_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_shmem_mmap_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_pmix_pmix4x_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_self_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_allocator_basic_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_rcache_grdma_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_mpool_hugepage_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_btl_sm_component'
> ../../../opal/.libs/libopen-pal.so: undefined reference to `mca_reachable_weighted_component'
> collect2: error: ld returned 1 exit status
> Makefile:1836: recipe for target 'opal_wrapper' failed

Software used:
- automake (GNU automake) 1.15
- m4 (GNU M4) 1.4.18
- autoconf (GNU Autoconf) 2.69
- libtoolize (GNU libtool) 2.4.6
- flex 2.6.1
- gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
- UCT version=1.5.1 revision 7e67a4b

Build process:

> $ git clone … ompi; git checkout $BRANCH
> $ cd ompi
> $ ./autogen.pl &> auto.log
> $ ./configure --prefix=$DIR --disable-timing --disable-mpi-cxx
>     --enable-shared --enable-weak-symbols --enable-binaries --enable-mpi
>     --enable-mpi-interface-warning --enable-mpi-fortran --enable-c11-atomics
>     --enable-builtin-atomics --enable-fast-install --enable-mpi1-compatibility
>     --without-cuda --without-verbs --with-ucx=${PATH_TO_UCX} --disable-debug
>     --disable-mem-debug &> configure.log
> $ make -j 8 &> make.log

I also tried a serial build to avoid potential races in the build process, but to no avail.

The respective log files are attached in compressed form and, for your convenience, are also available online:

auto.log      -> https://pastebin.com/2w5RDNdc
configure.log -> https://pastebin.com/chWtk4pw
make.log      -> https://pastebin.com/kYWscGYD

As a side question: Are there any functionality tests for OpenMPI in the sense that they check whether communication works properly, i.e. no lost messages, message contents unchanged, …?

Regards,
Jan

ompi-master.tar.bz2
Description: Binary data

___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
Re: [OMPI devel] Build fails for Git versions (master and v4.0.x)
On 31.07.19 22:12, Jeff Squyres (jsquyres) wrote:
> Just to make sure you're not dealing with anything left over from an old /
> stale build:
>
>     cd top-of-source-tree
>     git clean -dfx
>     ./autogen.pl |& tee auto.out
>     ./configure ... |& tee config.out
>     make V=1 -j 8 |& tee make.out

Thanks a lot. This completely fixed those build problems. I used 'git clean -df' (without x) before and could have sworn I also tried a fresh clone … well, obviously I hadn't.

Any suggestions for my question about a test suite for (Open)MPI that also covers correct communication? It would be great to have some way to check my setup “layer by layer”.

Regards,
Jan
Re: [OMPI devel] Build fails for Git versions (master and v4.0.x)
On 31.07.19 23:54, Jeff Squyres (jsquyres) wrote:
> We don't really have any test suites that just test, for example, the
> BTLs. We usually rely on the usual MPI benchmarks and test suites
> (e.g., the Intel MPI benchmarks have a correctness-checking mode).

I guess I'll also move in this direction. Thanks again for your help!

Regards,
Jan
[OMPI devel] Debug options break build
When I switch on various debug options, my builds of OpenMPI with UCX fail (and this time I made sure it's not due to my own stupidity … I hope). The problematic options and the respective compiler errors are

'--enable-timing'

  Making all in mca/ess/pmi
  make[2]: Entering directory '/dev/shm/openmpi-4.0.2rc2/build/orte/mca/ess/pmi'
    CC       ess_pmi_component.lo
    CC       ess_pmi_module.lo
  In file included from ../../../../../orte/mca/ess/pmi/ess_pmi_module.c:57:
  ../../../../../orte/mca/ess/pmi/ess_pmi_module.c: In function ‘rte_init’:
  ../../../../../orte/mca/ess/pmi/ess_pmi_module.c:467:26: error: ‘ess_base_setup’ undeclared (first use in this function); did you mean ‘event_base_set’?
       OPAL_TIMING_ENV_NEXT(ess_base_setup, "state_framework_open");
                            ^~
  ../../../../../opal/util/timings.h:103:13: note: in definition of macro ‘OPAL_TIMING_ENV_NEXT’
       if( h->enabled ){ \
           ^
  ../../../../../orte/mca/ess/pmi/ess_pmi_module.c:467:26: note: each undeclared identifier is reported only once for each function it appears in
       OPAL_TIMING_ENV_NEXT(ess_base_setup, "state_framework_open");
                            ^~
  ../../../../../opal/util/timings.h:103:13: note: in definition of macro ‘OPAL_TIMING_ENV_NEXT’
       if( h->enabled ){ \
           ^
  make[2]: *** [Makefile:1857: ess_pmi_module.lo] Error 1
  make[2]: Leaving directory '/dev/shm/openmpi-4.0.2rc2/build/orte/mca/ess/pmi'

and '--enable-mem-debug'

  Making all in profile
  make[3]: Entering directory '/dev/shm/ompi/build/oshmem/shmem/c/profile'
    LN_S     pshmem_init.c
    LN_S     pshmem_finalize.c
  […]
    CC       pshmem_put.lo
    CC       pshmem_g.lo
  pshmem_free.c: In function ‘_shfree’:
  pshmem_free.c:65:39: error: macro "free" passed 2 arguments, but takes just 1
       rc = s->allocator->free(s, ptr);
                                     ^
  pshmem_free.c:65:12: warning: assignment to ‘int’ from ‘int (*)(map_segment_t *, void *)’ {aka ‘int (*)(struct map_segment *, void *)’} makes integer from pointer without a cast [-Wint-conversion]
       rc = s->allocator->free(s, ptr);
          ^
  make[3]: *** [Makefile:1964: pshmem_free.lo] Error 1
  make[3]: *** Waiting for unfinished jobs
  pshmem_realloc.c: In function ‘_shrealloc’:
  pshmem_realloc.c:59:56: error: macro "realloc" passed 4 arguments, but takes just 2
       rc = s->allocator->realloc(s, size, ptr, &pBuff);
                                                      ^
  pshmem_realloc.c:59:12: warning: assignment to ‘int’ from ‘int (*)(map_segment_t *, size_t, void *, void **)’ {aka ‘int (*)(struct map_segment *, long unsigned int, void *, void **)’} makes integer from pointer without a cast [-Wint-conversion]
       rc = s->allocator->realloc(s, size, ptr, &pBuff);
          ^
  make[3]: *** [Makefile:1964: pshmem_realloc.lo] Error 1
  make[3]: Leaving directory '/dev/shm/ompi/build/oshmem/shmem/c/profile'

While preparing this report, I noticed that the '--enable-timing' bug has already been fixed on 'master' with commit 8e7d874e14a5485dceff836419e36b6b24a66f48. It would be nice if this could make it into the 'v4.0.x' branch.

Software used:
- automake (GNU automake) 1.16.1
- m4 (GNU M4) 1.4.18
- autoconf (GNU Autoconf) 2.69
- libtoolize (GNU libtool) 2.4.6
- flex 2.6.4
- gcc (Debian 8.3.0-6) 8.3.0
- UCT version=1.6.1-rc2

Build process:

  $ git clone https://github.com/open-mpi/ompi.git
  $ cd ompi
  $ ./autogen.pl &> auto.log
  $ ./configure --prefix=${DIR} --with-ucx=${PATH_TO_UCX} --enable-mem-debug &> configure.log
  $ make -j 8 &> make.log

The respective log files are attached in compressed form and, for your convenience, are also available online:

auto.log      -> https://pastebin.com/cysbi3Vx
configure.log -> https://pastebin.com/rEcngh6D
make.log      -> https://pastebin.com/HMETcSVA

Regards,
Jan

logs.tar.bz2
Description: Binary data
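The 'macro "free" passed 2 arguments, but takes just 1' message in the report above is what the C preprocessor prints when a one-argument function-like macro named free is in scope and a struct member of the same name is called with two arguments. The sketch below reproduces that pattern in isolation. It assumes, without confirmation from this thread, that '--enable-mem-debug' puts such a macro in scope; debug_free, struct segment, struct allocator, seg_free, table and release are hypothetical names chosen for illustration and are not Open MPI code. Parenthesizing the member name is one way to keep the preprocessor from expanding it.

```c
#include <assert.h>
#include <stdlib.h>

struct segment;

/* Hypothetical allocator table with a member that happens to be named "free". */
struct allocator {
    int (*free)(struct segment *s, void *ptr);
};

struct segment {
    struct allocator *allocator;
    int frees;                  /* how often the member function ran */
};

static int seg_free(struct segment *s, void *ptr)
{
    (void)ptr;                  /* bookkeeping only; the caller still owns ptr */
    s->frees++;
    return 0;
}

static struct allocator table = { seg_free };

static int debug_calls = 0;

/* Defined before the macro, so "free" in this body is still the libc function. */
static void debug_free(void *ptr, const char *file, int line)
{
    (void)file;
    (void)line;
    debug_calls++;
    free(ptr);
}

/* Assumed stand-in for a memory-debug wrapper: a one-argument macro. */
#define free(ptr) debug_free((ptr), __FILE__, __LINE__)

int release(struct segment *s, void *ptr)
{
    assert(s != NULL && s->allocator != NULL);
    /* Writing s->allocator->free(s, ptr) here would not compile: the name
     * "free" followed by "(" triggers the one-argument macro, which then
     * receives two arguments. With the member name in parentheses, the
     * next token after "free" is ")", so no expansion takes place. */
    return (s->allocator->free)(s, ptr);
}
```

Calling release(&seg, p) on a segment initialized with &table runs seg_free and leaves debug_calls untouched, while a plain free(p) afterwards still goes through debug_free. Whether the actual fix in Open MPI takes this form is not stated in the thread.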
Re: [OMPI devel] Debug options break build
On 19.09.19 22:40, Jeff Squyres (jsquyres) wrote:
> I am unable to reproduce these issues on master HEAD; assumedly they have
> something to do with UCX...? I filed
> https://github.com/open-mpi/ompi/issues/6995 to track the issue.

Yes, builds using '--enable-mem-debug' fail only when they also involve UCX. Sorry for not pointing that out explicitly.

'--enable-timing' breaks regardless of UCX. As mentioned, this has already been fixed in 'master' (commit 8e7d874e14a5485dceff836419e36b6b24a66f48). It would be great to also have this fix in 'v4.0.x'.

Regards,
Jan