Hi,

I compiled openmpi-3.1.2 with PGI 18.10 on our IBM POWER9 machine, and then used that openmpi-3.1.2 build to compile parallel-netcdf-1.8.1. The build completes, but the first test run by make check (nc_test) crashes with a segmentation fault inside MPI_Init:
./nc_test -c -d .
[c699login01:12104] mca_base_component_repository_open: unable to open mca_plm_lsf: libbat.so: cannot open shared object file: No such file or directory (ignored)
[c699login01:12104] mca_base_component_repository_open: unable to open mca_ras_lsf: libbat.so: cannot open shared object file: No such file or directory (ignored)
--------------------------------------------------------------------------
WARNING: There are more than one active ports on host 'c699login01', but the
default subnet GID prefix was detected on more than one of these
ports. If these ports are connected to different physical IB
networks, this configuration will fail in Open MPI. This version of
Open MPI requires that every physically separate IB subnet that is
used between connected MPI processes must have different subnet ID
values.

Please see this FAQ entry for more details:

  http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid

NOTE: You can turn off this warning by setting the MCA parameter
      btl_openib_warn_default_gid_prefix to 0.
--------------------------------------------------------------------------
[c699login01:12103] *** Process received signal ***
[c699login01:12103] Signal: Segmentation fault (11)
[c699login01:12103] Signal code: (3)
[c699login01:12103] Failing at address: 0x615f6c61706f0064
[c699login01:12103] [ 0] [0x2000000504d8]
[c699login01:12103] [ 1] [0x34333164]
[c699login01:12103] [ 2] /lib64/libc.so.6(__sbrk+0x98)[0x200000729b28]
[c699login01:12103] [ 3] /lib64/libc.so.6(__default_morecore+0x18)[0x2000006aece8]
[c699login01:12103] [ 4] /lib64/libc.so.6(+0x9511c)[0x2000006a511c]
[c699login01:12103] [ 5] /lib64/libc.so.6(+0x96ff4)[0x2000006a6ff4]
[c699login01:12103] [ 6] /lib64/libc.so.6(__libc_malloc+0x8c)[0x2000006a938c]
[c699login01:12103] [ 7] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libopen-pal.so.40(opal_show_help_yylex+0x98)[0x20000099a8e0]
[c699login01:12103] [ 8] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libopen-pal.so.40(opal_show_help_vstring+0x25c)[0x20000099a2f4]
[c699login01:12103] [ 9] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libopen-rte.so.40(orte_show_help+0x70)[0x200000838f48]
[c699login01:12103] [10] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/openmpi/mca_btl_openib.so(+0x160d4)[0x2000036060d4]

For details about the error, please see the attached file.

Best,
Zhifeng

make -C test testing
make[1]: Entering directory `/autofs/home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/test'
make -w -C common testing
make[2]: Entering directory `/autofs/home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/test/common'
make[2]: Nothing to be done for `testing'.
make[2]: Leaving directory `/autofs/home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/test/common'
make -w -C nc_test testing
make[2]: Entering directory `/autofs/home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/test/nc_test'
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o nc_test nc_test.o error.o util.o test_get.o test_put.o test_iget.o test_iput.o test_read.o test_write.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o t_nc t_nc.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o tst_misc tst_misc.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o tst_norm tst_norm.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o tst_small tst_small.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o tst_names tst_names.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o tst_atts3 tst_atts3.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o tst_atts tst_atts.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
/home/vy57456/application/pgi/18.10/openmpi-3.1.2/bin/mpicc -g -o tst_nofill tst_nofill.o -L../common -L/home/vy57456/application/pgi/18.10/netcdf-4.6.1/lib /home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/src/lib/libpnetcdf.a -ltestutils -lm
rm -f ./scratch.nc
rm -f ./testfile.nc
rm -f ./tooth-fairy.nc
./nc_test -c -d .
[c699login01:12104] mca_base_component_repository_open: unable to open mca_plm_lsf: libbat.so: cannot open shared object file: No such file or directory (ignored)
[c699login01:12104] mca_base_component_repository_open: unable to open mca_ras_lsf: libbat.so: cannot open shared object file: No such file or directory (ignored)
--------------------------------------------------------------------------
WARNING: There are more than one active ports on host 'c699login01', but the
default subnet GID prefix was detected on more than one of these
ports. If these ports are connected to different physical IB
networks, this configuration will fail in Open MPI. This version of
Open MPI requires that every physically separate IB subnet that is
used between connected MPI processes must have different subnet ID
values.

Please see this FAQ entry for more details:

  http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid

NOTE: You can turn off this warning by setting the MCA parameter
      btl_openib_warn_default_gid_prefix to 0.
--------------------------------------------------------------------------
[c699login01:12103] *** Process received signal ***
[c699login01:12103] Signal: Segmentation fault (11)
[c699login01:12103] Signal code: (3)
[c699login01:12103] Failing at address: 0x615f6c61706f0064
[c699login01:12103] [ 0] [0x2000000504d8]
[c699login01:12103] [ 1] [0x34333164]
[c699login01:12103] [ 2] /lib64/libc.so.6(__sbrk+0x98)[0x200000729b28]
[c699login01:12103] [ 3] /lib64/libc.so.6(__default_morecore+0x18)[0x2000006aece8]
[c699login01:12103] [ 4] /lib64/libc.so.6(+0x9511c)[0x2000006a511c]
[c699login01:12103] [ 5] /lib64/libc.so.6(+0x96ff4)[0x2000006a6ff4]
[c699login01:12103] [ 6] /lib64/libc.so.6(__libc_malloc+0x8c)[0x2000006a938c]
[c699login01:12103] [ 7] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libopen-pal.so.40(opal_show_help_yylex+0x98)[0x20000099a8e0]
[c699login01:12103] [ 8] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libopen-pal.so.40(opal_show_help_vstring+0x25c)[0x20000099a2f4]
[c699login01:12103] [ 9] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libopen-rte.so.40(orte_show_help+0x70)[0x200000838f48]
[c699login01:12103] [10] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/openmpi/mca_btl_openib.so(+0x160d4)[0x2000036060d4]
[c699login01:12103] [11] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/openmpi/mca_btl_openib.so(+0x121dc)[0x2000036021dc]
[c699login01:12103] [12] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libopen-pal.so.40(mca_btl_base_select+0x1f8)[0x2000009a4910]
[c699login01:12103] [13] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/openmpi/mca_bml_r2.so(mca_bml_r2_component_init+0x28)[0x200003594240]
[c699login01:12103] [14] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libmpi.so.40(mca_bml_base_init+0xd0)[0x200000240138]
[c699login01:12103] [15] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libmpi.so.40(ompi_mpi_init+0xa20)[0x2000001e8858]
[c699login01:12103] [16] /home/vy57456/application/pgi/18.10/openmpi-3.1.2/lib/libmpi.so.40(MPI_Init+0xac)[0x20000021daa4]
[c699login01:12103] [17] ./nc_test[0x10003be4]
[c699login01:12103] [18] /lib64/libc.so.6(+0x25100)[0x200000635100]
[c699login01:12103] [19] /lib64/libc.so.6(__libc_start_main+0xc4)[0x2000006352f4]
[c699login01:12103] *** End of error message ***
[c699login01:12104] 1 more process has sent help message help-mpi-btl-openib.txt / default subnet prefix
[c699login01:12104] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
make[2]: *** [testing] Segmentation fault
make[2]: Leaving directory `/autofs/home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/test/nc_test'
make[1]: *** [check-nc_test] Error 2
make[1]: Leaving directory `/autofs/home/vy57456/application/pgi/source_code/parallel-netcdf-1.8.1/test'
make: *** [check] Error 2
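
The warning in the log names an MCA parameter, btl_openib_warn_default_gid_prefix, that turns the warning off. A minimal sketch of setting it is shown here, assuming it is passed through the environment with Open MPI's OMPI_MCA_<name> convention so that the test run picks it up without editing the Makefile. The backtrace shows the crash inside the help-message path (orte_show_help, opal_show_help_yylex), so silencing the warning may sidestep it, but that is untested and would not explain the underlying fault.

```shell
# Assumption: the parameter is supplied via the environment (OMPI_MCA_<name>),
# which Open MPI reads at startup; it could equally be given on an mpirun
# command line as "--mca btl_openib_warn_default_gid_prefix 0".
export OMPI_MCA_btl_openib_warn_default_gid_prefix=0

# Confirm the value that child processes (make, nc_test) will inherit.
echo "btl_openib_warn_default_gid_prefix=${OMPI_MCA_btl_openib_warn_default_gid_prefix}"

# Then rerun the failing test from the test/nc_test directory, for example:
#   ./nc_test -c -d .
```

If the segmentation fault persists with the warning disabled, the problem is not specific to printing this one help message.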