Re: [OMPI devel] [mpich-discuss] ROMIO+Lustre problems in OpenMPI 1.8.3
mpi/mca/io/romio/romio's config.log:
--
configure:20962: checking lustre/lustre_user.h usability
configure:20962: icc -std=c99 -c -DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2 -m64 -finline-functions -fno-strict-aliasing -restrict -fexceptions -Qoption,cpp,--extended_float_types -pthread -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include conftest.c >&5
/usr/include/sys/quota.h(221): error: identifier "caddr_t" is undefined
      caddr_t __addr) __THROW;
      ^

compilation aborted for conftest.c (code 2)
configure:20962: $? = 2
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "ROMIO"
| #define PACKAGE_TARNAME "romio"
| #define PACKAGE_VERSION "Open MPI"
| #define PACKAGE_STRING "ROMIO Open MPI"
| #define PACKAGE_BUGREPORT "disc...@mpich.org"
| #define PACKAGE_URL "http://www.mpich.org/"
| #define PACKAGE "romio"
| #define VERSION "Open MPI"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_DLFCN_H 1
| #define LT_OBJDIR ".libs/"
| #define HAVE_MPI_OFFSET 1
| #define HAVE_MEMALIGN 1
| #define HAVE_UNISTD_H 1
| #define HAVE_FCNTL_H 1
| #define HAVE_MALLOC_H 1
| #define HAVE_STDDEF_H 1
| #define HAVE_SYS_TYPES_H 1
| #define u_char unsigned char
| #define u_short unsigned short
| #define u_int unsigned int
| #define u_long unsigned long
| #define SIZEOF_INT 4
| #define SIZEOF_VOID_P 8
| #define INT_LT_POINTER 1
| #define HAVE_INT_LT_POINTER 1
| #define SIZEOF_LONG_LONG 8
| #define HAVE_LONG_LONG_64 1
| #define HAVE_MPI_LONG_LONG_INT 1
| #define HAVE_MPI_INFO 1
| #define ROMIO_NFS 1
| #define ROMIO_UFS 1
| #define ROMIO_TESTFS 1
| /* end confdefs.h. */
| #include
| #ifdef HAVE_SYS_TYPES_H
| # include
| #endif
| #ifdef HAVE_SYS_STAT_H
| # include
| #endif
| #ifdef STDC_HEADERS
| # include
| # include
| #else
| # ifdef HAVE_STDLIB_H
| # include
| # endif
| #endif
| #ifdef HAVE_STRING_H
| # if !defined STDC_HEADERS && defined HAVE_MEMORY_H
| # include
| # endif
| # include
| #endif
| #ifdef HAVE_STRINGS_H
| # include
| #endif
| #ifdef HAVE_INTTYPES_H
| # include
| #endif
| #ifdef HAVE_STDINT_H
| # include
| #endif
| #ifdef HAVE_UNISTD_H
| # include
| #endif
| #include
configure:20962: result: no
configure:20962: checking lustre/lustre_user.h presence
configure:20962: icc -std=c99 -E -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include conftest.c
configure:20962: $? = 0
configure:20962: result: yes
configure:20962: WARNING: lustre/lustre_user.h: present but cannot be compiled
configure:20962: WARNING: lustre/lustre_user.h: check for missing prerequisite headers?
configure:20962: WARNING: lustre/lustre_user.h: see the Autoconf documentation
configure:20962: WARNING: lustre/lustre_user.h: section "Present But Cannot Be Compiled"
configure:20962: WARNING: lustre/lustre_user.h: proceeding with the compiler's result
configure:20962: checking for lustre/lustre_user.h
configure:20962: result: no
configure:20971: error: LUSTRE support requested but cannot find lustre/lustre_user.h header file
--
___
discuss mailing list disc...@mpich.org
To manage subscription options or unsubscribe:
https://lists.mpich.org/mailman/listinfo/discuss

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
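The "caddr_t is undefined" error in the log above is the kind of failure you get when a header relies on a BSD-era type that strict language modes hide. The following is a minimal sketch, assuming a glibc-based Linux system, of how a feature-test macro makes caddr_t visible again. The function name caddr_t_size is hypothetical, and defining _GNU_SOURCE is only an illustration; it is not what the ROMIO fix discussed in this thread does.

```c
/*
 * Sketch only: shows that the visibility of caddr_t depends on
 * feature-test macros. Assumes glibc; caddr_t_size is a hypothetical
 * helper, not part of ROMIO or Open MPI.
 */
#define _GNU_SOURCE      /* must appear before the first #include */
#include <stddef.h>
#include <sys/types.h>   /* declares caddr_t only when BSD/misc extensions are enabled */

/* Compiles only if caddr_t is visible; returns its size in bytes. */
size_t caddr_t_size(void)
{
    return sizeof(caddr_t);
}
```

On such a system, removing the #define and compiling with a strict flag such as -std=c99 would be expected to give a similar "undefined identifier" diagnostic, which is consistent with the conftest.c failure above.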
Re: [OMPI devel] [mpich-discuss] BUG in ADIOI_NFS_WriteStrided
D7CFB: SYEnveloppeMessage PAIO::__ecritureIndexeParBlocMPI<__PAIOType, PtrPorteurConst, FunctorCopieInfosSurDansVectPA__Type, std::vector*, std::allocator*> > const>, FunctorAccesseurPorteurLocal<__PtrPorteurConst > >(PAGroupeProcessus&, ompi_file_t*, long long, PtrPorteurConst, PtrPorteurConst, FunctorCopieInfosSurDansVectPA__Type, std::vector*, std::allocator*> > const>&, FunctorAccesseurPorteurLocal<__PtrPorteurConst >&, long, DistributionComposantes&, long, unsigned long, unsigned long, std::string const&) (in /home/mefpp_ericc/GIREF/bin/__Test.LectureEcritureGISMPI.__opt)
==19211==    by 0x4E9A67: GISLectureEcriture::__visiteMaillage(Maillage const&) (in /home/mefpp_ericc/GIREF/bin/__Test.LectureEcritureGISMPI.__opt)
==19211==    by 0x4C79A2: GISLectureEcriture::__ecritGISMPI(std::string, GroupeInfoSur const&, std::string const&) (in /home/mefpp_ericc/GIREF/bin/__Test.LectureEcritureGISMPI.__opt)
==19211==    by 0x4961AD: main (in /home/mefpp_ericc/GIREF/bin/__Test.LectureEcritureGISMPI.__opt)
==19211== Uninitialised value was created by a heap allocation
==19211==    at 0x4C2C27B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19211==    by 0x2745E78E: ADIOI_Malloc_fn (malloc.c:50)
==19211==    by 0x2743757C: ADIOI_NFS_WriteStrided (ad_nfs_write.c:497)
==19211==    by 0x27451963: ADIOI_GEN_WriteStridedColl (ad_write_coll.c:159)
==19211==    by 0x274321BD: MPIOI_File_write_all_begin (write_allb.c:114)
==19211==    by 0x27431DBF: mca_io_romio_dist_MPI_File_write_all_begin (write_allb.c:44)
==19211==    by 0x2742A367: mca_io_romio_file_write_all_begin (io_romio_file_write.c:264)
==19211==    by 0x12126520: PMPI_File_write_all_begin (pfile_write_all_begin.c:74)
==19211==    by 0x4D7CFB: SYEnveloppeMessage PAIO::__ecritureIndexeParBlocMPI<__PAIOType, PtrPorteurConst, FunctorCopieInfosSurDansVectPA__Type, std::vector*, std::allocator*> > const>, FunctorAccesseurPorteurLocal<__PtrPorteurConst > >(PAGroupeProcessus&, ompi_file_t*, long long, PtrPorteurConst, PtrPorteurConst, FunctorCopieInfosSurDansVectPA__Type, std::vector*, std::allocator*> > const>&, FunctorAccesseurPorteurLocal<__PtrPorteurConst >&, long, DistributionComposantes&, long, unsigned long, unsigned long, std::string const&) (in /home/mefpp_ericc/GIREF/bin/__Test.LectureEcritureGISMPI.__opt)
==19211==    by 0x4E9A67: GISLectureEcriture::__visiteMaillage(Maillage const&) (in /home/mefpp_ericc/GIREF/bin/__Test.LectureEcritureGISMPI.__opt)
==19211==    by 0x4C79A2: GISLectureEcriture::__ecritGISMPI(std::string, GroupeInfoSur const&, std::string const&) (in /home/mefpp_ericc/GIREF/bin/__Test.LectureEcritureGISMPI.__opt)
==19211==    by 0x4961AD: main (in /home/mefpp_ericc/GIREF/bin/__Test.LectureEcritureGISMPI.__opt)
==19211==

Can't tell if it is a big issue or not, but I thought I should mention it to the list.

We run without this valgrind error when I use my local disk partition instead of an NFS partition, or if I run with only 1 process (which always has something to write for each PMPI_File_write_all_begin) and write to an NFS partition.

Using openmpi-1.8.4rc3 compiled in "debug" mode:

ompi_info -all: http://www.giref.ulaval.ca/~ericc/ompi_bug/ompi_info.all.184rc3.txt.gz
config.log: http://www.giref.ulaval.ca/~ericc/ompi_bug/config.184rc3.log.gz

Thanks,
Eric

_
devel mailing list de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: http://www.open-mpi.org/community/lists/devel/2014/12/16691.php

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
Re: [OMPI devel] [mpich-discuss] ROMIO+Lustre problems in OpenMPI 1.8.3
On 11/07/2014 06:26 AM, Ralph Castain wrote:

Hi Rob

Following up on this: I cannot find any reference to XOPEN_SOURCE in our included ROMIO source for Lustre. I only found one reference anywhere in ROMIO:

romio/adio/ad_xfs/ad_xfs.h:11:#define _XOPEN_SOURCE 500

Any other suggestions on what could be causing the problem?

I've fixed this in ROMIO by not mucking around with XOPEN_SOURCE at all, in either lustre or xfs or anywhere.

http://git.mpich.org/mpich.git/commit/4e80e1d2b
and
http://git.mpich.org/mpich.git/commit/5a10283bf7

==rob

Thanks
Ralph

On Oct 28, 2014, at 7:32 AM, Rob Latham wrote:

On 10/28/2014 06:00 AM, Paul Kapinos wrote:

Dear Open MPI and ROMIO developers,

We use Open MPI v.1.6.x and 1.8.x in our cluster. We have a Lustre file system and wish to use MPI_IO, so our Open MPI builds are compiled with this flag:

--with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre'

In our newest installation, openmpi/1.8.3, we found that MPI_IO is *broken*. A short search for the root of the evil brought the following to light:

- the ROMIO component 'MCA io: romio' isn't here at all in the affected version, because
- configure of ROMIO has *failed* (cf. logs a, b, c),
- because lustre_user.h was found but could not be compiled.

lustre_user.h cannot be compiled because quota defines won't compile. Ugh, what a mess. A while back I noticed this and fixed it by removing an XOPEN_SOURCE feature test macro:

http://trac.mpich.org/projects/mpich/ticket/1973

Then, on Solaris with --enable-strict we needed to put *back* the XOPEN_SOURCE macro or else pread and pwrite would be undefined.

So what I really need to do is delete XOPEN_SOURCE since it causes such headaches, and on the rare platforms that only have pread/pwrite defined if you take extraordinary measures, if at all, I'll have a ROMIO pread and pwrite that simply do seek + write (or read).
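The "seek + write (or read)" fallback mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: the names fallback_pread, fallback_pwrite and demo_roundtrip are hypothetical and do not come from ROMIO, and restoring the previous file position is an added choice to mimic pread/pwrite semantics more closely.

```c
/*
 * Sketch only: emulate pread/pwrite with lseek + read/write.
 * All function names here are hypothetical; this is not ROMIO code.
 * Unlike real pread/pwrite, the seek and the transfer are not atomic,
 * so this is unsafe when several threads share the file descriptor.
 */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read count bytes at offset, then restore the previous file position. */
ssize_t fallback_pread(int fd, void *buf, size_t count, off_t offset)
{
    off_t saved = lseek(fd, 0, SEEK_CUR);
    if (saved == (off_t)-1 || lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    ssize_t n = read(fd, buf, count);
    int err = errno;                  /* keep read()'s errno across the restore */
    (void)lseek(fd, saved, SEEK_SET);
    errno = err;
    return n;
}

/* Write count bytes at offset, then restore the previous file position. */
ssize_t fallback_pwrite(int fd, const void *buf, size_t count, off_t offset)
{
    off_t saved = lseek(fd, 0, SEEK_CUR);
    if (saved == (off_t)-1 || lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    ssize_t n = write(fd, buf, count);
    int err = errno;                  /* keep write()'s errno across the restore */
    (void)lseek(fd, saved, SEEK_SET);
    errno = err;
    return n;
}

/* Write a short string, read part of it back; returns 0 on success. */
int demo_roundtrip(const char *path)
{
    char buf[5];
    int ok;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ok = fallback_pwrite(fd, "hello world", 11, 0) == 11
      && fallback_pread(fd, buf, 5, 6) == 5
      && memcmp(buf, "world", 5) == 0
      && lseek(fd, 0, SEEK_CUR) == 0;  /* file position was restored */
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

The appeal of this approach is that lseek, read and write are declared without any feature-test macro, so no XOPEN_SOURCE definition is needed. The cost is the lost atomicity noted in the comment.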
For now, please delete the XOPEN_SOURCE line at the very beginning of src/mpi/romio/adio/ad_lustre/ad_lustre_rwcontig.c

==rob

In our system, there are two lustre_user.h available:

$ locate lustre_user.h
/usr/include/linux/lustre_user.h
/usr/include/lustre/lustre_user.h

As I'm not very familiar with Lustre, I just attach both of them.

pk224850@cluster:~[509]$ uname -a
Linux cluster.rz.RWTH-Aachen.DE 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9 13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux

pk224850@cluster:~[510]$ cat /etc/issue
Scientific Linux release 6.5 (Carbon)

Note that openmpi/1.8.1 seems to be fully OK (MPI_IO works) in our environment.

Best
Paul Kapinos

P.S. Is there a configure flag that will enforce ROMIO? That is, when ROMIO is not available, configure would fail. This would make such hidden errors public at installation time.

a) Log in Open MPI's config.log:
--
configure:226781: OMPI configuring in ompi/mca/io/romio/romio
configure:226866: running /bin/sh './configure' --with-file-system=testfs+ufs+nfs+lustre FROM_OMPI=yes CC="icc -std=c99" CFLAGS="-DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2 -m64 -finline-functions -fno-strict-aliasing -restrict -fexceptions -Qoption,cpp,--extended_float_types -pthread" CPPFLAGS=" -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent -I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include" FFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2 -m64 " LDFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2 -m64 -fexceptions " --enable-shared --disable-static --with-file-system=testfs+ufs+nfs+lustre --prefix=/opt/MPI/openmpi-1.8.3/linux/intel --disable-aio --cache-file=/dev/null --srcdir=. --disable-option-checking
configure:226876: /bin/sh './configure' *failed* for ompi/mca/io/romio/romio
configure:226911: WARNING: ROMIO distribution did not configure successfully
configure:227425: checking if MCA component io:romio can compile
configure:227427: result: no
--

b) dump of Open MPI's 'configure' output to the console:
--
checking lustre/lustre_user.h usability... no
checking lustre/lustre_user.h presence... yes
configure: WARNING: lustre/lustre_user.h: present but cannot be compiled
configure: WARNING: lustre/lustre_user.h: check for missing prerequisite headers?
configure: WARNING: lustre/lustre_user.h: see the Autoconf documentation
configure: WARNING: lustre/lustre_user.h: section "Present But Cannot Be Compiled"
configure: WARNING: lustre/lustre_user.h: proceeding with the compiler