While I was working on something else, I let the tests run with Open MPI master 
(which, for parallel I/O, is equivalent to the upcoming v4.0.1 release), and 
here is what I found for the HDF5 1.10.4 tests on my local desktop:

In the testpar directory, there is in fact one test that fails for both ompio 
and romio321 in exactly the same manner. I used 6 processes as you did 
(although I used mpirun directly instead of srun). Of the 13 tests in the 
testpar directory, 12 pass correctly (t_bigio, t_cache, t_cache_image, 
testphdf5, t_filters_parallel, t_init_term, t_mpi, t_pflush2, t_pread, 
t_prestart, t_pshutdown, t_shapesame).
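
For reference, this is roughly how I drive the individual test binaries by 
hand instead of going through "make check" (just a sketch; the loop and the 
build-tree location are illustrative, not the exact commands I used):

    cd testpar    # inside the HDF5 1.10.4 build tree
    for t in t_bigio t_cache t_cache_image testphdf5 t_filters_parallel \
             t_init_term t_mpi t_pflush1 t_pflush2 t_pread t_prestart \
             t_pshutdown t_shapesame; do
        mpirun -np 6 ./$t || echo "$t failed"
    done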

The one test that officially fails (t_pflush1) actually reports that it 
passed, but then prints a message indicating that MPI_Abort has been called, 
for both ompio and romio. I will try to investigate this test to see what is 
going on.
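
If it helps to reproduce, this is roughly how I plan to rerun just that test 
under each component; the --mca io flag selects the io component explicitly, 
and the io_base_verbose parameter should only add output about which io 
component gets selected (again a sketch, not my exact invocation):

    mpirun -np 6 --mca io ompio    ./t_pflush1
    mpirun -np 6 --mca io romio321 ./t_pflush1
    mpirun -np 6 --mca io_base_verbose 100 ./t_pflush1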

That being said, your report shows an issue in t_mpi, which passes without 
problems for me. Note, however, that this was not GPFS but a local XFS file 
system; running the tests on GPFS is on my to-do list as well.
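
In case it is useful on your side, a quick way to double-check which file 
system the tests are actually writing to (the path below is a placeholder for 
wherever the testpar files end up):

    df -T /path/to/testpar-workdir      # prints the file system type (xfs, gpfs, nfs, ...)
    stat -f /path/to/testpar-workdir    # shows file system details for that directory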

Thanks
Edgar



> -----Original Message-----
> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
> Gabriel, Edgar
> Sent: Sunday, February 17, 2019 10:34 AM
> To: Open MPI Users <users@lists.open-mpi.org>
> Subject: Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI
> 3.1.3
> 
> I will also run our test suite and the HDF5 test suite on GPFS; I recently 
> got access to a GPFS file system and will report back on that, but it will 
> take a few days.
> 
> Thanks
> Edgar
> 
> > -----Original Message-----
> > From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
> > Ryan Novosielski
> > Sent: Sunday, February 17, 2019 2:37 AM
> > To: users@lists.open-mpi.org
> > Subject: Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI
> > 3.1.3
> >
> >
> > This is on GPFS. I'll try it on XFS to see if it makes any difference.
> >
> > On 2/16/19 11:57 PM, Gilles Gouaillardet wrote:
> > > Ryan,
> > >
> > > What filesystem are you running on ?
> > >
> > > Open MPI defaults to the ompio component, except on Lustre
> > > filesystems, where ROMIO is used. If the issue is related to ROMIO,
> > > that could explain why you did not see any difference; in that case,
> > > you might want to try another filesystem (a local filesystem or NFS,
> > > for example).
> > >
> > >
> > > Cheers,
> > >
> > > Gilles
> > >
> > > On Sun, Feb 17, 2019 at 3:08 AM Ryan Novosielski
> > > <novos...@rutgers.edu> wrote:
> > >>
> > >> I verified that it makes it through to a bash prompt, but I’m a
> > >> little less confident that something make test does doesn’t clear it.
> > >> Any recommendation for a way to verify?
> > >>
> > >> In any case, no change, unfortunately.
> > >>
> > >> Sent from my iPhone
> > >>
> > >>> On Feb 16, 2019, at 08:13, Gabriel, Edgar
> > >>> <egabr...@central.uh.edu>
> > >>> wrote:
> > >>>
> > >>> What file system are you running on?
> > >>>
> > >>> I will look into this, but it might be later next week. I just
> > >>> wanted to emphasize that we are regularly running the parallel
> > >>> hdf5 tests with ompio, and I am not aware of any outstanding items
> > >>> that do not work (and are supposed to work). That being said, I
> > >>> run the tests manually, and not the 'make test'
> > >>> commands. Will have to check which tests are being run by that.
> > >>>
> > >>> Edgar
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
> > >>>> Gilles Gouaillardet
> > >>>> Sent: Saturday, February 16, 2019 1:49 AM
> > >>>> To: Open MPI Users <users@lists.open-mpi.org>
> > >>>> Subject: Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI
> > >>>> 3.1.3
> > >>>>
> > >>>> Ryan,
> > >>>>
> > >>>> Can you
> > >>>>
> > >>>> export OMPI_MCA_io=^ompio
> > >>>>
> > >>>> and try again after you made sure this environment variable is
> > >>>> passed by srun to the MPI tasks ?
> > >>>>
> > >>>> We have identified and fixed several issues specific to the
> > >>>> (default) ompio component, so that could be a valid workaround
> > >>>> until the next release.
> > >>>>
> > >>>> Cheers,
> > >>>>
> > >>>> Gilles
> > >>>>
> > >>>> Ryan Novosielski <novos...@rutgers.edu> wrote:
> > >>>>> Hi there,
> > >>>>>
> > >>>>> Honestly don’t know which piece of this puzzle to look at or how
> > >>>>> to get more information for troubleshooting. I successfully built
> > >>>>> HDF5 1.10.4 with RHEL system GCC 4.8.5 and OpenMPI 3.1.3. Running
> > >>>>> the “make check” in HDF5 is failing at the below point; I am using
> > >>>>> a value of RUNPARALLEL='srun --mpi=pmi2 -p main -t 1:00:00 -n6 -N1'
> > >>>>> and have a SLURM that’s otherwise properly configured.
> > >>>>>
> > >>>>> Thanks for any help you can provide.
> > >>>>>
> > >>>>> make[4]: Entering directory
> > >>>>> `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
> > >>>>> ============================
> > >>>>> Testing  t_mpi
> > >>>>> ============================
> > >>>>>  t_mpi  Test Log
> > >>>>> ============================
> > >>>>> srun: job 84126610 queued and waiting for resources
> > >>>>> srun: job 84126610 has been allocated resources
> > >>>>> srun: error: slepner023: tasks 0-5: Alarm clock
> > >>>>> 0.01user 0.00system 20:03.95elapsed 0%CPU (0avgtext+0avgdata 5152maxresident)k
> > >>>>> 0inputs+0outputs (0major+1529minor)pagefaults 0swaps
> > >>>>> make[4]: *** [t_mpi.chkexe_] Error 1
> > >>>>> make[4]: Leaving directory
> > >>>>> `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
> > >>>>> make[3]: *** [build-check-p] Error 1
> > >>>>> make[3]: Leaving directory
> > >>>>> `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
> > >>>>> make[2]: *** [test] Error 2
> > >>>>> make[2]: Leaving directory
> > >>>>> `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
> > >>>>> make[1]: *** [check-am] Error 2
> > >>>>> make[1]: Leaving directory
> > >>>>> `/scratch/novosirj/install-files/hdf5-1.10.4-build-gcc-4.8-openmpi-3.1.3/testpar'
> > >>>>> make: *** [check-recursive] Error 1
> > >>>>>
> > >>>>> --
> > >>>>>  ____
> > >>>>>  || \\UTGERS,      |---------------------------*O*---------------------------
> > >>>>>  ||_// the State   |     Ryan Novosielski - novos...@rutgers.edu
> > >>>>>  || \\ University  | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> > >>>>>  ||  \\    of NJ   | Office of Advanced Research Computing - MSB C630, Newark
> > >>>>>       `'
> > >
> >
> > --
> >  ____
> >  || \\UTGERS,     |----------------------*O*------------------------
> >  ||_// the State  |    Ryan Novosielski - novos...@rutgers.edu
> >  || \\ University | Sr. Technologist - 973/972.0922 ~*~ RBHS Campus
> >  ||  \\    of NJ  | Office of Advanced Res. Comp. - MSB C630, Newark
> >       `'
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
