Hi Daniel,

I don't see anything "wrong" with your HDF5 usage in the code you sent, but since it isn't the real thing, here are a few things you can check to get a better idea of what is going on:

1) Make sure that collective operations are called collectively, and in the same order, on all processes: http://www.hdfgroup.org/HDF5/doc/RM/CollectiveCalls.html

2) Are you calling MPI_Finalize() before closing the HDF5 file? (You shouldn't.)

3) Add an MPI_Barrier() before the h5fclose_f and check whether all processes hit the barrier and get into h5fclose_f. If they do not, review the collective requirements in 1) again.
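For example, a minimal sketch of the shutdown order and the barrier check from 2) and 3), assuming comm, irank, file_id, and error are the variables already in your subroutine and ierr is just an integer for the MPI error code (the write statement is only there as a marker):

! Diagnostic: confirm that every rank reaches the collective close,
! and keep MPI_Finalize after all HDF5 calls.
call MPI_Barrier(comm, ierr)
write(*,*) 'rank ', irank, ' reached the barrier before h5fclose_f'

call h5fclose_f(file_id, error)   ! collective: every rank must call it
call h5close_f(error)             ! shut down the HDF5 Fortran interface

call MPI_Finalize(ierr)           ! only after the file is closed

If one of the ranks never prints the marker line, it is stuck in an earlier collective call, which points back to 1).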
I would also suggest upgrading to the latest HDF5 release. What MPI implementation and version are you using? It would also be really helpful to send a working program for us to replicate the problem, but I understand that is not always a simple task.

Thanks,
Mohamad

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of Daniel Stokes Hagan
Sent: Thursday, May 28, 2015 5:59 PM
To: [email protected]
Subject: [Hdf-forum] h5fclose_f hangs, or everything runs but produces segmentation faults afterwards

Wasn't sure if I should have revived the following thread, but I *think* I'm having a different issue:
http://hdf-forum.184993.n3.nabble.com/Fwd-H5Fclose-hangs-when-using-parallel-HDF5-td4027003.html

I'm using hdf5-1.8.12 in my code on a PBS cluster. The code works on two different Mac OS X machines (2 and 8 cores). It also runs on the cluster when I use 4 procs (1 node) or 16 procs (2 nodes, 8 procs/node); however, after it runs and all the *.h5 data files are written without error, I get a segmentation fault. If I try other node/proc configurations, such as 32 procs (4 nodes, 8 procs/node), the application hangs when h5fclose_f is called. I have included a truncated version of the subroutine that contains the call to h5fclose_f. Oh, and the code runs without error on any node/proc configuration when I do not use HDF5 output. Any insight or advice will be much appreciated!

==========================================

integer(HID_T)   :: file_id       ! File identifier
integer(HID_T)   :: memspace      ! Memory dataspace identifier
integer(HID_T)   :: dset_id       ! Dataset identifier
integer(HID_T)   :: plist_id      ! Property list identifier
integer(HID_T)   :: filespace     ! Dataspace identifier in file
integer(HSIZE_T) :: array_dims(4) ! Dimensions of the arrays
integer          :: error         ! Error flag
integer(HSIZE_T),  dimension(2) :: chunk_dims
integer(HSIZE_T),  dimension(2) :: block
integer(HSIZE_T),  dimension(2) :: stride
integer(HSIZE_T),  dimension(2) :: count
integer(HSSIZE_T), dimension(2) :: offset
integer :: i,j,k,isc

! Initialize FORTRAN interface and open data set and file
call h5open_f(error)

! Set up file access property list with parallel I/O access
call h5pcreate_f(H5P_FILE_ACCESS_F, plist_id, error)
call h5pset_fapl_mpio_f(plist_id, comm, MPI_INFO_NULL, error)

! Create new h5 file collectively
write(filen,'(i4.4)') nfile
filename = 'HDF5-2D/Field_'//filen//'.h5'
nfile = nfile + 1
call h5fcreate_f(filename, H5F_ACC_TRUNC_F, file_id, error, access_prp = plist_id)
call h5pclose_f(plist_id, error)

! Grid
do i=1,nx
   xhdf(i) = xm(i+imin-1)
end do
do j=1,ny
   yhdf(j) = ym(j+jmin-1)
end do
array_dims(1) = Nx
call h5ltmake_dataset_float_f(file_id, "X", 1, array_dims, xhdf(1:Nx), error)
array_dims(1) = Ny
call h5ltmake_dataset_float_f(file_id, "Y", 1, array_dims, yhdf(1:Ny), error)

! Lagrangian particle data
array_dims(1) = npart
chunk_dims(1) = npart_proc(irank) !npart_
if (irank.eq.1) then
   offset(1) = 0
else
   offset(1) = sum(npart_proc(1:(irank-1)))
end if
stride(1) = 1
count(1)  = 1
block(1)  = chunk_dims(1)

! Particle id
call h5screate_simple_f(1, array_dims, filespace, error)
call h5screate_simple_f(1, chunk_dims, memspace, error)
call h5pcreate_f(H5P_DATASET_CREATE_F, plist_id, error)
call h5pset_chunk_f(plist_id, 1, chunk_dims, error)
call h5dcreate_f(file_id, "part_id", H5T_NATIVE_REAL, filespace, &
                 dset_id, error, plist_id)
call h5sclose_f(filespace, error)
call h5dget_space_f(dset_id, filespace, error)
call h5sselect_hyperslab_f(filespace, H5S_SELECT_SET_F, offset, count, error, &
                           stride, block)
call h5pcreate_f(H5P_DATASET_XFER_F, plist_id, error)
call h5pset_dxpl_mpio_f(plist_id, H5FD_MPIO_INDEPENDENT_F, error)
do i=1,npart_
   bufferp(i) = real(part(i)%id,WP)
end do
call h5dwrite_f(dset_id, H5T_NATIVE_REAL, bufferp(:), array_dims, error, &
                file_space_id = filespace, mem_space_id = memspace, xfer_prp = plist_id)
call h5sclose_f(filespace, error)
call h5sclose_f(memspace, error)
call h5dclose_f(dset_id, error)
call h5pclose_f(plist_id, error)

!!!! There are 7 more Lagrangian datasets in the full code

! Eulerian grid data
array_dims(1) = Ny
array_dims(2) = Nx
chunk_dims(1) = ny_
chunk_dims(2) = nx_
stride(1) = 1
stride(2) = 1
count(1)  = 1
count(2)  = 1
block(1)  = chunk_dims(1)
block(2)  = chunk_dims(2)
offset(1) = jmin_ - nover - 1
offset(2) = imin_ - nover - 1

! U
call h5screate_simple_f(2, array_dims, filespace, error)
call h5screate_simple_f(2, chunk_dims, memspace, error)
call h5pcreate_f(H5P_DATASET_CREATE_F, plist_id, error)
call h5pset_chunk_f(plist_id, 2, chunk_dims, error)
call h5dcreate_f(file_id, "U", H5T_NATIVE_REAL, filespace, &
                 dset_id, error, plist_id)
call h5sclose_f(filespace, error)
call h5dget_space_f(dset_id, filespace, error)
call h5sselect_hyperslab_f(filespace, H5S_SELECT_SET_F, offset, count, error, &
                           stride, block)
call h5pcreate_f(H5P_DATASET_XFER_F, plist_id, error)
call h5pset_dxpl_mpio_f(plist_id, H5FD_MPIO_INDEPENDENT_F, error)
buffer3(:,:) = transpose(Ui(imin_:imax_,jmin_:jmax_,kmin_))
call h5dwrite_f(dset_id, H5T_NATIVE_REAL, buffer3(:,:), array_dims, error, &
                file_space_id = filespace, mem_space_id = memspace, xfer_prp = plist_id)
call h5sclose_f(filespace, error)
call h5sclose_f(memspace, error)
call h5dclose_f(dset_id, error)
call h5pclose_f(plist_id, error)

!!!! There are (potentially) 9 more Eulerian datasets in the full code

! Close the dataset, the file and the FORTRAN interface
call h5fclose_f(file_id, error)
call h5close_f(error)

==========================================

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
