On 06/14/2016 10:38 AM, Andrew Ho wrote:
Hi Andrew,

Since it’s going to be hard to replicate this, could you provide
more information by attaching a debugger and showing the trace of where
the hang is?

Thanks,
Mohamad
Here's the trace I get:
#0 0x000000382f2dba48 in fcntl () from /lib64/libc.so.6
#1 0x00002aaab1138a58 in ADIOI_Set_lock () from mca_io_romio.so
#2 0x00002aaab111a7f2 in ADIOI_NFS_Fcntl () from mca_io_romio.so
#3 0x00002aaab1110b5a in mca_io_romio_dist_MPI_File_get_size () from mca_io_romio.so
#4 0x00002aaab110ecd2 in mca_io_romio_file_get_size () from mca_io_romio.so
#5 0x00002aaaab22e184 in PMPI_File_get_size () from libmpi.so.12
#6 0x00002aaaaabe5a8f in H5FD_mpio_open () from libhdf5_debug.so.9.0.0
#7 0x00002aaaaabce8ec in H5FD_open () from libhdf5_debug.so.9.0.0
#8 0x00002aaaaabb4969 in H5F_open () from libhdf5_debug.so.9.0.0
#9 0x00002aaaaabacedd in H5Fcreate () from libhdf5_debug.so.9.0.0
#10 0x000000000040a09a in main () at main.cpp:70
GDB for some reason isn't giving me any line number information for
HDF5, so I found the lines these calls were happening at:
#6 H5FDmpio.c:1081
#7 H5FD.c:991
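
Frames #0 to #2 show the process waiting inside fcntl(), reached through ROMIO's NFS code path (ADIOI_NFS_Fcntl calling ADIOI_Set_lock). One way to check whether file locking on the mount itself is the problem, independent of HDF5 and MPI, is a small probe along these lines. This is only a sketch: probe_fcntl_lock is a hypothetical helper name, not part of HDF5 or ROMIO, and it uses the non-blocking F_SETLK command, whereas ROMIO presumably issues a blocking request. On an NFS mount whose lock service is unreachable even this call may stall, which would itself be a useful symptom.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of HDF5 or ROMIO): try to take a
// non-blocking shared lock on the whole file, then release it.
// Returns 0 on success, or the errno value of the call that failed.
int probe_fcntl_lock(const char *path)
{
    assert(path != nullptr);

    // Create the file if needed; read access is required for F_RDLCK.
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return errno;

    struct flock lock = {};
    lock.l_type = F_RDLCK;     // shared (read) lock
    lock.l_whence = SEEK_SET;  // offsets are relative to the file start
    lock.l_start = 0;
    lock.l_len = 0;            // 0 means "through end of file"

    int err = 0;
    if (fcntl(fd, F_SETLK, &lock) == -1) {
        // F_SETLK reports a conflict or failure instead of waiting for it.
        err = errno;
    } else {
        // Lock acquired; release it again before closing.
        lock.l_type = F_UNLCK;
        if (fcntl(fd, F_SETLK, &lock) == -1)
            err = errno;
    }

    close(fd);
    return err;
}
```

Calling probe_fcntl_lock() on a scratch file in the same NFS directory as the HDF5 output, from a single process and without mpirun, should return 0 quickly if locking works there. A non-zero return value (for example ENOLCK) or a call that never returns would point at the NFS lock setup rather than at HDF5.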
If you are stuck with NFS, then I would recommend against using parallel
access to that file system. You only have one server anyway.
It's likely you are trying to develop on this system, then deploy
somewhere else, right? But there's no tuning that can eliminate the
file size check (the MPI_File_get_size call in frame #5 of your trace).
For this system you're probably better off without the MPI-IO transfer
property.
==rob
--
Andrew Ho
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5