Dear Steven, dear Meep users!

We upgraded our Meep installation from 0.20.3 (from the Debian package) to 1.0.3 (self-compiled with MPICH and parallel HDF5, with all dependencies taken from the Debian package system).
We are working on an Intel cluster with several nodes. All simulation data is on a file server, and the directories are mounted via NFS on all nodes.

If we start a meep-mpi (1.0.3) simulation on one of these nodes in an NFS-mounted directory, the simulation hangs as soon as file output (to an HDF5 file in this case) starts. If we disable file output, the simulation works well. If we copy the ctl file to a local directory, the simulation works well too, even with file output. We also tried SSH-mounted filesystems instead of NFS, with the same result.

If we compile meep-mpi with serial HDF5, it works for ~50% of the simulations. The remaining calculations abort at an arbitrary time step with the following error message:

    HDF5-DIAG: Error detected in HDF5 library version: 1.6.6 thread 3062614576. Back trace follows.
      #000: ../../../src/H5F.c line 2049 in H5Fopen(): unable to open file
        major(04): File interface
        minor(17): Unable to open file
      #001: ../../../src/H5F.c line 1829 in H5F_open(): unable to read superblock
        major(04): File interface
        minor(24): Read failed
      #002: ../../../src/H5Fsuper.c line 312 in H5F_read_superblock(): truncated file
        major(04): File interface
        minor(21): File has been truncated
    HDF5-DIAG: Error detected in HDF5 library version: 1.6.6 thread 3062614576. Back trace follows.
      #000: ../../../src/H5D.c line 1163 in H5Dopen(): not a location
        major(01): Function arguments
        minor(03): Inappropriate type
      #001: ../../../src/H5G.c line 1928 in H5G_loc(): invalid object ID
        major(01): Function arguments
        minor(05): Bad value
    HDF5-DIAG: Error detected in HDF5 library version: 1.6.6 thread 3062614576. Back trace follows.
      #000: ../../../src/H5D.c line 1266 in H5Dget_space(): not a dataset
        major(01): Function arguments
        minor(03): Inappropriate type
    HDF5-DIAG: Error detected in HDF5 library version: 1.6.6 thread 3062614576. Back trace follows.
      #000: ../../../src/H5S.c line 856 in H5Sget_simple_extent_ndims(): not a data space
        major(01): Function arguments
        minor(03): Inappropriate type
    meep: error on line 548 of h5file.cpp: file data is inconsistent rank for subsequent extend_data
    [0] MPI Abort by user Aborting program !
    [0] Aborting program!
    p0_31944: p4_error: : 1

It seems that the output of field slices with the in-volume command causes the problem.

Has anyone seen this before? How do you access remote file systems when running simulations on several computers?

Thanks and best regards,
Roman & Paul

_______________________________________________
meep-discuss mailing list
meep-discuss@ab-initio.mit.edu
http://ab-initio.mit.edu/cgi-bin/mailman/listinfo/meep-discuss