On Thu, May 29, 2008 at 04:24:18PM -0300, Davi Vercillo C. Garcia wrote:
> Hi,
>
> I'm trying to run my program in my environment and some problems are
> happening. My environment is based on PVFS2 over NFS (PVFS is mounted
> over an NFS partition), OpenMPI and Ubuntu. My program uses the MPI-IO and
> BZ2 development libraries. When I tried to run, a message appeared:
>
> File locking failed in ADIOI_Set_lock. If the file system is NFS, you
> need to use NFS version 3, ensure that the lockd daemon is running on
> all the machines, and mount the directory with the 'noac' option (no
> attribute caching).
> [campogrande05.dcc.ufrj.br:05005] MPI_ABORT invoked on rank 0 in
> communicator MPI_COMM_WORLD with errorcode 1
> mpiexec noticed that job rank 1 with PID 5008 on node campogrande04
> exited on signal 15 (Terminated).
Hi. NFS has pretty sloppy consistency semantics. If you want parallel I/O to NFS, you have to turn off some caches (the 'noac' option mentioned in your error message) and work pretty hard to flush the client-side caches (which ROMIO does for you using fcntl locks). If you do all this, performance will be really bad, but you'll get correct results.

Your NFS-exported PVFS volumes will still give you pretty decent serial I/O performance, since that's the only case where NFS caching helps. I'd suggest, though, that you try using straight PVFS for your MPI-IO application, as long as the parallel clients have access to all of the PVFS servers (if tools like pvfs2-ping and pvfs2-ls work, then you do). You'll get better performance for a variety of reasons, and you can keep your NFS-exported PVFS volumes up at the same time.

Oh, I see you want to use ordered I/O in your application. PVFS doesn't support that mode. However, since you know how much data each process wants to write, a combination of MPI_Scan (to compute each process's offset) and MPI_File_write_at_all (to carry out the collective I/O) will give you the same result, likely with better performance, and has the nice side effect of working with PVFS. A rough sketch follows below.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
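
To make the MPI_Scan + MPI_File_write_at_all idea concrete, here is a minimal
sketch of the pattern, assuming each rank writes raw bytes. The function name,
buffer, and count variables are placeholders; adapt them to however your
program stages its data.

    #include <mpi.h>

    /* Each rank writes "mycount" bytes from "mybuf"; the ranks end up packed
     * in rank order, which is what ordered-mode I/O would have produced. */
    void write_in_rank_order(MPI_Comm comm, char *filename,
                             void *mybuf, int mycount)
    {
        MPI_File fh;
        long long mybytes = mycount, end = 0;
        MPI_Offset myoffset;

        /* Inclusive prefix sum of byte counts; subtracting our own count
         * gives this rank's starting offset. */
        MPI_Scan(&mybytes, &end, 1, MPI_LONG_LONG, MPI_SUM, comm);
        myoffset = (MPI_Offset)(end - mybytes);

        MPI_File_open(comm, filename, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        /* Collective write at the per-rank offset computed above. */
        MPI_File_write_at_all(fh, myoffset, mybuf, mycount, MPI_BYTE,
                              MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
    }

If each rank writes typed elements rather than raw bytes, scan the element
counts instead and pass the matching datatype to MPI_File_write_at_all in
place of MPI_BYTE.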