On Oct 28, 2011, at 1:59 AM, Eugene Loh wrote:

> In our MTT testing, we see ibm/io/file_status_get_count fail occasionally
> with:
>
> File locking failed in ADIOI_Set_lock(fd A,cmd F_SETLKW/7,type
> F_RDLCK/0,whence 0) with return value FFFFFFFF and errno 5.
> - If the file system is NFS, you need to use NFS version 3, ensure that the
>   lockd daemon is running on all the machines, and mount the directory with
>   the 'noac' option (no attribute caching).
> - If the file system is LUSTRE, ensure that the directory is mounted with the
>   'flock' option.
> ADIOI_Set_lock:: Input/output error
> ADIOI_Set_lock:offset 0, length 1
>
> One of the curious things (to us) about this test is that no one else appears
> to run it. Looking back through a lot of MTT results, essentially the only
> results reported are Oracle. Almost no non-Oracle results for this test have
> been reported in the last few months. Is there something special about this
> test we should know about?

Not that I'm aware of. I see why Cisco skipped it -- I didn't have the "io" directory listed in my list of IBM directories to traverse. Doh! That's been fixed.

(Cisco's MTT runs look like they need a bit of TLC -- I'm guessing IB is down on a node or two, resulting in a lot of false failures, but I likely won't have time to look at them until after SC :-( )

> P.S. We're also interested in understanding the error message better. I
> suppose that's more appropriately taken up with ROMIO folks, which I will do,
> but if anyone on this list has useful information I'd love to hear it. The
> error apparently comes when MPI_File_get_size sets a lock. Each process has
> its own file and the test usually passes, so it's unclear to me what the
> problem is. Further, the error message discussing NFS and Lustre strikes me
> as rather speculative. We tend to run these tests repeatedly on the same
> file systems from the same test nodes. Anyone have any idea how sound the
> NFSv3/lockd/noac advice is or what the real issue is here?

No. You'll need to ask Rob Latham.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/