Hi,

We are getting some strange behavior out of pvfs-2.8.1 clients running on some SLES 10 SP1 nodes.

The pvfs2 clients can mount the pvfs2 file system with no problems. We then start an MPI job that runs on a small number of nodes. The problem happens when we try to kill the MPI job: as soon as we send the kill signal to the job, the pvfs2-client-core daemon dies on several of our pvfs2 client nodes with this message:

hpcp6671:~ # ps -ef |grep pvfs
root 25767 1 0 12:21 ? 00:00:00 /bphpc5/vol0/salmr0/opt/pvfs-2.8.1/x86_64/sles10sp1/sbin/pvfs2-client -p /bphpc5/vol0/salmr0/opt/pvfs-2.8.1/x86_64/sles10sp1/sbin/pvfs2-client-core
root 16117 25767 0 15:02 ? 00:00:00 [pvfs2-client-co]

hpcp6671:~ # cat /tmp/pvfs2-client.log
[E 12:21:35.567169] PVFS Client Daemon Started. Version 2.8.1
[D 12:21:35.567434] [INFO]: Mapping pointer 0x2acdf7aa3000 for I/O.
[D 12:21:35.579256] [INFO]: Mapping pointer 0x2acdf8ea5000 for I/O.
[E 15:02:54.988860] PVFS2 client: signal 11, faulty address is 0x41d5, from 0x408d81
[E 15:02:54.989282] [bt] pvfs2-client-core [0x408d81]
[E 15:02:54.989294] [bt] pvfs2-client-core [0x408d81]
[E 15:02:54.989302] [bt] pvfs2-client-core(main+0xbc3) [0x40a173]
[E 15:02:54.989309] [bt] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2acdf788b154]
[E 15:02:54.989315] [bt] pvfs2-client-core [0x403519]
[E 15:02:54.991351] Child process with pid 25768 was killed by an uncaught signal 6
[E 15:02:54.993980] PVFS Client Daemon Started. Version 2.8.1
[D 15:02:54.994242] [INFO]: Mapping pointer 0x2b94619a2000 for I/O.
[D 15:02:55.008318] [INFO]: Mapping pointer 0x2b9462da4000 for I/O.
[E 15:02:55.312456] Got an unrecognized/unimplemented vfs operation of type ff000000.
[E 15:02:55.312497] Post of op: PVFS_VFS_OP_INVALID failed!

Any ideas?

Thanks,
Rene
