I have a system with InfiniPath HCAs, where I've historically tested mtl:psm. For some reason, that appears to have stopped working some time in the past 4 months. However, this report is about something else.

I am using the current master tarball: openmpi-dev-1203-g171d674.tar.bz2

When I ran configure, verbs support was found even though I did not intend to use it. So, I am running with an explicit btl list that omits verbs, and am disabling the broken mtl:psm and mtl:ofi as well. However, I am still getting complaints from some verbs-related code:

$ mpirun -mca btl sm,self,tcp -mca mtl ^psm,ofi -np 2 -host n15,n16 examples/ring_c
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
--------------------------------------------------------------------------
Fork support was requested but the library call ibv_fork_init() failed.
Hostname: n16
Error (22): Invalid argument
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Fork support was requested but the library call ibv_fork_init() failed.
Hostname: n15
Error (22): Invalid argument
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Fork support was requested but the library call ibv_fork_init() failed.
Hostname: n16
Error (22): Invalid argument
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Fork support was requested but the library call ibv_fork_init() failed.
Hostname: n15
Error (22): Invalid argument
--------------------------------------------------------------------------
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
Process 1 exiting

There are at least THREE things "wrong" in my opinion.

The first is the following two lines:

libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0

However, I can run ibv_devinfo (and see ACTIVE ports) on both of the compute nodes. So, these appear to me to be complaints about the login node (which is simply not on the IB network). I did not ask for ibv, and even if I had, a message about a non-IB login node is just an annoyance.

The second is the "ibv_fork_init()" message, printed twice per host, again when I have NOT requested btl:verbs.

The third is that I had to pass so many mca params just to get as far as this!

I did find that adding "-mca oob tcp" eliminated all the messages. So, I am assuming oob:ud is responsible for this mess. This does not appear to be a very good default behavior.

+ I believe oob:ud should *silently* disqualify itself when the node running "mpirun" is not on the IB network.
+ I don't know why/when the ibv_fork_init() messages came about, but they are quite annoying when I don't even intend to *use* ibv.

-Paul

-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
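
P.S. For anyone hitting the same thing: a minimal sketch of setting the same selections once in the environment instead of on every mpirun command line. It assumes the OMPI_MCA_<param> environment-variable form is honored the same way as the corresponding "-mca <param> <value>" flags; I have not verified that on this particular tarball, so treat it as a starting point rather than a confirmed fix.

```shell
# Same MCA selections as on the mpirun command line in this report,
# expressed as OMPI_MCA_<param> environment variables (assumed to be
# equivalent to "-mca <param> <value>").
export OMPI_MCA_oob=tcp            # the setting that silenced the verbs messages for me
export OMPI_MCA_btl=sm,self,tcp    # explicit btl list that omits verbs
export OMPI_MCA_mtl='^psm,ofi'     # exclude the mtl components that are broken here

# With those set, the launch would shorten to:
#   mpirun -np 2 -host n15,n16 examples/ring_c
```

The '^' value is quoted only to keep older shells from treating it specially; the value itself is unchanged.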