Hello, I have run into a strange problem with QLogic OFED and openmpi. When I submit two jobs (through SGE) to the same node, the second job fails with:

(ipath/PSM)[10292]: can't open /dev/ipath, network down (err=26)

I'm pretty sure the InfiniBand fabric is working, since the other job runs fine.

Here are the details of the configuration:

QLogic HCA: InfiniPath_QMH7342 (2 ports, but only one connected to a switch)
qlogic_ofed-1.5.3-7.0.0.0.35 (Rocks cluster roll)
openmpi 1.5.4 (./configure --with-psm --with-openib --with-sge)

-------------

To work around this problem I recompiled openmpi without PSM support, but then I ran into another problem:

The OpenFabrics (openib) BTL failed to initialize while trying to
allocate some locked memory.  This typically can indicate that the
memlock limits are set too low.  For most HPC installations, the
memlock limits should be set to "unlimited".  The failure occured
here:

  Local host:    compute-0-6.local
  OMPI source:   btl_openib.c:329
  Function:      ibv_create_srq()
  Device:        qib0
  Memlock limit: *unlimited*
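
The error above reports the memlock limit as unlimited, but the limit a process sees when started by a batch system can differ from the one in an interactive login shell. One possible way to check this is to submit a small script through SGE on the affected node and compare its output with what the same script prints interactively. This is only a minimal sketch, assuming Python 3 is available on the compute nodes; the function names `format_limit` and `memlock_limits` are made up for illustration and are not part of Open MPI or OFED.

```python
import resource


def format_limit(value):
    """Render an rlimit value as 'unlimited' or as a byte count."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK of this process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return format_limit(soft), format_limit(hard)


if __name__ == "__main__":
    # Run once from a login shell and once inside an SGE job, then compare.
    soft, hard = memlock_limits()
    print(f"memlock soft limit: {soft}")
    print(f"memlock hard limit: {hard}")
```

If the two runs print different values, the limit is probably being set by the environment that launches the job rather than by the user's shell configuration.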