Good point Paul. I love XRC :-)
You may try switching the default configuration to XRC:

  --mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32

If XRC is not supported on your platform, Open MPI should report a helpful error message. BTW, on a multi-core system XRC should show better performance. (A full command-line sketch and the related diagnostic checks are collected at the end of this message.)

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory

On Jan 27, 2011, at 8:19 PM, Paul H. Hargrove wrote:

> Brian,
>
> As Pasha said:
>> The maximum number of supported QPs you may see in ibv_devinfo.
>
> However, you'll probably need "-v":
>
> {hargrove@cvrsvc05 ~}$ ibv_devinfo | grep max_qp:
> {hargrove@cvrsvc05 ~}$ ibv_devinfo -v | grep max_qp:
>         max_qp:                 261056
>
> If you really are running out of QPs due to the "fatness" of the node,
> then you should definitely look at enabling XRC, if your HCA and
> libibverbs version support it. ibv_devinfo can query the HCA capability:
>
> {hargrove@cvrsvc05 ~}$ ibv_devinfo -v | grep port_cap_flags:
>         port_cap_flags:         0x02510868
>
> and look for bit 0x00100000 (== 1<<20).
>
> -Paul
>
> On 1/27/2011 5:09 PM, Barrett, Brian W wrote:
>> Pasha -
>>
>> Is there a way to tell which of the two happened, or to check the number
>> of QPs available per node? The app likely does talk to a large number of
>> peers from each process, and the nodes are fairly "fat" - they are
>> quad-socket, quad-core, running 16 MPI ranks per node.
>>
>> Brian
>>
>> On Jan 27, 2011, at 6:17 PM, Shamis, Pavel wrote:
>>
>>> Unfortunately, verbose error reports are not so friendly... Anyway, I can
>>> think of two possible issues:
>>>
>>> 1. You are trying to open too many QPs. By default IB devices support a
>>> fairly large number of QPs and it is quite hard to push them into this
>>> corner, but if your job is really huge it may be the case. Or, for
>>> example, you may be sharing the compute nodes with some other processes
>>> that create a lot of QPs. The maximum number of supported QPs you may
>>> see in ibv_devinfo.
>>>
>>> 2. The memory limit for registered memory is too low, so the driver
>>> fails to allocate and register memory for the QP. This scenario is the
>>> most common one. It just happened to me recently: the system folks
>>> pushed some bad settings into limits.conf.
>>>
>>> Regards,
>>>
>>> Pavel (Pasha) Shamis
>>> ---
>>> Application Performance Tools Group
>>> Computer Science and Math Division
>>> Oak Ridge National Laboratory
>>>
>>> On Jan 27, 2011, at 5:56 PM, Barrett, Brian W wrote:
>>>
>>>> All -
>>>>
>>>> On one of our clusters, we're seeing the following in one of our
>>>> applications, I believe using Open MPI 1.4.3:
>>>>
>>>> [xxx:27545] *** An error occurred in MPI_Scatterv
>>>> [xxx:27545] *** on communicator MPI COMMUNICATOR 5 DUP FROM 4
>>>> [xxx:27545] *** MPI_ERR_OTHER: known error not in list
>>>> [xxx:27545] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>>> [xxx][[31806,1],0][connect/btl_openib_connect_oob.c:857:qp_create_one]
>>>> error creating qp errno says Resource temporarily unavailable
>>>> --------------------------------------------------------------------------
>>>> mpirun has exited due to process rank 0 with PID 27545 on
>>>> node rs1891 exiting without calling "finalize". This may
>>>> have caused other processes in the application to be
>>>> terminated by signals sent by mpirun (as reported here).
>>>> --------------------------------------------------------------------------
>>>>
>>>> The problem goes away if we modify the eager protocol message sizes so
>>>> that only two QPs are needed instead of the default four. Is there a way
>>>> to bump up the number of QPs that can be created on a node, assuming the
>>>> issue is just running out of available QPs? If not, any other thoughts
>>>> on working around the problem?
>>>>
>>>> Thanks,
>>>>
>>>> Brian
>>>>
>>>> --
>>>> Brian W. Barrett
>>>> Dept. 1423: Scalable System Software
>>>> Sandia National Laboratories
>>
>> --
>> Brian W. Barrett
>> Dept. 1423: Scalable System Software
>> Sandia National Laboratories
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> HPC Research Department                   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
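
Putting Pasha's suggestion into a complete command line, a minimal sketch of
an XRC-enabled run might look like the following. The receive_queues value is
the one Pasha gives above; the process count, hostfile, application name, and
the explicit "--mca btl openib,self,sm" selection are placeholders and
assumptions, not values taken from the thread.

  # Sketch only: use XRC receive queues instead of the default layout.
  # -np, the hostfile, and ./my_app are placeholders for the real job.
  mpirun -np 256 --hostfile ./hosts \
      --mca btl openib,self,sm \
      --mca btl_openib_receive_queues \
          X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32 \
      ./my_app

If XRC is not available on the HCA or in the installed libibverbs, Open MPI
should abort with a descriptive message rather than silently falling back, per
Pasha's note above.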
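
The checks Paul and Pasha describe can be gathered into a few shell commands.
This is a sketch assuming a bash-style shell; the grep/awk patterns simply
match the ibv_devinfo output shown above.

  # 1. Maximum number of QPs the HCA supports (needs -v, as Paul notes).
  ibv_devinfo -v | grep max_qp:

  # 2. XRC capability: bit 0x00100000 (1<<20) of port_cap_flags.
  flags=$(ibv_devinfo -v | awk '/port_cap_flags:/ {print $2; exit}')
  if (( flags & 0x00100000 )); then
      echo "XRC bit set in port_cap_flags ($flags)"
  else
      echo "XRC bit NOT set in port_cap_flags ($flags)"
  fi

  # 3. Registered-memory limit (Pasha's second suspect). This should be
  #    "unlimited" or very large on compute nodes; a small value suggests
  #    limits.conf (or the resource manager) is clamping max locked memory.
  ulimit -l

For the example value above, 0x02510868 & 0x00100000 is non-zero, so that
particular HCA does advertise the XRC capability bit.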
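
As a back-of-envelope check on why a "fat" node can exhaust max_qp with the
default four-queue layout: each rank creates one QP per configured receive
queue for every peer it connects to, so the node-wide total scales as
ranks-per-node x QPs-per-connection x connected peers. The 16 ranks per node
and the 4 QPs per connection come from the thread; the number of connected
peers below is purely hypothetical.

  ranks_per_node=16       # Brian: quad-socket, quad-core, 16 ranks per node
  qps_per_connection=4    # default btl_openib_receive_queues has four entries
  connected_peers=4096    # hypothetical: each rank connects to 4096 peers
  echo $(( ranks_per_node * qps_per_connection * connected_peers ))
  # -> 262144, already above the max_qp of 261056 reported by Paul's HCA

Dropping to a two-queue layout halves that total, which is consistent with
Brian's report that the problem disappears with two QPs per connection; XRC
reduces it further by letting ranks on the same node share receive QPs, hence
Pasha's comment that XRC should help on a multi-core system.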