On Sep 27, 2012, at 7:22 PM, Sébastien Boisvert wrote:

> Without the virtual message router, I get messages like these:
>
> [cp2558][[30209,1],0][connect/btl_openib_connect_oob.c:490:qp_create_one]
> error creating qp errno says Cannot allocate memory

You're running out of registered memory. Check out these FAQ items:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
http://www.open-mpi.org/faq/?category=openfabrics#ib-receive-queues

The second one tells you how to change your receive queue types; Open MPI defaults to 1 per-peer receive queue and several shared receive queues. You might want to change to all shared receive queues.

> The real message tag, the real source and the real destination are stored
> in the MPI tag. I know, this is ugly, but it works. I can not store this
> information in the message buffer because the buffer can be NULL.
>
> bits 0 to 7: tag (8 bits, values from 0 to 255, 256 possible values)
> bits 8 to 19: true source (12 bits, values from 0 to 4095, 4096 possible
> values)
> bits 20 to 31: true destination (12 bits, values from 0 to 4095, 4096
> possible values)
>
> Without the virtual router, my code is compliant with the fact that
> MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB,...) is at least 32767 (my tags
> are <= 255).
>
> When I try jobs with 4096 processes with the virtual message router, I get
> the error:
>
> MPI_ERR_TAG: invalid tag.
>
> Without the virtual message router I get:
>
> [cp2558][[30209,1],0][connect/btl_openib_connect_oob.c:490:qp_create_one]
> error creating qp errno says Cannot allocate memory
>
> With Open-MPI 1.5.4, the upper bound is 17438272 (at least in our build).
> That explains MPI_ERR_TAG.

+1 on what Hristo said -- remember that you get a pointer to an MPI_Aint. So you need to dereference it to get the value back.

> My 2 questions:
>
> 1. Is there a better way to store routing information ?

Seems fine to me. Just stay <=INT_MAX and you should be fine.

> 2. Can I create my own communicator and set its MPI_TAG_UB to whatever I want
> ?

As Hristo said, no. It's a limit in Open MPI.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/