Hello,

I am running Ray (distributed software for genomics) with Open-MPI on 2048 processes and everything runs fine. Ray has an any-to-any communication pattern. To avoid using too much memory, I implemented a virtual message router.

Without the virtual message router, I get messages like this one:

[cp2558][[30209,1],0][connect/btl_openib_connect_oob.c:490:qp_create_one] error creating qp errno says Cannot allocate memory

We did some tests on the Cray XE6 with 4096 processing elements (4096 MPI ranks) without the virtual message router, and everything runs fine as is. So using the virtual message router is not required there.

With the virtual message router, the real message tag, the real source and the real destination are stored in the MPI tag. I know, this is ugly, but it works. I cannot store this information in the message buffer because the buffer can be NULL.

bits 0 to 7: tag (8 bits, values from 0 to 255, 256 possible values)
bits 8 to 19: true source (12 bits, values from 0 to 4095, 4096 possible values)
bits 20 to 31: true destination (12 bits, values from 0 to 4095, 4096 possible values)

Without the virtual router, my code is compliant with the guarantee that the MPI_TAG_UB attribute obtained with MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB,...) is at least 32767 (my tags are <= 255).

When I try jobs with 4096 processes with the virtual message router, I get the error:

MPI_ERR_TAG: invalid tag

Without the virtual message router, I get the same qp_create_one error as above ("error creating qp errno says Cannot allocate memory").

With Open-MPI 1.5.4, the upper bound is 17438272 (at least in our build). That explains MPI_ERR_TAG.

My 2 questions:

1. Is there a better way to store routing information?
2. Can I create my own communicator and set its MPI_TAG_UB to whatever I want?

Thanks!

***
Sébastien Boisvert
Ph.D. student
http://boisvert.info/
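
P.S. To make the tag layout above concrete, here is a minimal sketch of the packing in plain C. It is not Ray's actual code: the function names (pack_route, route_tag, route_source, route_destination, fits_in_tag_ub) and the REPORTED_TAG_UB constant are made up for illustration, and the bound is simply the value I observed in our Open-MPI 1.5.4 build. In a real run the bound would be queried with MPI_Comm_get_attr rather than hard-coded.

```c
#include <stdint.h>

/* Upper bound observed in our Open-MPI 1.5.4 build (assumption: hard-coded
 * here only for illustration; query MPI_TAG_UB at runtime in real code). */
#define REPORTED_TAG_UB 17438272

/* Pack the routing information using the layout described above:
 * bits 0-7 tag, bits 8-19 true source, bits 20-31 true destination. */
static uint32_t pack_route(uint32_t tag, uint32_t source, uint32_t destination)
{
    return (tag & 0xFFu)
         | ((source & 0xFFFu) << 8)
         | ((destination & 0xFFFu) << 20);
}

/* Recover each field from a packed value. */
static uint32_t route_tag(uint32_t packed)         { return packed & 0xFFu; }
static uint32_t route_source(uint32_t packed)      { return (packed >> 8) & 0xFFFu; }
static uint32_t route_destination(uint32_t packed) { return (packed >> 20) & 0xFFFu; }

/* Return 1 if the packed value does not exceed the given tag upper bound
 * (tag_ub is assumed to be non-negative). */
static int fits_in_tag_ub(uint32_t packed, int tag_ub)
{
    return packed <= (uint32_t)tag_ub;
}
```

With the bound I observed, fits_in_tag_ub(pack_route(0, 0, 17), REPORTED_TAG_UB) is already 0, so any true destination of 17 or more produces a tag above the limit. In addition, a true destination of 2048 or more sets bit 31, which does not fit in a non-negative 32-bit int, and MPI tags are passed as int.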