Thanks Pasha for these details.

On Mon, 17 May 2010, Pavel Shamis (Pasha) wrote:

The main blocking point is the receive queues, because they are created during MPI_Init, so in a way, they are the "basic fare" of MPI.
BTW SRQ resources are also allocated on demand. We start with very small SRQ and it is increased on SRQ limit event.
Ok. Understood. So maybe the increased memory is only due to CQs.
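
For the record, here is a minimal sketch of the on-demand SRQ growth Pasha describes, written against plain ibverbs (device/PD/context setup omitted). The initial size, the limit value of 8 and the doubling policy are made-up illustrations, not what the openib BTL actually uses:

    #include <infiniband/verbs.h>

    /* Create a deliberately small SRQ and arm its limit event. */
    static struct ibv_srq *create_small_srq(struct ibv_pd *pd)
    {
        struct ibv_srq_init_attr init = {
            .attr = { .max_wr = 32, .max_sge = 1 }  /* small initial size (illustrative) */
        };
        struct ibv_srq *srq = ibv_create_srq(pd, &init);
        if (srq == NULL)
            return NULL;

        /* Fire IBV_EVENT_SRQ_LIMIT_REACHED when fewer than 8 receives remain posted. */
        struct ibv_srq_attr attr = { .srq_limit = 8 };
        ibv_modify_srq(srq, &attr, IBV_SRQ_LIMIT);
        return srq;
    }

    /* On the SRQ limit event, grow the SRQ (needs SRQ resize support in the HCA). */
    static void grow_srq_on_limit(struct ibv_context *ctx, struct ibv_srq *srq,
                                  unsigned *cur_max_wr)
    {
        struct ibv_async_event ev;
        if (ibv_get_async_event(ctx, &ev) != 0)
            return;
        if (ev.event_type == IBV_EVENT_SRQ_LIMIT_REACHED) {
            struct ibv_srq_attr attr = { .max_wr = *cur_max_wr * 2 };  /* double it */
            if (ibv_modify_srq(srq, &attr, IBV_SRQ_MAX_WR) == 0)
                *cur_max_wr *= 2;
            attr.srq_limit = 8;                        /* re-arm for the next step */
            ibv_modify_srq(srq, &attr, IBV_SRQ_LIMIT);
        }
        ibv_ack_async_event(&ev);
    }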

The XRC protocol seems to create shared receive queues, which is a good thing. However, comparing memory used by an "X" queue versus an "S" queue, we can see a large difference. Digging a bit into the code, we found some
So, do you see that X consumes more than S? This is really odd.
Yes, but that's what we see. At least after MPI_Init.

strange things, like the completion queue size for "X" queues not being the same as for "S" queues (the patch below would fix it, but the root of the problem may be elsewhere).

Is anyone able to comment on this ?
The fix looks ok, please submit it to trunk.
I don't have an account to do this, so I'll let maintainers push it into SVN.
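
For those following the CQ sizing issue, here is a toy sketch of the idea behind the fix, with hypothetical structure and field names; it is neither the openib BTL code nor the patch itself. The point is simply that an "X" (XRC) queue should derive its completion queue size from the same per-queue parameters as an "S" (SRQ) queue, rather than scaling it the way a per-peer queue would:

    #include <stdint.h>

    enum rq_type { RQ_PER_PEER, RQ_SRQ, RQ_XRC };

    struct rq_params {         /* hypothetical names */
        enum rq_type type;
        uint32_t     rd_num;   /* receive descriptors posted on the queue */
        uint32_t     sd_max;   /* sends allowed to be outstanding on the queue */
    };

    static uint32_t cq_size_for(const struct rq_params *q, uint32_t num_peers)
    {
        switch (q->type) {
        case RQ_PER_PEER:
            /* per-peer queues scale with the number of connected peers */
            return (q->rd_num + q->sd_max) * num_peers;
        case RQ_SRQ:
        case RQ_XRC:
            /* shared queues hold one set of buffers for all peers, so the
             * CQ should not scale with num_peers -- and "X" should match "S" */
            return q->rd_num + q->sd_max;
        }
        return 0;
    }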

BTW do you want to prepare the patch for send queue size factor ? It should be quite simple.
Maybe we can do this. However, we are playing a little with parameters and code without really knowing the deep consequences of what we do. Therefore, I would feel more comfortable if someone who knows the openib btl well could confirm it's not breaking everything.

Sylvain
