Patrick Geoffray wrote:

Eugene Loh wrote:

Possibly, you meant to ask how one does directed polling with a wildcard source MPI_ANY_SOURCE. If that was your question, the answer is we punt. We report failure to the ULP, which reverts to the standard code path.

Sorry, I meant ANY_SOURCE. If you poll only the queue that corresponds to a posted receive, you only optimize micro-benchmarks, until they start using ANY_SOURCE.

Right.

So, is recvi() a one-time shot? I.e., do you poll the right queue only once and, if that fails, fall back on polling all queues?

You poll it "some". The BTL is granted some leeway in what "immediately" means.
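For illustration only, here is a minimal C sketch of what bounded directed polling with a fallback could look like. It is not the actual BTL code: the names recv_immediate, recv_any, peer_queue, ANY_SOURCE, and the max_polls bound are all hypothetical stand-ins, and the queues are toy in-process ring buffers rather than shared-memory FIFOs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ANY_SOURCE (-1)   /* hypothetical stand-in for MPI_ANY_SOURCE */
#define NUM_PEERS 4
#define QUEUE_CAP 8

/* Hypothetical per-sender ring buffer; head == tail means empty. */
typedef struct {
    int items[QUEUE_CAP];
    size_t head, tail;
} peer_queue;

static peer_queue queues[NUM_PEERS];

/* Append a value; returns false if the ring is full. */
static bool queue_push(peer_queue *q, int v) {
    size_t next = (q->tail + 1) % QUEUE_CAP;
    if (next == q->head) return false;
    q->items[q->tail] = v;
    q->tail = next;
    return true;
}

/* Remove the oldest value; returns false if the ring is empty. */
static bool queue_pop(peer_queue *q, int *out) {
    if (q->head == q->tail) return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    return true;
}

/* Directed poll: look only at the expected sender's queue, at most
 * max_polls times. Returns false ("punt") for a wildcard source or
 * when nothing arrived, so the caller can take the standard path. */
static bool recv_immediate(int src, int max_polls, int *out) {
    if (src == ANY_SOURCE) return false;
    assert(src >= 0 && src < NUM_PEERS);
    for (int i = 0; i < max_polls; i++)
        if (queue_pop(&queues[src], out)) return true;
    return false;
}

/* Standard path: scan every queue once and report the sender. */
static bool recv_any(int *out, int *from) {
    for (int p = 0; p < NUM_PEERS; p++)
        if (queue_pop(&queues[p], out)) { *from = p; return true; }
    return false;
}
```

A caller would try recv_immediate() first and call recv_any() only when it returns false. How large max_polls should be is exactly the trade-off being debated here.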

If yes, then it's unobtrusive but I don't think it would help much.

Well, check the RFC.  The data shows huge improvements in HPCC latency.

If you poll the right queue many times, then you have to decide when to fall back on polling all queues, and it's not trivial.

It's not 100% satisfactory, but clearly OMPI (and every other MPI implementation and just about any major piece of HPC software) is trying to guess among all sorts of trade-offs. Many of those trade-offs are user tunable -- hence, those pages and pages of compiler options (pick your favorite compiler), build flags, MCA parameters, etc.

How do you ensure you check all incoming queues from time to time to prevent flow control from kicking in (especially if the queues are kept small for scaling)?

There are a variety of choices here. Further, I'm afraid we ultimately have to expose some of those choices to the user (MCA parameters or something).
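One of those choices, sketched in C purely as an assumption and not as anything in the RFC: poll the expected queue most of the time, but sweep every queue on each N-th call so that no sender's queue goes unserviced indefinitely. The poller struct, the sweep_interval knob, and poll_step are hypothetical names, and pending message counts stand in for real queues.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PEERS 4

/* Hypothetical poller state: pending[p] counts undelivered messages
 * from peer p; sweep_interval is an illustrative tunable. */
typedef struct {
    int pending[NUM_PEERS];
    int sweep_interval;      /* do a full sweep on every N-th poll */
    int polls_since_sweep;
} poller;

/* Drain one message from peer p, if any is waiting. */
static bool poll_one(poller *pl, int p) {
    if (pl->pending[p] == 0) return false;
    pl->pending[p]--;
    return true;
}

/* Poll only the expected peer, except that every sweep_interval-th
 * call visits all queues instead. Returns the messages drained. */
static int poll_step(poller *pl, int expected) {
    assert(expected >= 0 && expected < NUM_PEERS);
    assert(pl->sweep_interval > 0);
    if (++pl->polls_since_sweep < pl->sweep_interval)
        return poll_one(pl, expected) ? 1 : 0;
    pl->polls_since_sweep = 0;
    int drained = 0;
    for (int p = 0; p < NUM_PEERS; p++)
        if (poll_one(pl, p)) drained++;
    return drained;
}
```

A small sweep_interval bounds how long an unexpected sender can wait but erodes the latency benefit of directed polling; a large one does the opposite. That is the kind of knob that would end up exposed as a parameter.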

In the vast majority of cases, users don't know how to turn the knobs.

Totally agree. Exposing these choices to the users is ugly and expecting users to make such choices is ridiculous. Though, for what it's worth:

% ompi_info -a | wc -l
1037
%

I actually agree with you a lot. I do think that my RFC represents one step forward. I'll see how quickly I can prototype and characterize a single-queue solution so we can judge alternatives more diligently.
