On Apr 30, 2009, at 7:20 PM, Aaron Fabbri (aafabbri) wrote:

Yes, MPI_ALLOC_MEM / MPI_FREE_MEM calls have been around for
a long time (~10 years?).  Using them does avoid many of the
problems that have been discussed.  Most (all?) MPIs either
support ALLOC_MEM / FREE_MEM by registering at allocation
time and unregistering at free time, or some variation of that.
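
For reference, here is roughly what using those calls looks like in C -- a minimal sketch, with the buffer size and the MPI_INFO_NULL argument chosen purely for illustration:

/* Minimal sketch: allocating a buffer with MPI_Alloc_mem instead of
   malloc(), so the MPI library has the chance to register it with the
   NIC at allocation time and unregister it at free time.  The 1 MiB
   size is arbitrary. */
#include <mpi.h>

int main(int argc, char **argv)
{
    void *buf;
    MPI_Aint size = 1 << 20;                  /* 1 MiB, arbitrary */

    MPI_Init(&argc, &argv);

    MPI_Alloc_mem(size, MPI_INFO_NULL, &buf); /* may register here */

    /* ... use buf as the buffer argument to MPI_Send/MPI_Recv/etc. ... */

    MPI_Free_mem(buf);                        /* may unregister here */

    MPI_Finalize();
    return 0;
}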

Ah. Are there any problems that are not addressed by having MPI own allocation of network bufs?

Sure, there's lots of them. :-) But this thread is just about the memory allocation management issues.

(BTW registering for each allocation could be improved, I think.)

Probably so. Since so few MPI applications use these calls, OMPI hasn't really bothered to tune them.
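
To make "could be improved" a bit more concrete: the usual approach is a registration cache sitting in front of the verbs registration calls. A rough, purely illustrative sketch (not Open MPI's actual code):

/* Purely illustrative sketch (not Open MPI's code): cache verbs memory
   registrations so repeated transfers from the same buffer don't pay
   ibv_reg_mr() every time.  A real cache matches address ranges, handles
   eviction, and -- the hard part this thread is about -- has to notice
   when the application frees or munmaps the memory underneath it. */
#include <infiniband/verbs.h>
#include <stddef.h>

#define CACHE_SLOTS 64   /* tiny fixed-size table, just to keep this short */

struct reg_entry { void *addr; size_t len; struct ibv_mr *mr; };
static struct reg_entry cache[CACHE_SLOTS];

struct ibv_mr *get_registration(struct ibv_pd *pd, void *addr, size_t len)
{
    int i, free_slot = -1;
    struct ibv_mr *mr;

    for (i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].mr == NULL) {
            if (free_slot < 0)
                free_slot = i;
        } else if (cache[i].addr == addr && cache[i].len >= len) {
            return cache[i].mr;              /* hit: reuse the registration */
        }
    }

    mr = ibv_reg_mr(pd, addr, len, IBV_ACCESS_LOCAL_WRITE);
    if (mr != NULL && free_slot >= 0) {      /* miss: register and remember */
        cache[free_slot].addr = addr;
        cache[free_slot].len  = len;
        cache[free_slot].mr   = mr;
    }
    return mr;
}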

But unfortunately, very few MPI apps use these calls; they use
malloc() and friends instead.  Or they're written in Fortran,
where such concepts are not easily mapped (don't
underestimate how much Fortran MPI code runs on verbs!).
Indeed, in some layered scenarios, it's not easy to use these
calls (e.g., if an MPI-enabled computational library may
re-use user-provided buffers because they're so large, etc.).

I understand the difficulty.  A couple possible counterpoints:

1. Make the next version of the MPI spec *require* using the MPI_ALLOC_MEM
stuff.

The MPI Forum (the standards body) has been very resistant to this, especially based on the requirements of one non-pervasive network stack. It would effectively break all legacy MPI applications, too. I seriously doubt that the Forum would go for that.

FWIW: the way the MPI spec is worded, it says that you *may* get performance benefit from using MPI_ALLOC_MEM. E.g., an MPI can always support using malloc buffers -- just copy into network-special buffers. The performance would be terrible :-), but it would be correct.
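
To illustrate that "just copy" fallback, here is a conceptual sketch only -- the names are hypothetical stand-ins for an implementation's internals, not anything in a real MPI:

/* Conceptual sketch of the "always correct, just slow" fallback: stage the
   user's malloc()ed data through a buffer the library registered once at
   startup.  bounce_buf and post_send_from_bounce() are hypothetical names
   standing in for an implementation's internals; assume the post blocks
   until bounce_buf can be reused. */
#include <string.h>
#include <stddef.h>

#define BOUNCE_LEN (64 * 1024)       /* illustrative staging-buffer size */

static char bounce_buf[BOUNCE_LEN];  /* assume: registered at MPI_Init time */

extern void post_send_from_bounce(size_t len);   /* hypothetical */

void send_copy_in(const void *user_buf, size_t len)
{
    size_t off = 0;

    while (off < len) {              /* copy and send, one chunk at a time */
        size_t chunk = (len - off > BOUNCE_LEN) ? BOUNCE_LEN : (len - off);

        memcpy(bounce_buf, (const char *)user_buf + off, chunk);
        post_send_from_bounce(chunk);        /* correct, but an extra copy */
        off += chunk;
    }
}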

2. MPI already requires recompilation of apps, right?  I don't know
Fortran, or what it uses for allocation, but worst case, maybe you could
change the standard libraries or compilers.

We tried that -- interposing our own copies of malloc, free, mmap, ... etc. (e.g., inside libmpi). Ick. Horrible, horrible ick. And it definitely breaks some real-world apps and memory-checking debuggers/tools.
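
For the curious, the general shape of that interposition is something like the sketch below (not Open MPI's actual implementation; evict_registrations() is a hypothetical hook into a registration cache like the one sketched earlier):

/* Sketch of the general shape of that interposition (and part of why it's
   "ick"): override free() and munmap() inside the library so cached NIC
   registrations can be evicted before the memory really goes away.  Uses
   the dlsym(RTLD_NEXT, ...) trick; evict_registrations() is a hypothetical
   hook.  A real implementation also has to worry about re-entrancy,
   realloc, sbrk, threads, static linking, ... */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>
#include <sys/mman.h>

extern void evict_registrations(void *addr, size_t len);   /* hypothetical */

void free(void *ptr)
{
    static void (*real_free)(void *);
    if (real_free == NULL)
        real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");

    if (ptr != NULL)
        evict_registrations(ptr, 0);   /* 0 = "whatever covers this address" */
    real_free(ptr);
}

int munmap(void *addr, size_t len)
{
    static int (*real_munmap)(void *, size_t);
    if (real_munmap == NULL)
        real_munmap = (int (*)(void *, size_t))dlsym(RTLD_NEXT, "munmap");

    evict_registrations(addr, len);
    return real_munmap(addr, len);
}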

3. Rip out your registration cache.  Make malloc'd buffers go really
slow (register in fast path) and mpi_alloc_mem() buffers go really
fast.  People will migrate.  The hard part of this would be getting
all MPIs to agree on this, I'm guessing.


See http://lists.openfabrics.org/pipermail/general/2009-May/059376.html -- Open MPI effectively tried this and got beat up by a) competing MPIs, and b) the marketing supporting Open MPI. :-\

People won't migrate, nor will main-line MPI benchmarks. Customers want top performance out-of-the-box with their MPI (which is not unreasonable). Users have used malloc() for 10+ years, and other networks don't require the use of MPI_ALLOC_MEM.

--
Jeff Squyres
Cisco Systems
