[EMAIL PROTECTED] wrote on Thu, 17 Aug 2006 18:14 -0500:
> * BMI memory allocation.  Do we place any restrictions on when or how  
> frequently BMI_memalloc is called?  In the pvfs code, we always call  
> BMI_memalloc for a post_send or post_recv.  Would it be possible to  
> avoid the malloc on the client for a write and just use the user  
> buffer?  Or should we mandate that calls to post_send and post_recv  
> always pass in a pointer from BMI_memalloc?  (as a side note, if we  
> make that mandate, maybe we should have a BMI_buffer type that  
> memalloc returns and post_send/post_recv accept).

Both bmi_ib and bmi_gm define the BMI memalloc method to do
something other than simply malloc().  In the IB case, it pins the
memory early, and never unpins it until the corresponding
BMI_memfree() happens.  This is better than letting BMI do the
pinning explicitly, as it moves some of the messaging work out of
the critical path, if you can arrange to alloc/free before you do
send/recv.

Note that these alloc routines only do something special if the
buffer is big enough to be "worth it" (8 kB for IB).
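
As a rough illustration of that size cutoff, here is a minimal sketch in C.
It is not the bmi_ib code. Every name in it (sketch_buf, sketch_memalloc,
sketch_memfree, sketch_pin, sketch_unpin, SKETCH_PIN_THRESHOLD) is
hypothetical, the pin/unpin functions are empty stand-ins for the real NIC
registration calls, and treating a buffer of exactly 8 kB as "big enough"
is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cutoff; the text above cites 8 kB for IB. */
#define SKETCH_PIN_THRESHOLD ((size_t)8 * 1024)

/* Hypothetical buffer descriptor; not a real BMI type. */
struct sketch_buf {
    void  *ptr;
    size_t len;
    int    pinned;   /* 1 if the memory was registered at allocation time */
};

/* Stand-ins for NIC registration; a real transport would do work here. */
static int  sketch_pin(void *p, size_t len)   { (void)p; (void)len; return 0; }
static void sketch_unpin(void *p, size_t len) { (void)p; (void)len; }

/* Allocate len bytes; pin early only when the buffer is large enough. */
static int sketch_memalloc(struct sketch_buf *b, size_t len)
{
    assert(b != NULL && len > 0);
    b->ptr = malloc(len);
    if (b->ptr == NULL)
        return -1;
    b->len = len;
    b->pinned = 0;
    if (len >= SKETCH_PIN_THRESHOLD && sketch_pin(b->ptr, len) == 0)
        b->pinned = 1;
    return 0;
}

/* The pin is held until the matching free, never released in between. */
static void sketch_memfree(struct sketch_buf *b)
{
    assert(b != NULL);
    if (b->pinned)
        sketch_unpin(b->ptr, b->len);
    free(b->ptr);
    b->ptr = NULL;
    b->len = 0;
    b->pinned = 0;
}
```

With this sketch, a 4 kB request behaves like plain malloc(), while a 64 kB
request is pinned at allocation time and stays pinned until it is freed.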

There are no restrictions on how frequently you can call these things.
Each pinned memory region has some overhead in terms of in-pvfs data
structures, in-kernel data structures, and on-NIC data structures.
Ideally we'd try to limit the growth of these things and force old
entries to be freed, but in practice they mostly just grow and it's
not a big problem (unless you have lots of pvfs apps on a single
box, for instance).

You can certainly avoid the malloc and use the user buffer when you
have one instead.  I think this is the common case for MPI-IO
operations.  Point out what case you're talking about and I'll take
a look.

We definitely cannot mandate that all memory is BMI_memalloc-ed.
Arbitrary MPI_File_write() and similar calls will pass in user buffers.
We don't want to copy them into BMI_memalloc-ed memory, and it's not
really practical to require that application writers use the MPI (or
PVFS) alloc routines.

If the bmi_buffer_type argument to the post_send and post_recv
routines is BMI_PRE_ALLOC, a BMI implementation can avoid pinning
the memory, as does GM.  For IB, it's just as fast to check the
address to see if it has already been pinned, either through
memalloc or implicitly by having been used as a user buffer.
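
The address check can be pictured as a lookup in a table of already
registered regions. The following C sketch is only an assumption about the
general shape of such a check, not the bmi_ib implementation: sketch_cache,
sketch_region, sketch_is_pinned, sketch_ensure_pinned and
SKETCH_MAX_REGIONS are all hypothetical names, the table is a flat array
with a linear scan, and the actual registration call is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_REGIONS 64   /* hypothetical fixed table size */

/* One registered (pinned) address range: [base, base + len). */
struct sketch_region {
    uintptr_t base;
    size_t    len;
};

struct sketch_cache {
    struct sketch_region regions[SKETCH_MAX_REGIONS];
    size_t count;      /* entries in use */
    size_t pin_calls;  /* how many times a new registration was needed */
};

/* Return 1 if [buf, buf + len) lies entirely inside a known region. */
static int sketch_is_pinned(const struct sketch_cache *c,
                            const void *buf, size_t len)
{
    uintptr_t start = (uintptr_t)buf;
    for (size_t i = 0; i < c->count; i++) {
        const struct sketch_region *r = &c->regions[i];
        if (start >= r->base && len <= r->len &&
            start - r->base <= r->len - len)
            return 1;
    }
    return 0;
}

/* Pin on first use, reuse afterwards. Returns -1 if the table is full. */
static int sketch_ensure_pinned(struct sketch_cache *c,
                                const void *buf, size_t len)
{
    assert(c != NULL && buf != NULL);
    if (sketch_is_pinned(c, buf, len))
        return 0;                       /* already registered */
    if (c->count == SKETCH_MAX_REGIONS)
        return -1;
    /* A real transport would register the memory with the NIC here. */
    c->regions[c->count].base = (uintptr_t)buf;
    c->regions[c->count].len  = len;
    c->count++;
    c->pin_calls++;
    return 0;
}
```

In this sketch a user buffer is registered the first time it is seen, and
later sends from any part of it find the existing entry. Nothing is ever
evicted, so the table only grows, which mirrors the behaviour described
earlier in this message.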

                -- Pete
