I think this thread has gotten to the point where people are no
longer reading each post carefully and are therefore rehashing points
that have already been discussed. It has reached the end of its
usefulness.
It was suggested today that a teleconference to discuss these issues
might be much more useful (an hour-long teleconference can save a
week's worth of emails!). This will be a technical call to discuss
memory registration issues; it will not be an EWG call. I've set up a
WebEx call for next Monday at the "normal" time: noon US Eastern, 9am
US Pacific, 7pm Israel. The invite will be coming to the ewg and
general lists shortly.
*** PLEASE USE THE WEBEX URL TO JOIN THE TELECONFERENCE (vs. just
dialing in). When you log on, it will prompt you for a phone number
to call you back; yes, non-US phone numbers are supported.
I will make up a small number of slides that attempt to summarize all
the arguments (on both sides) so far. Hopefully, they can serve as a
starting point for discussion.
Thanks; see you next Monday.
On May 1, 2009, at 1:09 PM, Roland Dreier (rdreier) wrote:
> You mentioned that doing this stuff is a choice; the choice that
> MPIs/ULPs/applications therefore have is:
>
> - don't use registration caches/memory allocation hooking, and have
>   terrible performance
> - use registration caches/memory allocation hooking, and have good
>   performance
I think it's a bit of a stretch to suggest that all or even most
userspace RDMA applications have the same need for registration
caching as MPI. In fact, my feeling is that MPI is the exception: it
must deal with RDMA to arbitrary memory allocated by an application
outside of MPI's control. My most recent experience was with Cisco's
RAB library, and in that case we simply designed the library so that
all RDMA was done to memory allocated by the library -- so there was
no need for a registration cache, and in fact no need for
registration in any fast path. I suspect that the majority of code
written to use RDMA natively will be designed with similar properties.
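As a rough illustration of that design (a sketch only; the buf_pool
names below are made up for illustration, not RAB's actual API), the
library can allocate and register its memory once at initialization,
so no registration call ever appears in a fast path:

    #include <stdlib.h>
    #include <infiniband/verbs.h>

    /* All RDMA targets live in one region that the library allocates
     * and registers once up front, so the fast path never calls
     * ibv_reg_mr()/ibv_dereg_mr(). */
    struct buf_pool {
        void          *base;  /* library-owned allocation */
        size_t         size;
        struct ibv_mr *mr;    /* registration covering the whole pool */
    };

    static int buf_pool_init(struct buf_pool *p, struct ibv_pd *pd,
                             size_t size)
    {
        p->base = malloc(size);
        if (!p->base)
            return -1;
        p->size = size;
        p->mr = ibv_reg_mr(pd, p->base, size,
                           IBV_ACCESS_LOCAL_WRITE |
                           IBV_ACCESS_REMOTE_READ |
                           IBV_ACCESS_REMOTE_WRITE);
        if (!p->mr) {
            free(p->base);
            return -1;
        }
        return 0;
    }

Callers then borrow buffers from the pool (copying into them if
needed), trading an extra copy for never touching registration on the
critical path.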
So this proposal is very much an MPI-specific interface, which leads
to my next point. I have no doubt that the MPI community has a very
good idea of what memory registration interface would make MPI
implementations simpler and more robust. However, I don't think
there's quite as much expertise about the best way to implement such
an interface.
My initial reaction is that I don't want to extend the kernel ABI
with a set of new MPI-specific verbs if there's a way around it.
We've been told over and over that the registration cache is complex
and fragile code -- but moving complex and fragile code into the
kernel doesn't magically make it any simpler or more robust; it just
means that bugs now crash the whole system instead of just affecting
one process.
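For reference, the memory allocation hooking in question looks
roughly like this (a sketch using GNU ld's --wrap option, one of
several hooking schemes in use; cache_invalidate() is a hypothetical
stand-in for the MPI's cache bookkeeping):

    #include <stdlib.h>

    /* Hypothetical cache hook, stubbed so the sketch compiles: a real
     * implementation would drop any cached registrations that cover
     * the buffer being freed. */
    static void cache_invalidate(void *addr) { (void)addr; }

    /* Link with -Wl,--wrap=free so every call to free() lands here.
     * This catches free() but not, e.g., munmap() or an sbrk() shrink
     * -- exactly the kind of gap that makes these caches fragile. */
    void __real_free(void *ptr);

    void __wrap_free(void *ptr)
    {
        cache_invalidate(ptr);
        __real_free(ptr);
    }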
Now, of course MMU notifiers allow the kernel to know reliably when a
process's page tables change, which means that all the complicated
malloc hooking etc. is not needed. So that complexity is avoided in
the kernel. But suppose I give userspace the same MMU notifier
capability (e.g. I add a system call like "if any mappings in the
virtual address range X ... Y change, then write a 1 to virtual
address Z") -- then what do I gain from having the rest of the
registration caching in the kernel? (And avoiding the duplication of
caching code between multiple MPI implementations is not an answer --
it's quite feasible to put the caching code into libibverbs if that's
the best place for it.)
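To make that concrete, here is a rough sketch of how a userspace
cache could consume such a notification. mmu_notify_range() below
stands in for the hypothetical system call described above -- it is
invented for illustration and stubbed so the sketch compiles -- and
the single-entry cache is deliberately simplistic:

    #include <stddef.h>
    #include <stdint.h>
    #include <infiniband/verbs.h>

    /* Hypothetical syscall wrapper: ask the kernel to write 1 through
     * *flag if any mapping in [start, start + len) changes.  No such
     * interface exists today; stubbed here as a no-op. */
    static int mmu_notify_range(void *start, size_t len,
                                volatile uint32_t *flag)
    {
        (void)start; (void)len; (void)flag;
        return 0;
    }

    struct cache_entry {
        void              *addr;
        size_t             len;
        struct ibv_mr     *mr;
        volatile uint32_t  invalid;  /* kernel writes 1 on any remap */
    };

    /* Fast path: reuse the cached registration only if the kernel has
     * not flagged the range as changed since it was registered. */
    static struct ibv_mr *cached_reg_mr(struct cache_entry *e,
                                        struct ibv_pd *pd,
                                        void *addr, size_t len)
    {
        if (e->mr && !e->invalid &&
            (char *)addr >= (char *)e->addr &&
            (char *)addr + len <= (char *)e->addr + e->len)
            return e->mr;

        /* Miss or stale entry: re-register and re-arm the watch. */
        if (e->mr)
            ibv_dereg_mr(e->mr);
        e->addr    = addr;
        e->len     = len;
        e->invalid = 0;
        e->mr      = ibv_reg_mr(pd, addr, len, IBV_ACCESS_LOCAL_WRITE);
        if (e->mr)
            mmu_notify_range(addr, len, &e->invalid);
        return e->mr;
    }

Exactly this sort of code could live in libibverbs rather than the
kernel, which is the point of the question above.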
- R.
--
Jeff Squyres
Cisco Systems