On Fri, May 01, 2009 at 07:56:48AM -0400, Jeff Squyres wrote:
> On Apr 30, 2009, at 6:22 PM, Jason Gunthorpe wrote:
>
> >After reading all the postings, I think my idea to fix the verbs API
> >to not, essentially, corrupt an existing registration when the virtual
> >address space changes is the best bet. This slightly changes the
> >semantics of the verbs MR to refer to virtual address space within the
> >process, not the underlying object(s) that happen to be mapped there
> >when the registration is made.
>
> I'm not sure how this helps MPI -- our registration caches will still
> become invalid if the MPI app free()'s registered memory...?

No, they don't. The only reason you have a problem today is that the
memory registration is tied to the underlying *object*, not the virtual
address. So when the app fiddles with things and changes the virtual
address to object mapping, it wrecks your caching.

If instead the registration is tied to a virtual address, then it
doesn't matter what the app does: that virtual address range will
*always* point to the currently mapped objects.

If the app does free() and then malloc() without an intervening kernel
call, then it doesn't matter: your cache of registered VM addresses
still says that the range is available.

If the app does free() resulting in munmap() and then malloc() resulting
in mmap() and re-uses the same address then, again, it doesn't matter to
you because the VM address is still registered by the kernel and is
switched to the new mmap().

The only problem is that over time your cache will hold registrations of
VM that are not in use by the app, or that no longer have backing
objects. This is not a correctness problem, but it might be a
performance problem.

> MPI maintains a registration cache because registration is so
> expensive. Even if the registration cache becomes "safely" invalid
> (e.g., you'll never get a scenario where one virtual address could
> have previously pointed to a different hardware address within the
> span of one process), it doesn't help.

How so? That would seem to close the data corruption hole entirely.
Sure, you still have to call registration functions, but one step at a
time :)

> Ok, I'll back off slightly: if you want verbs to go mainstream, there
> will be many other ULPs / middleware libraries that have memory models
> like MPI's (that the upper layer is responsible for allocating/freeing
> message buffers). Put differently: the TCP/sockets stack doesn't have
> this restriction; it will be extremely difficult to convert legions of
> sockets programmers to verbs if you effectively restrict large
> messages to only be allocated/freed by the network layer (kinda
> defeats the point of RDMA if you have to copy large messages, right?).

Fair enough - but the registration model is pretty much an inevitable
consequence of kernel bypass. If you really want to get rid of it then
you need to have an operating mode where the WRs are generated by the
kernel through syscalls like all the other network stacks. I've not
seen any notion of how to separate the two ideas, at least.

Jason
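
To illustrate the caching argument above, here is a minimal sketch in C
of a registration cache keyed purely on virtual address ranges. It is
only a model: it does not call libibverbs, every name in it (reg_cache,
reg_entry, cache_get_lkey, and so on) is hypothetical, and the
"registration" is simulated by handing out a counter value. It also
assumes the proposed semantics, where a registration follows the virtual
address range; with today's object-based semantics a cache like this
would not be safe without some form of invalidation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 16

/* Hypothetical cache entry: one registered virtual address range. */
struct reg_entry {
    uintptr_t start;   /* first byte of the registered range */
    size_t    len;     /* length of the range in bytes */
    uint32_t  lkey;    /* stand-in for the key a real registration returns */
    int       used;
};

struct reg_cache {
    struct reg_entry slot[CACHE_SLOTS];
    uint32_t next_key;        /* simulated key allocator */
    unsigned registrations;   /* counts the (expensive) registration calls */
};

static void cache_init(struct reg_cache *c)
{
    memset(c, 0, sizeof(*c));
    c->next_key = 1;
}

/* Return the entry that fully covers [addr, addr + len), or NULL. */
static struct reg_entry *cache_lookup(struct reg_cache *c,
                                      uintptr_t addr, size_t len)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &c->slot[i];
        if (!e->used || addr < e->start)
            continue;
        size_t off = (size_t)(addr - e->start);
        /* Written this way to avoid overflow in addr + len. */
        if (off <= e->len && len <= e->len - off)
            return e;
    }
    return NULL;
}

/*
 * Look up a key for the range, registering it on a miss.
 * Returns 0 on success, -1 if the cache is full.
 * The lookup depends only on the virtual address, never on which
 * object is currently mapped there.
 */
static int cache_get_lkey(struct reg_cache *c, uintptr_t addr, size_t len,
                          uint32_t *lkey)
{
    assert(len > 0);

    struct reg_entry *e = cache_lookup(c, addr, len);
    if (e) {
        *lkey = e->lkey;
        return 0;
    }

    for (int i = 0; i < CACHE_SLOTS; i++) {
        e = &c->slot[i];
        if (!e->used) {
            e->start = addr;
            e->len = len;
            e->lkey = c->next_key++;   /* simulated registration */
            e->used = 1;
            c->registrations++;
            *lkey = e->lkey;
            return 0;
        }
    }
    return -1;
}
```

In this model, a second request that falls inside an already registered
range gets the same key back without another registration, whatever the
application did with free() and malloc() in between. The cost described
above also shows up: entries are never evicted, so stale ranges
accumulate until the table is full.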
