On 05/06/2013 15:45, Jeff Squyres (jsquyres) wrote:
> On Jun 5, 2013, at 12:14 AM, Haggai Eran <hagg...@mellanox.com> wrote:
>
>>> Hmm; I'm confused. How does this fix the
>>> MPI-needs-to-intercept-freed-memory problem?
>> Well, there is no problem if an application frees registered memory (in
>> an on-demand paging memory region) and that memory is returned to the
>> OS. The OS will invalidate these pages, and the HCA will no longer be
>> able to use them. This means that the registration cache doesn't have to
>> de-register memory immediately when it is freed.
>
>
> (must... resist... urge... to... throw... furniture...)

(ducking and taking cover :-) )
>
> This is why features should not be introduced to solve MPI problems without
> an understanding of what the MPI problems are. :-) Please go talk to the
> Mellanox MPI team.
>
> Forgive me for being frustrated; memory registration and all the pain that it
> entails was highlighted as ***the #1 problem*** by *5 major MPI
> implementations* at the Sonoma 2009 workshop (see
> https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/301-mpi-update-and-requirements-panel-all-presentations.html,
> starting at slide 7 in the "openmpi" slide deck).

Perhaps I'm missing something, but I believe ODP deals with the first
two problems in the list (slide 8), even if it doesn't solve them
completely. You no longer need to do dangerous tricks to catch free,
munmap, or sbrk. As I explained above, these operations can work on an
ODP MR without allowing the HCA to use the invalidated mappings.

In the future we want to implement an implicit memory region covering
the entire process address space, thus eliminating the need for memory
registration almost completely (you might still want memory
registration, or memory windows, in order to control permissions of
remote operations).

We can also allow fork to work with our implementation. Copy-on-write
will work with ODP regions by invalidating the HCA's page tables before
the pages are made read-only. A page fault from the HCA can then
refill the pages, or even break COW in the case of a write.

> Why don't we have something like ummunotify yet?

I think that the problem we are trying to solve is better handled inside
the kernel. If you are going to change the HCA's memory mappings, you'd
have to go through the kernel anyway.

Haggai