Re: Status of ummunot branch?

2013-06-14 Thread Jeff Squyres (jsquyres)
On Jun 12, 2013, at 5:17 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: Yes, it can, via MAP_FIXED. There are lots of fun tricks you can play using that. You're missing the point. Normal users (i.e., MPI users) don't do that. They call malloc() and they get what they get. The

Re: Status of ummunot branch?

2013-06-14 Thread Jason Gunthorpe
On Fri, Jun 14, 2013 at 10:53:24PM +, Jeff Squyres (jsquyres) wrote: On Jun 12, 2013, at 5:47 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: Someone has to finish the ummunotify rewrite Roland started. Realistically MPI is going to be the only user, can someone from the

Re: Status of ummunot branch?

2013-06-12 Thread Jeff Squyres (jsquyres)
On Jun 10, 2013, at 11:56 AM, Liran Liss lir...@mellanox.com wrote: Register all address space is the moral equivalent of not having userspace registration, so let's talk about it in those terms. Specifically, there's a subtle difference between: a) telling verbs to register (0...2^64)

Re: Status of ummunot branch?

2013-06-12 Thread Jason Gunthorpe
On Wed, Jun 12, 2013 at 09:10:57PM +, Jeff Squyres (jsquyres) wrote: Another way to look at it is specify IO access permissions for address space ranges. This could be useful to implement a buffer pool to be used for a specific MR only, yet still map/unmap memory within this pool on

Re: Status of ummunot branch?

2013-06-12 Thread Jeff Squyres (jsquyres)
On Jun 10, 2013, at 1:26 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: I agree that pushing all registration issues out of the application and (somewhere) into the verbs stack would be a nice solution. Well, it creates a mess in another sense, because now you've lost context.

Re: Status of ummunot branch?

2013-06-12 Thread Jason Gunthorpe
On Wed, Jun 12, 2013 at 09:18:34PM +, Jeff Squyres (jsquyres) wrote: Well, it creates a mess in another sense, because now you've lost context. When your MPI goes to do a 1byte send the kernel may well prefetch a few megabytes of page tables, whereas an implementation in userspace

RE: Status of ummunot branch?

2013-06-10 Thread Liran Liss
; linux-rdma@vger.kernel.org; Shachar Raindel Subject: Re: Status of ummunot branch? On Fri, Jun 07, 2013 at 10:59:43PM +, Jeff Squyres (jsquyres) wrote: I don't think this covers other memory regions, like those added via mmap, right? We talked about this at the MPI Forum this week

Re: Status of ummunot branch?

2013-06-10 Thread Jeff Squyres (jsquyres)
On Jun 7, 2013, at 4:57 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: We talked about this at the MPI Forum this week; it doesn't seem like ODP fixes any MPI problems. ODP without 'register all address space' changes the nature of the problem, and fixes only one problem. I

RE: Status of ummunot branch?

2013-06-10 Thread Liran Liss
: Status of ummunot branch? On Jun 7, 2013, at 4:57 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: We talked about this at the MPI Forum this week; it doesn't seem like ODP fixes any MPI problems. ODP without 'register all address space' changes the nature of the problem

Re: Status of ummunot branch?

2013-06-10 Thread Jason Gunthorpe
On Mon, Jun 10, 2013 at 02:49:24PM +, Jeff Squyres (jsquyres) wrote: On Jun 7, 2013, at 4:57 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: We talked about this at the MPI Forum this week; it doesn't seem like ODP fixes any MPI problems. ODP without 'register all

Re: Status of ummunot branch?

2013-06-07 Thread Jeff Squyres (jsquyres)
On Jun 6, 2013, at 4:33 PM, Jeff Squyres (jsquyres) jsquy...@cisco.com wrote: I don't think this covers other memory regions, like those added via mmap, right? We talked about this at the MPI Forum this week; it doesn't seem like ODP fixes any MPI problems. 1. MPI still has to have a

Re: Status of ummunot branch?

2013-06-07 Thread Jason Gunthorpe
On Fri, Jun 07, 2013 at 10:59:43PM +, Jeff Squyres (jsquyres) wrote: I don't think this covers other memory regions, like those added via mmap, right? We talked about this at the MPI Forum this week; it doesn't seem like ODP fixes any MPI problems. ODP without 'register all address

Re: Status of ummunot branch?

2013-06-06 Thread Jeff Squyres (jsquyres)
On Jun 5, 2013, at 10:52 PM, Haggai Eran hagg...@mellanox.com wrote: Haggai: A verb to resize a registration would probably be a helpful step. MPI could maintain one registration that covers the sbrk region and one registration that covers the heap, much easier than searching tables and

Re: Status of ummunot branch?

2013-06-05 Thread Haggai Eran
On 04/06/2013 20:04, Jason Gunthorpe wrote: Thus, I assume, on-demand allows pages that are 'absent' in the larger page table to generate faults to the CPU? Yes, that's correct. So how does lifetime work here? - Can you populate the larger page table as soon as registration happens,

Re: Status of ummunot branch?

2013-06-05 Thread Haggai Eran
On 04/06/2013 23:13, Jeff Squyres (jsquyres) wrote: On Jun 4, 2013, at 4:50 AM, Haggai Eran hagg...@mellanox.com wrote: Does this mean that an MPI implementation still has to register memory upon usage, and maintain its own registered memory cache? Yes. However, since registration doesn't

Re: Status of ummunot branch?

2013-06-05 Thread Jeff Squyres (jsquyres)
On Jun 5, 2013, at 12:14 AM, Haggai Eran hagg...@mellanox.com wrote: Hmm; I'm confused. How does this fix the MPI-needs-to-intercept-freed-memory problem? Well, there is no problem if an application frees registered memory (in an on-demand paging memory region) and that memory is returned

Re: Status of ummunot branch?

2013-06-05 Thread Haggai Eran
On 05/06/2013 15:45, Jeff Squyres (jsquyres) wrote: On Jun 5, 2013, at 12:14 AM, Haggai Eran hagg...@mellanox.com wrote: Hmm; I'm confused. How does this fix the MPI-needs-to-intercept-freed-memory problem? Well, there is no problem if an application frees registered memory (in an

Re: Status of ummunot branch?

2013-06-05 Thread Jeff Squyres (jsquyres)
On Jun 5, 2013, at 6:39 AM, Haggai Eran hagg...@mellanox.com wrote: Perhaps I'm missing something, but I believe ODP deals with the first two problems in the list (slide 8), even if it doesn't solve them completely. Unfortunately, it does not. If we could register(0 ... 2^64) and never have

Re: Status of ummunot branch?

2013-06-05 Thread Jason Gunthorpe
On Wed, Jun 05, 2013 at 04:53:48PM +, Jeff Squyres (jsquyres) wrote: On Jun 5, 2013, at 6:39 AM, Haggai Eran hagg...@mellanox.com wrote: Perhaps I'm missing something, but I believe ODP deals with the first two problems in the list (slide 8), even if it doesn't solve them completely.

Re: Status of ummunot branch?

2013-06-05 Thread Jeff Squyres (jsquyres)
On Jun 5, 2013, at 10:14 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: a = malloc(x);// a gets (va=0x100, pa=0x12345) back from malloc MPI_Send(a, ...); // MPI registers 0x100 for len=x, and saves (0x100,x) in reg cache free(a); a = malloc(x);// a gets (va=0x100,

Re: Status of ummunot branch?

2013-06-05 Thread Jason Gunthorpe
On Wed, Jun 05, 2013 at 06:10:11PM +, Jeff Squyres (jsquyres) wrote: On Jun 5, 2013, at 10:14 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: a = malloc(x);// a gets (va=0x100, pa=0x12345) back from malloc MPI_Send(a, ...); // MPI registers 0x100 for len=x, and saves

Re: Status of ummunot branch?

2013-06-05 Thread Jeff Squyres (jsquyres)
On Jun 5, 2013, at 11:18 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: Are you saying that the 2nd malloc will magically be registered (with the new physical address)? Yes, that is the whole point. Interesting. ODP fundamentally fixes the *bug* where the HCA's view of

Re: Status of ummunot branch?

2013-06-05 Thread Jason Gunthorpe
On Wed, Jun 05, 2013 at 06:45:13PM +, Jeff Squyres (jsquyres) wrote: Hum. I was under the impression that with today's code (i.e., not ODP), if you a = malloc(N); ibv_reg_mr(..., a, N, ...); free(a); (assuming that the memory actually left the process at free) Then the relevant

Re: Status of ummunot branch?

2013-06-05 Thread Jeff Squyres (jsquyres)
On Jun 5, 2013, at 12:05 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: It does seem quite odd, abstractly speaking, that a registration would survive a free/re-malloc (which is arguably a different buffer). Not at all: the purpose of the registration is to allow access via

Re: Status of ummunot branch?

2013-06-05 Thread Haggai Eran
On 05/06/2013 22:05, Jason Gunthorpe wrote: On Wed, Jun 05, 2013 at 06:45:13PM +, Jeff Squyres (jsquyres) wrote: What happens if you: a = malloc(N * page_size); ibv_reg_mr(..., a, N * page_size, ...); free(a); // incoming RDMA arrives targeted at buffer a Haggai should comment on

Re: Status of ummunot branch?

2013-06-04 Thread Or Gerlitz
On 04/06/2013 04:24, Jeff Squyres (jsquyres) wrote: On May 29, 2013, at 1:53 AM, Or Gerlitz or.gerl...@gmail.com wrote: Have you looked on ODP? see https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html Is the

Re: Status of ummunot branch?

2013-06-04 Thread Haggai Eran
On 04/06/2013 11:37, Or Gerlitz wrote: On 04/06/2013 04:24, Jeff Squyres (jsquyres) wrote: On May 29, 2013, at 1:53 AM, Or Gerlitz or.gerl...@gmail.com wrote: Have you looked on ODP? see

Re: Status of ummunot branch?

2013-06-04 Thread Jeff Squyres (jsquyres)
On Jun 4, 2013, at 2:54 AM, Haggai Eran hagg...@mellanox.com wrote: We wish to get there eventually. In our current implementation you still have to register an on-demand memory region explicitly. The difference between a regular memory region is that the pages in the region aren't pinned.

Re: Status of ummunot branch?

2013-06-04 Thread Haggai Eran
On 04/06/2013 13:56, Jeff Squyres (jsquyres) wrote: On Jun 4, 2013, at 2:54 AM, Haggai Eran hagg...@mellanox.com wrote: We wish to get there eventually. In our current implementation you still have to register an on-demand memory region explicitly. The difference between a regular memory

Re: Status of ummunot branch?

2013-06-04 Thread Jason Gunthorpe
On Tue, Jun 04, 2013 at 02:50:33PM +0300, Haggai Eran wrote: Our HCAs use their own page tables, in addition to a TLB cache. A miss in the TLB cache that can be filled from the HCA's page tables will not cause an RNR NAK, since the HCA can fill it relatively fast without the help of the

Re: Status of ummunot branch?

2013-06-04 Thread Jeff Squyres (jsquyres)
On Jun 4, 2013, at 4:50 AM, Haggai Eran hagg...@mellanox.com wrote: Does this mean that an MPI implementation still has to register memory upon usage, and maintain its own registered memory cache? Yes. However, since registration doesn't pin memory, you can leave registered memory regions in

Re: Status of ummunot branch?

2013-06-03 Thread Jeff Squyres (jsquyres)
On May 29, 2013, at 1:53 AM, Or Gerlitz or.gerl...@gmail.com wrote: Have you looked on ODP? see https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html Is the idea behind ODP that, at the beginning of time, you

Re: Status of ummunot branch?

2013-05-30 Thread Jeff Squyres (jsquyres)
On May 30, 2013, at 1:09 AM, Or Gerlitz ogerl...@mellanox.com wrote: Has this been run by the MPI implementor community? The team that works on this here isn't ready for submission, so community runs were not made yet If this is a solution to an MPI problem, it would seem like a good idea

Re: Status of ummunot branch?

2013-05-29 Thread Or Gerlitz
On Tue, May 28, 2013 at 8:51 PM, Jeff Squyres (jsquyres) jsquy...@cisco.com wrote: I ask because, as an MPI guy, I would *love* to see this stuff integrated into the kernel and libibverbs. Hi Jeff, Have you looked on ODP? see

Re: Status of ummunot branch?

2013-05-29 Thread Jeff Squyres (jsquyres)
On May 29, 2013, at 4:53 AM, Or Gerlitz or.gerl...@gmail.com wrote: Have you looked on ODP? see https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html Is this upstream? Has this been run by the MPI implementor

Re: Status of ummunot branch?

2013-05-29 Thread Or Gerlitz
On 30/05/2013 01:56, Jeff Squyres (jsquyres) wrote: On May 29, 2013, at 4:53 AM, Or Gerlitz or.gerl...@gmail.com wrote: Have you looked on ODP? see https://www.openfabrics.org/resources/document-downloads/presentations/doc_download/568-on-demand-paging-for-user-space-networking.html Is this

Re: Status of ummunot branch?

2013-05-28 Thread Roland Dreier
On Tue, May 28, 2013 at 10:51 AM, Jeff Squyres (jsquyres) jsquy...@cisco.com wrote: I see a ummunot branch on your kernel tree at git.kernel.org (https://git.kernel.org/cgit/linux/kernel/git/roland/infiniband.git/log/?h=ummunot). Just curious -- what's the status of this tree? I ask because,

Re: Status of ummunot branch?

2013-05-28 Thread Jeff Squyres (jsquyres)
On May 28, 2013, at 1:52 PM, Roland Dreier rol...@purestorage.com wrote: Haven't touched it in quite a while except to keep it building. Needs work to finish up. What kinds of things still need to be done? (I don't know if we could work on this or not; just asking to scope out what would