On May 18, 2009, at 2:02 PM, Caitlin Bestler wrote:

>> > Specifically: the actual dereg of 0x1000-0x3fff is blocked on also
>>  > releasing 0x2000-0x2fff.
>>
>> If everyone is doing this, how do you handle the case that Jason pointed
>> out, namely:
>>
>>  * you register 0x1000 ... 0x3fff
>>  * you want to register 0x2000 ... 0x2fff and have a cache hit
>>  * you finish up with 0x1000 ... 0x3fff
>>  * app does something (which is valid since you finished up with the
>>    bigger range) that invalidates mapping 0x3000 ... 0x3fff (eg free()
>>    that leads to munmap() or whatever), and your hooks tell you so.
>>  * app reallocates a mapping in 0x3000 ... 0x3fff
>> * you want to re-register 0x1000 ... 0x3fff -- but it has to be marked
>>   both invalid and in-use in the cache at this point !?


I think I mis-parsed the above scenario in my previous response.

When our memory hooks tell us that memory is about to be removed from the process, we unregister all pages in the relevant region and remove those entries from the cache. So the next time you look in the cache for 0x3000-0x3fff, it won't be there -- it'll be treated as cache-cold.
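To make the invalidation step concrete, here is a rough sketch of the idea in Python. It is a toy model, not Open MPI's actual code: the class name, the method names, the per-page set, and the 4 KiB page size are all assumptions made for illustration.

```python
PAGE_SIZE = 0x1000  # assumed page size, for illustration only


class RegistrationCache:
    """Toy per-page registration cache (hypothetical, not Open MPI's code)."""

    def __init__(self):
        self.pages = set()  # page numbers currently registered

    def _page_range(self, start, end):
        # Pages touched by the inclusive byte range [start, end].
        return range(start // PAGE_SIZE, end // PAGE_SIZE + 1)

    def register(self, start, end):
        """Register a range; return the pages that were cache-cold."""
        cold = [p for p in self._page_range(start, end) if p not in self.pages]
        self.pages.update(cold)
        return cold

    def on_memory_released(self, start, end):
        """Memory hook: drop every cached page in the released range."""
        for p in self._page_range(start, end):
            self.pages.discard(p)


cache = RegistrationCache()
cache.register(0x1000, 0x3fff)            # pages 1, 2, 3 are now cached
cache.on_memory_released(0x3000, 0x3fff)  # hook fires for the freed page
cold = cache.register(0x1000, 0x3fff)     # only page 3 has to be redone
```

In this model the re-registration of 0x1000-0x3fff finds pages 1 and 2 still cached and treats only page 3 as cold, so no entry is ever both invalid and in use.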

How does 0x1000 to 0x3fff get registered as a single Memory Region?
If it is legitimate to free() 0x3000..0x3fff then how can there ever be a
legitimate reference to 0x1000..0x3fff? If there is no such single reference,
I don't see how a Memory Region is ever created covering that range.

If the user creates the Memory Region, then they are responsible for not
free()ing a portion of it.


Agreed.  If an application does that, it deserves what it gets.

Would the MPI library ever create a single large memory region based on
two distinct Sends?



Per my prior mail, Open MPI registers chunks at a time. Each chunk potentially spans multiple pages. So yes, you could end up having a single registration that spans the buffers used in multiple, distinct MPI sends. We reference count by page to ensure that deregistrations do not occur prematurely.

For example, suppose page X contains the end of one large buffer and the beginning of another, both of which are being used in ongoing non-blocking MPI communications. Then page X's entry in our cache will have a refcount == 2. OMPI won't allow the registration containing that page to become eligible for deregistering until the cache entry's refcount goes down to 0.
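The per-page counting can be sketched roughly as follows. This is an illustration only, not OMPI's implementation: the names, the 4 KiB page size, and the simplification that eligibility is tracked per page rather than per registered chunk are all assumptions.

```python
from collections import Counter

PAGE_SIZE = 0x1000  # assumed page size, for illustration only


class PageRefcounts:
    """Toy per-page reference counts (hypothetical names throughout)."""

    def __init__(self):
        self.refs = Counter()  # page number -> number of buffers in flight

    def _pages(self, addr, length):
        # Pages touched by the buffer [addr, addr + length).
        return range(addr // PAGE_SIZE, (addr + length - 1) // PAGE_SIZE + 1)

    def acquire(self, addr, length):
        """A communication starts using this buffer."""
        for p in self._pages(addr, length):
            self.refs[p] += 1

    def release(self, addr, length):
        """A communication completes; return pages whose refcount hit 0."""
        eligible = []
        for p in self._pages(addr, length):
            self.refs[p] -= 1
            if self.refs[p] == 0:
                del self.refs[p]
                eligible.append(p)
        return eligible


rc = PageRefcounts()
rc.acquire(0x1000, 0x1800)   # buffer A: pages 1 and 2
rc.acquire(0x2800, 0x1800)   # buffer B: pages 2 and 3
shared = rc.refs[2]          # page 2 is "page X", held by both buffers
after_a = rc.release(0x1000, 0x1800)
after_b = rc.release(0x2800, 0x1800)
```

Here page 2 plays the role of page X: it stays pinned after buffer A completes and only becomes eligible once buffer B completes as well.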

See my prior mail for a more complex example of our cache's behavior.

--
Jeff Squyres
Cisco Systems

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
