Caitlin Bestler wrote:

> But the broader question is what the goal is here. Allowing memory to
> be shuffled is valuable, and perhaps even ultimately a requirement for
> high-availability systems. RDMA and other direct-access APIs should
> be evolving their interfaces to accommodate these needs.
>
> Oversubscribing memory is a totally different matter. If an application
> is working with memory that is oversubscribed by a factor of 2 or more,
> can it really benefit from zero-copy direct placement? At first glance I
> can't see what value RDMA could bring when the overhead of
> swapping is going to be that large.


A related use case from HPC.  Some of us have batch scheduling
systems based on suspend/resume of jobs (which is really just
SIGSTOP and SIGCONT of all the job's processes).  The value of this
scheme is greatly enhanced by being able to page out the suspended
job (ordinary Linux demand paging triggered by the incoming job is
sufficient).  Apart from this (relatively) brief period of paging, both
jobs benefit from RDMA.

SGI kindly implemented a /proc mechanism for unpinning XPMEM
pages, which allows suspended jobs to be paged out on their Altix systems.

Note that this use case would not benefit from Pete Wyckoff's
approach of notifying user applications/libraries of VM changes.

And one of the grand goals of HPC developers has always been
checkpoint/restart of jobs ....

David
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
