Hi,

After reading a little the FAQ on the methods used by Open MPI to deal with 
memory registration (or pinning) with Infiniband adapter, it seems that we 
could avoid all the overhead and complexity of memory 
registration/deregistration, registration cache access and update, memory 
management (ummunotify) in addition to allowing a better overlap of the 
communications with the computations (we could let the communication hardware 
do its job independently without resorting to 
registration/transfer/deregistration pipelines) by simply having all user 
process memory registered all the time.

Of course a configuration like that is not appropriate in a general setting 
(ex: a desktop environment) as it would make swapping almost impossible.

But in the context of an HPC node where the processes are not supposed to swap 
and the OS not overcommit memory, not being able to swap doesn't appear to be a 
problem.

Moreover since the maximal total memory used per process is often predefined at 
the application start as a resource specified to the queuing system, the OS 
could easily keep a defined amount of extra memory for its own need instead of 
swapping out user process memory.

I guess that specialized (non-Linux) compute node OS does this.

But is it possible and does it make sense with Linux ?

Thanks,

Martin Audet

Reply via email to