Ralph Castain wrote:
As has frequently been commented upon at one time or another, the
shared memory backing file can be quite huge. There used to be a
param for controlling this size, but I can't find it in 1.3 - or at
least, the name or method for controlling file size has morphed into
something I don't recognize.
Can someone more familiar with that subsystem point me to one or more
params that will allow us to control the size of that file? It is
swamping our systems and causing OMPI to segfault.
Sounds like you've already gotten your answers, but I'll add my $0.02
anyhow.
The file size is the number of local processes (call it n) times
mpool_sm_per_peer_size (default 32M), clamped to a minimum of
mpool_sm_min_size (default 128M) and a maximum of mpool_sm_max_size
(default 2G? 256M?). So, you can tweak those parameters to control
the file size.
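To make that concrete, here is a little sketch (my own, not OMPI code) of the sizing rule as described above: n times the per-peer size, clamped between the min and max. The default values are the ones quoted here and may differ in your build; the max default is uncertain, as noted.

```python
MB = 1024 * 1024

def sm_file_size(n_local_procs,
                 per_peer_size=32 * MB,    # mpool_sm_per_peer_size
                 min_size=128 * MB,        # mpool_sm_min_size
                 max_size=2048 * MB):      # mpool_sm_max_size (default uncertain)
    """Estimated sm backing-file size: n * per-peer, clamped to [min, max]."""
    size = n_local_procs * per_peer_size
    return max(min_size, min(size, max_size))

# 16 local processes: 16 * 32M = 512M, which falls between the bounds.
print(sm_file_size(16) // MB)  # 512
# 2 local processes: 2 * 32M = 64M, bumped up to the 128M minimum.
print(sm_file_size(2) // MB)   # 128
```

So on a big SMP node the n * per-peer term dominates, and lowering mpool_sm_per_peer_size (or mpool_sm_max_size) is what actually shrinks the file.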
Another issue is how small a backing file you can actually get away
with. That is, just forcing the file to be smaller may not be enough,
since your job may no longer run. The backing file seems to be used
mainly by:
*) eager-fragment free lists: We start with enough eager fragments that
we could have two per connection. So, you could bump the sm eager
size (btl_sm_eager_limit, I believe) down if you need to shoehorn a job
into a very small backing file.
*) large-fragment free lists: We start with 8*n large fragments. If
this term plagues you, you can bump the sm chunk size down or reduce the
value of 8 (using btl_sm_free_list_num, I think).
*) FIFOs: The code tries to align a number of things on pagesize
boundaries, so you end up with about 3*n*n*pagesize overhead here. If
this term is causing you problems, you're stuck (unless you modify OMPI).
I'm interested in this subject! :^)