On Thu, 10 Jun 2010, Paul H. Hargrove wrote:
One should not ignore the option of POSIX shared memory: shm_open() and
shm_unlink(). When present this mechanism usually does not suffer from
the small (e.g. 32MB) limits of SysV, and uses a "filename" (in an
abstract namespace) which can portably be up to 14 characters in length.
Because shm_unlink() may be called as soon as the final process has done
its shm_open() one can get approximately the safety of the IPC_RMID
mechanism, but w/o being restricted to Linux.
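
A minimal sketch of that pattern in C, assuming a POSIX system (on older glibc you may need to link with -lrt). The helper names shm_create_map() and shm_attach_map() and the segment name used in the usage note are made up for illustration; they are not taken from Open MPI.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a new POSIX shared memory object of `len`
 * bytes and map it. Returns NULL on failure. */
static void *shm_create_map(const char *name, size_t len)
{
    assert(len > 0);
    /* O_EXCL makes creation fail if a stale object with this name exists. */
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        shm_unlink(name);
        return NULL;
    }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping stays valid after close() */
    if (p == MAP_FAILED) {
        shm_unlink(name);
        return NULL;
    }
    return p;
}

/* Hypothetical helper: attach to an object that another process created. */
static void *shm_attach_map(const char *name, size_t len)
{
    assert(len > 0);
    int fd = shm_open(name, O_RDWR, 0);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}
```

In use, one process would call shm_create_map("/ompi_sm_demo", len), every peer would call shm_attach_map() with the same name, and once the last peer has attached, any one of them calls shm_unlink("/ompi_sm_demo"). The existing mappings remain usable, and the name no longer lingers if the job later crashes.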
I have used POSIX shared memory for another project and found it works
well on Linux, Solaris (10 and Open), FreeBSD and AIX. That is probably
narrower coverage than SysV, but still worth consideration IMHO.
I was just doing research on shm_open() to ensure it had no limitations
before introducing it in this thread. You saved me some time!
With mmap(), SysV and POSIX (plus XPMEM on the SGI Altix) as mechanisms
for sharing memory between processes, I think we have an argument for a
full-blown "shared pages" framework as opposed to just a "mpi_common_sm"
MCA parameter. That brings all the benefits like possibly "failing
over" from one component to another (otherwise less desired) one if some
limit is exceeded. For instance, SysV could (for a given set of
priorities) be used by default, but mmap-on-real-fs could be
automatically selected when the requested/required size exceeds the
shmmax value.
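
To make the failover idea concrete, here is a rough sketch in C of priority-based selection with a size limit. Everything in it (struct sm_component, sm_select(), the fields) is hypothetical and only illustrates the selection logic; it is not the MCA framework API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one shared-memory mechanism. */
struct sm_component {
    const char *name;
    int priority;      /* higher is preferred */
    size_t max_bytes;  /* largest segment it can provide; 0 = no known limit */
};

/* Return the highest-priority component able to satisfy `request` bytes,
 * or NULL if none can. */
static const struct sm_component *
sm_select(const struct sm_component *comps, size_t n, size_t request)
{
    const struct sm_component *best = NULL;
    assert(comps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct sm_component *c = &comps[i];
        if (c->max_bytes != 0 && request > c->max_bytes)
            continue;  /* limit exceeded: fail over to another component */
        if (best == NULL || c->priority > best->priority)
            best = c;
    }
    return best;
}
```

With a table holding "sysv" at priority 30 with max_bytes set to an assumed shmmax of 32MB, and "mmap" at priority 10 with no limit, a 1MB request selects sysv and a 64MB request falls through to mmap.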
That would indeed be nice.
As for why mmap is slower: when the file is on a real filesystem (not tmpfs or
another ramdisk), I am 95% certain that this is an artifact of the Linux
swapper/pager behavior, which thinks it is being smart by "swapping ahead". Even when
there is no memory pressure that requires swapping, Linux starts queuing swap
I/O for pages to keep the number of "clean" pages up when possible. This
results in pages of the shared memory file being written out to the actual
block device. Both the background I/O and the VM metadata updates contribute
to the lost time. I say 95% certain because I have a colleague who looked
into this phenomenon in another setting and I am recounting what he reported
as clearly as I can remember, but might have misunderstood or inserted my own
speculation by accident. A sufficiently motivated investigator (not me)
could probably devise an experiment to verify this.
Interesting. Do you think this behavior of the Linux kernel would change
if the file were unlink()ed after attach?
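
For anyone who wants to try it, the mechanics of "unlink after attach" could look roughly like the C sketch below. The helper name map_then_unlink() is invented for illustration, and the sketch only shows that the mapping survives the unlink(); it says nothing about whether the kernel's writeback behavior actually changes, which would still have to be measured.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create a backing file from a mkstemp() template,
 * map `len` bytes of it MAP_SHARED, then unlink the file. In a real
 * multi-process job the unlink() would happen only after every peer has
 * attached; here a single process does everything. Returns NULL on failure. */
static void *map_then_unlink(char *path_template, size_t len)
{
    assert(len > 0);
    int fd = mkstemp(path_template);  /* fills in the XXXXXX suffix */
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        unlink(path_template);
        return NULL;
    }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    unlink(path_template);  /* the name goes away, the mapping stays usable */
    return p == MAP_FAILED ? NULL : p;
}
```

Timing the same communication loop with and without the unlink(), on a real filesystem and on tmpfs, would be one way to set up the experiment.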
Sylvain