Hi all,
A new run-time test for System V shared memory support has been added
- hopefully addressing all of the concerns voiced earlier in this
thread. If not, please let me know. I've tested the changes on
various Linux systems, OS X, and Solaris 10.
Testing is always appreciated!
http://bitbucket.org/samuelkgutierrez/ompi_sysv_sm
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
On May 5, 2010, at 7:53 AM, Samuel K. Gutierrez wrote:
On May 5, 2010, at 6:10 AM, Jeff Squyres wrote:
On May 4, 2010, at 9:53 AM, Ashley Pittman wrote:
Point noted. But actually -- can you give specific reasons as to
why a user should care? Keep in mind that this would be a short-
lived fork'ed process -- not "spawn" in the MPI sense of the word.
You might be running the job under Valgrind or another debugger;
BLCR has some issues with fork, as I remember, and traditionally
there have been IB mapping issues here as well. I'm sure you
could make a case against any of those points if you wanted to, but
I think the argument stands: doing this kind of run-time check
shouldn't be needed.
Mmm; good points (especially Valgrind). BLCR and OpenFabrics verbs
shouldn't be much of an issue here, but I can see that there might
be unexpectedness if you're running under Valgrind or some other
debugger.
It might be possible, however, to construct the code so that if it
failed to initialise it simply wasn't used, rather than aborting the
job. That would have much the same effect as a run-time test, but
without having to fork new processes and create short-lived shared
memory regions.
That's how most of the network transports are in OMPI today -- if
they fail to init, they are just skipped.
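As a rough illustration of the skip-on-failure behaviour described
above, here is a minimal C sketch. The names transport_t,
select_transports, and the two init stand-ins are hypothetical and are
not Open MPI's actual component interface; the only point is that a
failed init drops a transport from consideration instead of aborting
the job.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical transport descriptor; NOT Open MPI's real component type. */
typedef struct {
    const char *name;
    int (*init)(void);          /* returns 0 on success, nonzero on failure */
} transport_t;

/* Stand-in init routines used only for illustration. */
int init_always_ok(void)    { return 0; }
int init_always_fails(void) { return -1; }

/*
 * Try to initialise every transport in `all`. Transports whose init
 * fails are skipped (with a note on stderr) rather than treated as
 * fatal. Pointers to the usable ones are stored in `usable`, which must
 * have room for `n` entries. Returns the number of usable transports.
 */
size_t select_transports(const transport_t *all, size_t n,
                         const transport_t **usable)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (all[i].init() == 0) {
            usable[count++] = &all[i];
        } else {
            fprintf(stderr, "transport %s failed to init; skipping\n",
                    all[i].name);
        }
    }
    return count;
}
```

With a table such as { {"sysv", init_always_fails}, {"mmap",
init_always_ok} }, only the mmap entry ends up in the usable list and
the caller carries on with it.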
The problem here is that you really need 2 processes to do this
test. I suppose it could be done with local ranks 0 and 1 instead
of forking a new process -- they would just need to communicate via
RML to sync up, I suppose.
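For reference, a two-process check of this kind can be sketched with a
short-lived fork'ed child. This is only a minimal sketch under stated
assumptions, not the test in the repository above: the function name
sysv_shm_usable and the PROBE_MAGIC value are made up for illustration.
The parent creates and attaches a segment, the child attaches to it by
id and writes a value, and the parent checks that the value arrived.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

#define PROBE_MAGIC 0x5359      /* arbitrary value the child writes */

/*
 * Hypothetical helper: returns 1 if a fork'ed child can attach to a
 * System V segment created by the parent and have its write seen by
 * the parent, 0 otherwise. The segment is always removed before return.
 */
int sysv_shm_usable(void)
{
    int usable = 0;

    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | IPC_EXCL | 0600);
    if (shmid < 0) {
        return 0;
    }

    void *base = shmat(shmid, NULL, 0);
    if (base == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return 0;
    }
    volatile int *flag = base;
    *flag = 0;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: attach independently by id, write, detach, exit. */
        void *cbase = shmat(shmid, NULL, 0);
        if (cbase == (void *)-1) {
            _exit(1);
        }
        *(volatile int *)cbase = PROBE_MAGIC;
        shmdt(cbase);
        _exit(0);
    } else if (pid > 0) {
        /* Parent: wait for the child, then look for its write. */
        int status = 0;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
            *flag == PROBE_MAGIC) {
            usable = 1;
        }
    }

    /* Clean up regardless of the outcome. */
    shmdt(base);
    shmctl(shmid, IPC_RMID, NULL);
    return usable;
}
```

Doing the same thing with local ranks 0 and 1 would replace the
fork/waitpid pair with whatever synchronisation the two ranks already
have available.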
I need to think about it a little more, but I like this solution.
Thanks,
--
Samuel K. Gutierrez
Los Alamos National Laboratory
I should of course have said fork where I mentioned spawn above, to
avoid any confusion; spawn has a specific meaning in the context
of MPI.
I still think a better understanding of the issue is required
before any decision is made here, though. I'm surprised by Samuel's
description of the problem because it's not how I remember it, and
from what Chris says it doesn't reflect what is in the Linux Git code
either. I'd like to see why there is an apparent difference in
behaviour before a decision is made to only support one.
There's no intent to only support sysv or mmap. Samuel's work was
to extend OMPI to support sysv in the case where it would be
advantageous (e.g., guaranteed cleanup of the shmem segment). The
mmap stuff is definitely not going to be removed.
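To illustrate what "guaranteed cleanup" can look like with sysv, here
is a small single-process C sketch. It assumes the Linux-style
behaviour in which shmctl(IPC_RMID) only marks the segment for
destruction and the kernel destroys it once the last process detaches
(including when a process dies); other systems may behave differently.
The function name sysv_cleanup_demo is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical demo: attach a segment, mark it for removal while still
 * attached, keep using it, then detach.
 * Returns  1 if the memory stayed usable after IPC_RMID and the id was
 *            gone after the detach,
 *          0 if either of those did not hold,
 *         -1 if the segment could not be set up at all.
 */
int sysv_cleanup_demo(void)
{
    struct shmid_ds ds;

    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | IPC_EXCL | 0600);
    if (shmid < 0) {
        return -1;
    }

    char *mem = shmat(shmid, NULL, 0);
    if (mem == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    /* Mark for removal now; no explicit cleanup is needed later. */
    if (shmctl(shmid, IPC_RMID, NULL) != 0) {
        shmdt(mem);
        return -1;
    }

    /* The existing mapping is still valid and usable. */
    strcpy(mem, "still mapped");
    int usable = (strcmp(mem, "still mapped") == 0);

    /* Last detach: the kernel can now destroy the segment. */
    shmdt(mem);
    int gone = (shmctl(shmid, IPC_STAT, &ds) == -1);

    return usable && gone;
}
```

A file-backed mmap segment, by contrast, leaves its backing file behind
unless some process unlinks it.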
--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel