Excellent points, Ken; thanks!
I expanded the FAQ entry here to include these points:
http://www.open-mpi.org/faq/?category=openfabrics#ofa-fork
On Nov 30, 2010, at 9:52 AM, Ken Cain wrote:
> Hi Jeff,
>
> We have had some recent experience with this in an Open MPI 1.4.x version and
> thought it would be useful to contribute to the discussion. Please see below.
>
> Jeff Squyres wrote:
>> On Nov 29, 2010, at 6:25 PM, George Bosilca wrote:
>>> The main problem is that openib requires pinning memory pages in order to
>>> take advantage of RMA features. There is a major issue with these pinned
>>> pages and fork, leading to segmentation faults in some specific cases.
>>> However, we only pin the pages on the MPI calls related to data transfers.
>>> Therefore, if you call fork __before__ any other MPI data transfer function
>>> (but after MPI_Init as you use the process rank), your application should
>>> be safe.
>> Note that Open MPI also pins some internal memory during MPI_INIT, but that
>> memory is totally internal to libmpi, so you should be safe (i.e., you
>> should never be able to find it and therefore never be able to try to touch
>> it).
>
> This is what we believe happened in our testing:
>
> 1. MPI_Init allocated and pinned down some memory. This memory was 64-byte
> aligned rather than page-aligned to 4096 bytes. So an allocation that ideally
> should have resulted in 2 pages being pinned actually had 3 pages pinned,
> with lots of unused memory on the 3rd page.
>
> 2. A child process created via popen tried to allocate some memory (perhaps a
> byproduct of popen execution itself) and was allocated memory on that last
> page with lots of unused memory. When the child tried to touch the
> allocation, there was a seg fault.
>
> We could reduce the probability of this happening by changing the alignment
> of MPI allocations to 4096 bytes. But since MPI allocations are not sized to
> be a multiple of the page size, this isn't a foolproof method.
>
> One way (agreed, not ideal) to avoid the potential seg fault is to set the MCA
> parameter btl_openib_want_fork_support = 0. But then you are "trusting" any
> child processes not to touch, intentionally or as a result of a bug, the
> memory regions that have been registered/pinned by the parent.
>
>>>> How can one be sure that disabling the warning is ok? Could you please
>>>> elaborate on what makes forks vulnerable? Maybe that will guide the
>>>> developers to make an informed decision on whether to disable them or find
>>>> another alternative.
>>> There is no way to know with 100% certainty. Now for an elaborate answer:
>>> Once upon a time ... The fork story is a long and boring one; we would all
>>> have preferred never to have heard about it (believe me). A quick and
>>> compressed version can be found on the QLogic download page
>>> (http://filedownloads.qlogic.com/files/driver/70277/release_QLogicIB-Basic_4400_Rev_A.html).
>> That's a good summary. The issue is with OFED itself, not with Open MPI.
>> Note, too, that calling popen() should also be safe (even though we'll warn
>> about it -- our atfork hook has no way of knowing whether you're calling
>> system, popen, or something else).
>
> Thanks,
>
> -Ken
> --
> Ken Cain
> Mercury Computer Systems, Inc. (http://www.mc.com)
>
--
Jeff Squyres
[email protected]