Thanks, Jeff.

OK, I've found the offending code and gotten rid of the fork() warning.  I'm
still left with this:

[bl302:26556] *** An error occurred in MPI_Cart_create
[bl302:26556] *** on communicator MPI_COMM_WORLD
[bl302:26556] *** MPI_ERR_ARG: invalid argument of some other kind
[bl302:26556] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 13693 on
node bl316 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[bl316:13691] 7 more processes have sent help message help-mpi-errors.txt /
mpi_errors_are_fatal
[bl316:13691] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
help / error messages

I'm going to try recompiling Open MPI itself with the Intel compilers.
Any other ideas?
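In case it helps narrow things down, here is a rough sketch (in Python, just to
describe the logic -- our code is Fortran) of the argument sanity checks I
believe MPI_Cart_create performs. The helper name and the exact set of checks
are my guesses, not Open MPI internals; the most common trigger I've seen for
this error is a dims array whose product exceeds the communicator size:

```python
from math import prod

def check_cart_args(comm_size, ndims, dims, periods):
    """Approximate the argument checks MPI_Cart_create performs.

    Returns None if the arguments look valid, otherwise a short reason
    string (roughly what MPI_ERR_ARG / MPI_ERR_DIMS would flag).
    This is an illustrative guess, not Open MPI's actual validation code.
    """
    if ndims < 0:
        return "ndims must be non-negative"
    if len(dims) < ndims or len(periods) < ndims:
        return "dims/periods arrays shorter than ndims"
    if any(d < 0 for d in dims[:ndims]):
        return "negative entry in dims"
    if prod(dims[:ndims]) > comm_size:
        return "product of dims exceeds communicator size"
    return None

# 8 ranks asked for a 4x2 grid: fine
print(check_cart_args(8, 2, [4, 2], [True, False]))   # None
# 8 ranks asked for a 4x4 grid: a classic way to get MPI_ERR_ARG
print(check_cart_args(8, 2, [4, 4], [True, False]))   # reason string
```

So it may be worth printing comm size and the dims/periods arrays on each rank
just before the MPI_Cart_create call.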


On Wed, Sep 2, 2009 at 12:01 AM, Jeff Squyres <jsquy...@cisco.com> wrote:

> *Something* in your code is calling fork() -- it may be an indirect call
> such as system() or popen() or somesuch.  This particular error message is
> only printed during a "fork pre-hook" that Open MPI installs during MPI_INIT
> (registered via pthread_atfork()).
>
> Grep through your code for calls to system and popen -- see if any of these
> are used.
>
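[For anyone following along: the suggested grep amounts to something like the
scan below, sketched in Python. The file-extension list and the extra
EXECUTE_COMMAND_LINE pattern are my own additions, not from Jeff's message.]

```python
import os
import re

# Calls that may spawn a child process: fork/system/popen from C, plus
# Fortran's SYSTEM and EXECUTE_COMMAND_LINE (case-insensitive).
PATTERN = re.compile(
    r'\b(fork|system|popen|execute_command_line)\s*\(', re.IGNORECASE)

def find_fork_calls(root):
    """Walk a source tree and report (path, line number, line) for every
    line containing a potentially fork()-ing call."""
    hits = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            if name.endswith(('.f90', '.F90', '.f', '.F', '.c', '.h')):
                path = os.path.join(dirpath, name)
                with open(path, errors='ignore') as fh:
                    for lineno, line in enumerate(fh, 1):
                        if PATTERN.search(line):
                            hits.append((path, lineno, line.strip()))
    return hits
```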
> There is no functional difference between "include 'mpif.h'" and "use mpi"
> in terms of MPI functionality at run time -- the only difference you get is
> a "better" level of compile-time protection from the Fortran compiler.
>  Specifically, "use mpi" will introduce strict type checking for many (but
> not all) of the MPI functions at compile time.  Hence, the compiler will
> complain if you forget an IERR parameter to an MPI function, for example.
>
> "use mpi" is not perfect, though -- there are many well-documented problems
> because of the design of the MPI-2 Fortran 90 interface (which are currently
> being addressed in MPI-3, if you care :-) ).  More generally: "use mpi" will
> catch *many* compile errors, but not *all* of them.
>
> But to answer your question succinctly: this problem won't be affected by
> using "use mpi" or "include 'mpif.h'".
>
>
>
>
> On Sep 1, 2009, at 9:02 PM, Greg Fischer wrote:
>
>  I'm receiving the error posted at the bottom of this message with a code
>> compiled with Intel Fortran/C Version 11.1 against OpenMPI version 1.3.2.
>>
>> The same code works correctly when compiled against MPICH2.  (We have
>> re-compiled with OpenMPI to take advantage of newly-installed Infiniband
>> hardware.  The "ring" test problem appears to work correctly over
>> Infiniband.)
>>
>> There are no "fork()" calls in our code, so I can only guess that
>> something weird is going on with MPI_COMM_WORLD.  The code in question is a
>> Fortran 90 code.  Right now, it is being compiled with "include 'mpif.h'"
>> statements at the beginning of each MPI subroutine, instead of  making use
>> of the "mpi" modules.  Could this be causing the problem?  How else should I
>> go about diagnosing the problem?
>>
>> Thanks,
>> Greg
>>
>> --------------------------------------------------------------------------
>> An MPI process has executed an operation involving a call to the
>> "fork()" system call to create a child process.  Open MPI is currently
>> operating in a condition that could result in memory corruption or
>> other system errors; your MPI job may hang, crash, or produce silent
>> data corruption.  The use of fork() (or system() or other calls that
>> create child processes) is strongly discouraged.
>>
>> The process that invoked fork was:
>>
>>  Local host:          bl316 (PID 26806)
>>  MPI_COMM_WORLD rank: 0
>>
>> If you are *absolutely sure* that your application will successfully
>> and correctly survive a call to fork(), you may disable this warning
>> by setting the mpi_warn_on_fork MCA parameter to 0.
>> --------------------------------------------------------------------------
>> [bl205:5014] *** An error occurred in MPI_Cart_create
>> [bl205:5014] *** on communicator MPI_COMM_WORLD
>> [bl205:5014] *** MPI_ERR_ARG: invalid argument of some other kind
>> [bl205:5014] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>
>> --------------------------------------------------------------------------
>> mpirun has exited due to process rank 4 with PID 5010 on
>> node bl205 exiting without calling "finalize". This may
>> have caused other processes in the application to be
>> terminated by signals sent by mpirun (as reported here).
>> --------------------------------------------------------------------------
>> [bl205:05008] 7 more processes have sent help message help-mpi-errors.txt
>> / mpi_errors_are_fatal
>> [bl205:05008] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
>> help / error messages
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
>
