Also, the latest commit to openib on v1.8 (origin/v1.8, https://svn.open-mpi.org/trac/ompi/changeset/32391) broke something:
+ timeout -s SIGSEGV 3m /scrap/jenkins/workspace/OMPI-vendor/label/hpctest/ompi_install1/bin/mpirun -np 8 -mca pml ob1 -mca btl self,openib /scrap/jenkins/workspace/OMPI-vendor/label/hpctest/ompi_install1/examples/hello_usempi
--------------------------------------------------------------------------
WARNING: There are more than one active ports on host 'hpctest', but the
default subnet GID prefix was detected on more than one of these
ports. If these ports are connected to different physical IB
networks, this configuration will fail in Open MPI. This version of
Open MPI requires that every physically separate IB subnet that is
used between connected MPI processes must have different subnet ID
values.

Please see this FAQ entry for more details:

  http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_default_gid_prefix to 0.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: No queue pairs were defined in the btl_openib_receive_queues
MCA parameter. At least one queue pair must be defined. The
OpenFabrics (openib) BTL will therefore be deactivated for this run.

Local host: hpctest
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

Process 1 ([[55281,1],1]) is on host: hpctest
Process 2 ([[55281,1],0]) is on host: hpctest
BTLs attempted: self

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
MPI_INIT has failed because at least one MPI process is unreachable
from another. This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used. Your MPI job will now abort.

You may wish to try to narrow down the problem;

* Check the output of ompi_info to see which BTL/MTL plugins are
  available.
* Run your application with MPI_THREAD_SINGLE.
* Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
  if using MTL-based communications) to see exactly which
  communication plugins were considered and/or discarded.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[hpctest:2761] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
[... the same MPI_Init abort was printed by the other ranks: PIDs 2757, 2751, 2752, 2753, 2755, 2759, and 2763 ...]
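In case anyone needs a stopgap until r32391 is fixed or reverted, passing an explicit queue-pair definition on the command line should satisfy the check and reactivate the openib BTL. A minimal sketch, assuming what I believe is the stock 1.8 default value for btl_openib_receive_queues (worth double-checking against ompi_info on an unaffected build), with the Jenkins paths shortened:

    mpirun -np 8 -mca pml ob1 -mca btl self,openib \
        -mca btl_openib_receive_queues P,128,256,192,128:S,2048,1024,1008,64:S,12288,1024,1008,64:S,65536,1024,1008,64 \
        ./examples/hello_usempi

Adding -mca btl_openib_warn_default_gid_prefix 0 would also silence the multi-port GID warning on this host, as the help text above notes.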
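And to confirm that the receive_queues handling is really what deactivates the BTL after this changeset, the verbosity knob the help text suggests should show exactly where component selection gives up:

    mpirun -np 8 -mca pml ob1 -mca btl self,openib \
        -mca btl_base_verbose 100 ./examples/hello_usempi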
On Fri, Aug 1, 2014 at 11:00 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Note that the Solaris unresolved alloca problem George fixed in r32388 is
> still present in 1.8.2rc3.
> I have manually confirmed that the same patch resolves the problem in
> 1.8.2rc3.
>
> -Paul
>
>
> On Thu, Jul 31, 2014 at 9:44 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Usual place - this is a last-chance check, so please hit it. Main change
>> from rc2 is the repairs to the Fortran binding config logic
>>
>> http://www.open-mpi.org/software/ompi/v1.8/
>>
>> Ralph
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/08/15433.php
>
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department     Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/08/15440.php