That would be something @Ralph Castain <r...@open-mpi.org> needs to look
at, as he stated in a previous discussion that `lo` was the default for
PMIx, and we now have two reports stating otherwise.

George.


On Mon, Feb 5, 2024 at 3:15 PM John Haiducek <jhaid...@gmail.com> wrote:

> Adding '--pmixmca ptl_tcp_if_include lo0' to the mpirun argument list
> seems to fix (or at least work around) the problem.
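> For example, the full command would be something like this (using the
> same test binary as below):
> 
>   mpirun --pmixmca ptl_tcp_if_include lo0 -n 2 ./mpi_init_test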
>
> On Mon, Feb 5, 2024 at 1:49 PM John Haiducek <jhaid...@gmail.com> wrote:
>
>> Thanks, George; the issue you linked does look like it could be related.
>>
>> Output from ompi_info:
>>
>>                  Package: Open MPI brew@Monterey-arm64.local Distribution
>>                 Open MPI: 5.0.1
>>   Open MPI repo revision: v5.0.1
>>    Open MPI release date: Dec 20, 2023
>>                  MPI API: 3.1.0
>>             Ident string: 5.0.1
>>                   Prefix: /opt/homebrew/Cellar/open-mpi/5.0.1
>>  Configured architecture: aarch64-apple-darwin21.6.0
>>            Configured by: brew
>>            Configured on: Wed Dec 20 22:18:10 UTC 2023
>>           Configure host: Monterey-arm64.local
>>   Configure command line: '--disable-debug' '--disable-dependency-tracking'
>>                           '--prefix=/opt/homebrew/Cellar/open-mpi/5.0.1'
>>                           '--libdir=/opt/homebrew/Cellar/open-mpi/5.0.1/lib'
>>                           '--disable-silent-rules' '--enable-ipv6'
>>                           '--enable-mca-no-build=reachable-netlink'
>>                           '--sysconfdir=/opt/homebrew/etc'
>>                           '--with-hwloc=/opt/homebrew/opt/hwloc'
>>                           '--with-libevent=/opt/homebrew/opt/libevent'
>>                           '--with-pmix=/opt/homebrew/opt/pmix' '--with-sge'
>>                 Built by: brew
>>                 Built on: Wed Dec 20 22:18:10 UTC 2023
>>               Built host: Monterey-arm64.local
>>               C bindings: yes
>>              Fort mpif.h: yes (single underscore)
>>             Fort use mpi: yes (full: ignore TKR)
>>        Fort use mpi size: deprecated-ompi-info-value
>>         Fort use mpi_f08: yes
>>  Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
>>                           limitations in the gfortran compiler and/or Open
>>                           MPI, does not support the following: array
>>                           subsections, direct passthru (where possible) to
>>                           underlying Open MPI's C functionality
>>   Fort mpi_f08 subarrays: no
>>            Java bindings: no
>>   Wrapper compiler rpath: unnecessary
>>               C compiler: clang
>>      C compiler absolute: clang
>>   C compiler family name: CLANG
>>       C compiler version: 14.0.0 (clang-1400.0.29.202)
>>             C++ compiler: clang++
>>    C++ compiler absolute: clang++
>>            Fort compiler: gfortran
>>        Fort compiler abs: /opt/homebrew/opt/gcc/bin/gfortran
>>          Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
>>    Fort 08 assumed shape: yes
>>       Fort optional args: yes
>>           Fort INTERFACE: yes
>>     Fort ISO_FORTRAN_ENV: yes
>>        Fort STORAGE_SIZE: yes
>>       Fort BIND(C) (all): yes
>>       Fort ISO_C_BINDING: yes
>>  Fort SUBROUTINE BIND(C): yes
>>        Fort TYPE,BIND(C): yes
>>  Fort T,BIND(C,name="a"): yes
>>             Fort PRIVATE: yes
>>            Fort ABSTRACT: yes
>>        Fort ASYNCHRONOUS: yes
>>           Fort PROCEDURE: yes
>>          Fort USE...ONLY: yes
>>            Fort C_FUNLOC: yes
>>  Fort f08 using wrappers: yes
>>          Fort MPI_SIZEOF: yes
>>              C profiling: yes
>>    Fort mpif.h profiling: yes
>>   Fort use mpi profiling: yes
>>    Fort use mpi_f08 prof: yes
>>           Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
>>                           OMPI progress: no, Event lib: yes)
>>            Sparse Groups: no
>>   Internal debug support: no
>>   MPI interface warnings: yes
>>      MPI parameter check: runtime
>> Memory profiling support: no
>> Memory debugging support: no
>>               dl support: yes
>>    Heterogeneous support: no
>>        MPI_WTIME support: native
>>      Symbol vis. support: yes
>>    Host topology support: yes
>>             IPv6 support: yes
>>           MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
>>  Fault Tolerance support: yes
>>           FT MPI support: yes
>>   MPI_MAX_PROCESSOR_NAME: 256
>>     MPI_MAX_ERROR_STRING: 256
>>      MPI_MAX_OBJECT_NAME: 64
>>         MPI_MAX_INFO_KEY: 36
>>         MPI_MAX_INFO_VAL: 256
>>        MPI_MAX_PORT_NAME: 1024
>>   MPI_MAX_DATAREP_STRING: 128
>>          MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>>            MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>            MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>            MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                  MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>>                  MCA btl: sm (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>>                  MCA btl: tcp (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>>                   MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>>                   MCA if: bsdx_ipv6 (MCA v2.1.0, API v2.0.0, Component
>>                           v5.0.1)
>>                   MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
>>                           v5.0.1)
>>          MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>          MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                MCA mpool: hugepage (MCA v2.1.0, API v3.1.0, Component v5.0.1)
>>              MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
>>                           v5.0.1)
>>               MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>>            MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>              MCA threads: pthreads (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>>                MCA timer: darwin (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                  MCA bml: r2 (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>                 MCA coll: adapt (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: basic (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: han (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: inter (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: libnbc (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: self (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: sync (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: tuned (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: ftagree (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA coll: monitoring (MCA v2.1.0, API v2.4.0, Component
>>                           v5.0.1)
>>                 MCA coll: sm (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>                 MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
>>                           v5.0.1)
>>                MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
>>                           v5.0.1)
>>                MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                   MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                 MCA hook: comm_method (MCA v2.1.0, API v1.0.0, Component
>>                           v5.0.1)
>>                   MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                   MCA io: romio341 (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                  MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v5.0.1)
>>                  MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component
>>                           v5.0.1)
>>                  MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v5.0.1)
>>                 MCA part: persist (MCA v2.1.0, API v4.0.0, Component v5.0.1)
>>                  MCA pml: cm (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>                  MCA pml: monitoring (MCA v2.1.0, API v2.1.0, Component
>>                           v5.0.1)
>>                  MCA pml: ob1 (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>                  MCA pml: v (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>             MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
>>                           v5.0.1)
>>             MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
>>                           v5.0.1)
>>             MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>                 MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v5.0.1)
>>                 MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
>>                           v5.0.1)
>>            MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
>>                           v5.0.1)
>>
>> On Mon, Feb 5, 2024 at 12:48 PM George Bosilca <bosi...@icl.utk.edu>
>> wrote:
>>
>>> OMPI seems unable to create a communication medium between your
>>> processes. There are a few known issues on macOS; please read
>>> https://github.com/open-mpi/ompi/issues/12273 for more info.
>>>
>>> Can you provide the header of the ompi_info output? What I'm interested
>>> in is the part about `Configure command line:`.
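>>> For example, something like this would capture that part:
>>>
>>>   ompi_info | head -40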
>>>
>>> George.
>>>
>>>
>>> On Mon, Feb 5, 2024 at 12:18 PM John Haiducek via users <
>>> users@lists.open-mpi.org> wrote:
>>>
>>>> I'm having problems running programs compiled against the Open MPI 5.0.1
>>>> package provided by Homebrew on macOS (arm) 12.6.1.
>>>>
>>>> When running a Fortran test program that simply calls MPI_Init followed
>>>> by MPI_Finalize, I get the output below.
>>>>
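>>>> For reference, the test program is essentially just the following
>>>> (minimal sketch; the real file may differ slightly in declarations):
>>>>
>>>>   program mpi_init_test
>>>>     use mpi
>>>>     implicit none
>>>>     integer :: ierr
>>>>     ! Initialize and immediately finalize MPI; nothing in between.
>>>>     call MPI_Init(ierr)
>>>>     call MPI_Finalize(ierr)
>>>>   end program mpi_init_test
>>>>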
>>>> $ mpirun -n 2 ./mpi_init_test
>>>>
>>>> --------------------------------------------------------------------------
>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>> likely to abort.  There are many reasons that a parallel process can
>>>> fail during MPI_INIT; some of which are due to configuration or
>>>> environment
>>>> problems.  This failure appears to be an internal failure; here's some
>>>> additional information (which may only be relevant to an Open MPI
>>>> developer):
>>>>
>>>>   PML add procs failed
>>>>   --> Returned "Not found" (-13) instead of "Success" (0)
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> --------------------------------------------------------------------------
>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>> likely to abort.  There are many reasons that a parallel process can
>>>> fail during MPI_INIT; some of which are due to configuration or
>>>> environment
>>>> problems.  This failure appears to be an internal failure; here's some
>>>> additional information (which may only be relevant to an Open MPI
>>>> developer):
>>>>
>>>>   ompi_mpi_init: ompi_mpi_instance_init failed
>>>>   --> Returned "Not found" (-13) instead of "Success" (0)
>>>>
>>>> --------------------------------------------------------------------------
>>>> [haiducek-lt:00000] *** An error occurred in MPI_Init
>>>> [haiducek-lt:00000] *** reported by process [1905590273,1]
>>>> [haiducek-lt:00000] *** on a NULL communicator
>>>> [haiducek-lt:00000] *** Unknown error
>>>> [haiducek-lt:00000] *** MPI_ERRORS_ARE_FATAL (processes in this
>>>> communicator will now abort,
>>>> [haiducek-lt:00000] ***    and MPI will try to terminate your MPI job
>>>> as well)
>>>>
>>>> --------------------------------------------------------------------------
>>>> prterun detected that one or more processes exited with non-zero status,
>>>> thus causing the job to be terminated. The first process to do so was:
>>>>
>>>>    Process name: [prterun-haiducek-lt-15584@1,1]
>>>>    Exit code:    14
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> I'm not sure whether this is the result of a bug in Open MPI, in the
>>>> Homebrew package, or a misconfiguration of my system. Any suggestions for
>>>> troubleshooting this?
>>>>
>>>
