I have built the Myrinet drivers with both GCC and Sun's Studio 11 compilers; the problem below appears with either installation.
I have verified the Myrinet installation with Myricom's own test programs. I then built Open MPI with the Studio 11 compilers, with Myrinet enabled. All the library paths are set correctly, and my test program (written in C) runs successfully as long as the number of processes equals the number of nodes, i.e. one process per node. Each node has 4 CPUs. As soon as I request more processes than there are nodes, I get an error which indicates that Open MPI cannot communicate over more than one channel (endpoint) on the Myrinet card. It should be possible to use at least 4 channels per card: colleagues of mine do exactly that with MPICH on the same type of Myrinet card. Any idea why this happens?

The hostfile looks like:

   m2009 slots=4
   m2010 slots=4

but the same error appears if the hostfile is just:

   m2009
   m2010

ompi_info shows that the MX components are built in:

m2001(128) > ompi_info | grep mx
        MCA btl: mx (MCA v1.0, API v1.0.1, Component v1.2)
        MCA mtl: mx (MCA v1.0, API v1.0, Component v1.2)

and mx_endpoint_info confirms that the driver is configured for 4 endpoints, although only one is currently open:

m2009(160) > /opt/mx/bin/mx_endpoint_info
1 Myrinet board installed.
The MX driver is configured to support up to 4 endpoints on 4 boards.
===================================================================
Board #0:
Endpoint    PID      Command    Info
<raw>       15039
0           15544
There are currently 1 regular endpoint open
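As a further check, one could try opening several endpoints directly through the MX API to see whether the driver really hands out more than one per node. This is a rough sketch from memory, so the names and signatures may be slightly off:

    #include <stdio.h>
    #include "myriexpress.h"

    int main(void)
    {
        mx_endpoint_t ep[4];
        mx_return_t rc;
        uint32_t filter = 0x12345;  /* arbitrary filter value for this test */
        int i, opened = 0;

        mx_init();

        /* Try to open four endpoints, roughly what four MPI processes
           on one node would need collectively. */
        for (i = 0; i < 4; i++) {
            rc = mx_open_endpoint(MX_ANY_NIC, MX_ANY_ENDPOINT, filter,
                                  NULL, 0, &ep[i]);
            if (rc != MX_SUCCESS) {
                printf("endpoint %d failed: %s\n", i, mx_strerror(rc));
                break;
            }
            opened++;
        }
        printf("opened %d of 4 endpoints\n", opened);

        for (i = 0; i < opened; i++)
            mx_close_endpoint(ep[i]);
        mx_finalize();
        return 0;
    }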
Here is the run with six processes:

m2001(120) > mpirun -np 6 -hostfile hostsfile -mca btl mx,self b_eff
--------------------------------------------------------------------------
Process 0.1.0 is unable to reach 0.1.0 for MPI communication.

If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
[the same message follows for every rank: 0.1.1, 0.1.2 and 0.1.3 are
unable to reach 0.1.0, and 0.1.4 and 0.1.5 are unable to reach 0.1.4]
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
[the MPI_INIT failure and MPI_Init error messages are repeated by the
remaining ranks]
The same happens with four processes and the mx BTL alone:

m2001(121) > mpirun -np 4 -hostfile hostsfile -mca btl mx b_eff
--------------------------------------------------------------------------
Process 0.1.0 is unable to reach 0.1.0 for MPI communication.

If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
[the same message follows for ranks 0.1.1, 0.1.2 and 0.1.3, all unable
to reach 0.1.0, and the MPI_INIT failure / PML add procs failed -->
Returned "Unreachable" (-12) / MPI_Init error messages are then
repeated for each rank, exactly as in the run above]
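Since the failure happens inside MPI_Init, before any communication is attempted, I would expect even a trivial program along these lines (a minimal sketch, not the actual b_eff code) to fail in the same way as soon as more than one process lands on a node:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        /* With more than one process per node, this is presumably
           where the "PML add procs failed" error would appear. */
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        printf("rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }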
------------------------------------------
Dr E L Heck

University of Durham
Institute for Computational Cosmology
Ogden Centre
Department of Physics
South Road
DURHAM, DH1 3LE
United Kingdom

e-mail: lydia.h...@durham.ac.uk
Tel.: +44 191 334 3628
Fax.: +44 191 334 3645
___________________________________________