Ralph

I had done a make clean in the 1.2.8 directory, if that is what you meant.
Or do I need to do something else?

I appreciate your help on this by the way ;-)


  ----- Original Message ----- 
  From: Ralph Castain 
  To: Open MPI Users 
  Sent: Monday, February 13, 2012 3:41 PM
  Subject: Re: [OMPI users] MPI orte_init fails on remote nodes


  You need to clean out the old attempt - that is a stale file

  Sent from my iPad

  On Feb 13, 2012, at 7:36 AM, "Richard Bardwell" <rich...@sharc.co.uk> wrote:


    OK, I installed 1.4.4, rebuilt the executable, and guess what ... I now get some weird errors, as below:
    mca: base: component_find: unable to open /usr/local/lib/openmpi/mca_ras_dash_host
    along with a few other files, even though the .so / .la files are all there!
      ----- Original Message ----- 
      From: Ralph Castain 
      To: Open MPI Users 
      Sent: Monday, February 13, 2012 2:59 PM
      Subject: Re: [OMPI users] MPI orte_init fails on remote nodes


      Good heavens - where did you find something that old? Can you use a more recent version?



       
        Gentlemen

        I am struggling to get MPI working when the hostfile contains different nodes.

        I get the error below. Any ideas? I can ssh without a password between the two nodes. I am running Open MPI 1.2.8 on both machines.

        Any help most appreciated!



        MPITEST/v8_mpi_test> mpiexec -n 2 --debug-daemons -hostfile test.hst /home/sharc/MPITEST/v8_mpi_test/mpitest

        Daemon [0,0,1] checking in as pid 10490 on host 192.0.2.67
        [linux-z0je:08804] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init_stage1.c at line 182
        --------------------------------------------------------------------------
        It looks like orte_init failed for some reason; your parallel process is
        likely to abort. There are many reasons that a parallel process can
        fail during orte_init; some of which are due to configuration or
        environment problems. This failure appears to be an internal failure;
        here's some additional information (which may only be relevant to an
        Open MPI developer):
        orte_rml_base_select failed
        --> Returned value -13 instead of ORTE_SUCCESS
        --------------------------------------------------------------------------
        [linux-z0je:08804] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_system_init.c at line 42
        [linux-z0je:08804] [NO-NAME] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 52
        Open RTE was unable to initialize properly. The error occured while
        attempting to orte_init(). Returned value -13 instead of ORTE_SUCCESS.
        [linux-tmpw:10490] [0,0,1] orted_recv_pls: received message from [0,0,0]
        [linux-tmpw:10490] [0,0,1] orted_recv_pls: received kill_local_procs
        [linux-tmpw:10489] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
        [linux-tmpw:10489] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1158
        [linux-tmpw:10489] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
        [linux-tmpw:10489] ERROR: A daemon on node 192.0.2.68 failed to start as expected.
        [linux-tmpw:10489] ERROR: There may be more information available from
        [linux-tmpw:10489] ERROR: the remote shell (see above).
        [linux-tmpw:10489] ERROR: The daemon exited unexpectedly with status 243.
        [linux-tmpw:10490] [0,0,1] orted_recv_pls: received message from [0,0,0]
        [linux-tmpw:10490] [0,0,1] orted_recv_pls: received exit
        [linux-tmpw:10489] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
        [linux-tmpw:10489] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1190
        --------------------------------------------------------------------------
        mpiexec was unable to cleanly terminate the daemons for this job. Returned value Timeout instead of ORTE_SUCCESS.
        --------------------------------------------------------------------------

        _______________________________________________
        users mailing list
        us...@open-mpi.org
        http://www.open-mpi.org/mailman/listinfo.cgi/users


