Aargh, that was the reason. I compiled openmpi 1.10.3, and now it works
on my side, too.
Thanks a lot for the hint.

Now a hint from my side in case somebody runs into the same problem: if
you have compiled the program with mpicc-2.0.0, you'll probably now get
an error like:

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "(null)" (-43) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[karun01:26952] Local abort before MPI_INIT completed completed
successfully, but am not able to aggregate error messages, and not able
to guarantee that all other processes were killed!


Just recompile the program with mpicc-1.10.3 and it will work.
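A minimal sketch of that check-and-recompile step (the paths, the version strings, and the `mm` helper below are hypothetical, not from my setup; `mpicc --showme:version` and `mpiexec --version` should report the wrapper's and the runtime's Open MPI versions respectively):

```shell
#!/bin/sh
# Sketch only: paths and version strings are examples -- adjust for your
# own installation.

# Put the 1.10.3 wrappers first on PATH so 'mpicc' matches the runtime
# that mpiexec will use at job time:
#   export PATH=/opt/openmpi-1.10.3/bin:$PATH

# Helper: reduce an Open MPI version string to "major.minor" so the
# wrapper and the runtime can be compared before recompiling.
mm() { printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'; }

wrapper_ver="1.10.3"   # e.g. taken from: mpicc --showme:version
runtime_ver="1.10.3"   # e.g. taken from: mpiexec --version

if [ "$(mm "$wrapper_ver")" = "$(mm "$runtime_ver")" ]; then
    echo "match: safe to recompile, e.g.  mpicc -o teste teste.c"
else
    echo "mismatch: wrapper $wrapper_ver vs runtime $runtime_ver" >&2
fi
```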

Again, thanks a lot to everybody who tried to help, especially Reuti,
Hugh and William.

With kind regards, ulrich

On 08/11/2016 12:16 PM, Reuti wrote:
> I just compiled openmpi-2.0.0 on my own and it looks like a regression to use 
> `ssh` although it's running under SGE. Also for `mpicc` it was necessary to 
> supply "-ldl" to succeed, this wasn't necessary in former versions.
> 
> I'll look into it.
> 
> For now I think it's best to stay with 1.10.3.
> 
> Note that after 1.6.5 they do a core binding (bad in case several Open MPI 
> jobs are running on one and the same node, as all will use core 0 upwards) 
> and check the network topology. If it's set up with dead routes/interfaces 
> (which normally won't matter), the startup of the parallel job may be delayed 
> by one to two minutes (until they face a timeout).
> 
> -- Reuti
> 
> 
>> Am 10.08.2016 um 21:15 schrieb Ulrich Hiller <hil...@mpia-hd.mpg.de>:
>>
>> Hello,
>>
>> My problem: How can I make gridengine not use ssh?
>>
>> Installed:
>> openmpi-2.0.0 - configured with sge support.
>> gridengine (son of gridengine) 8.1.9-1
>>
>> I have a simple openmpi program 'teste' which only prints "hello
>> world" output.
>> I start it with:
>> qsub -pe orte 160 -V -j yes -cwd -S /bin/bash <<< "mpiexec -n 160 teste
>>>> /home/ljohndoe/out.dat"
>> on the master node.
>> I get back the error:
>>
>> Host key verification failed.
>> Host key verification failed.
>> Permission denied, please try again.
>> Permission denied, please try again.
>> Received disconnect from 192.168.117.6: 2: Too many authentication
>> failures for johndoe
>> Permission denied, please try again.
>> Permission denied, please try again.
>> Received disconnect from 192.168.117.5: 2: Too many authentication
>> failures for johndoe
>> [...]
>>
>> When I configure a passwordless ssh login to the execute nodes
>> (copying the master's ssh key with 'ssh-copy-id'), it works like a
>> charm. So it obviously uses an ssh connection to the execute nodes.
>>
>> The output of 'qconf -sconf' contains:
>>
>> login_shells                 sh,bash,ksh,csh,tcsh
>> qlogin_command               builtin
>> qlogin_daemon                builtin
>> rlogin_command               builtin
>> rlogin_daemon                builtin
>> rsh_command                  builtin
>> rsh_daemon                   builtin
>>
>> (As far as I read, this was the problem in a thread on this list some
>> time ago, but I seem to have the correct values.)
>>
>> So everything should be fine, or not?
>> Also, with
>> qlogin -l 'h=exec01'
>> and
>> qrsh -l 'h=exec01'
>> I can connect without problems to the first node (called exec01), and
>> I can also log in to all the other execute nodes.
>>
>> Is there another 'switch' somewhere where I can make qsub _not_ run over ssh?
>>
>> If it is of interest, the output of 'qconf -sp orte' is:
>> pe_name            orte
>> slots              9999999
>> user_lists         NONE
>> xuser_lists        NONE
>> start_proc_args    NONE
>> stop_proc_args     NONE
>> allocation_rule    $round_robin
>> control_slaves     FALSE
>> job_is_first_task  TRUE
>> urgency_slots      min
>> accounting_summary FALSE
>> qsort_args         NONE
>>
>> Also, I do not have any ssh lines in ~/.profile or ~/.bashrc.
>>
>>
>> Kind regards, ulrich
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
> 