Dear Reuti and Hugh,

thank you for your quick replies.

I have changed control_slaves to TRUE and job_is_first_task to FALSE,
but it did not help.
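
A minimal sketch of how those two values could be double-checked from the output of `qconf -sp orte`. The helper name `check_pe` is hypothetical, and the expected values are simply the ones mentioned in this thread, not a statement of what every setup requires:

```shell
# check_pe: read the output of `qconf -sp <pe_name>` on stdin and report
# whether control_slaves is TRUE and job_is_first_task is FALSE.
# (Hypothetical helper; the expected values are the ones from this thread.)
check_pe() {
    awk '
        $1 == "control_slaves"    { cs = $2 }
        $1 == "job_is_first_task" { jf = $2 }
        END {
            if (cs == "TRUE" && jf == "FALSE") { print "ok"; exit 0 }
            print "mismatch: control_slaves=" cs " job_is_first_task=" jf
            exit 1
        }'
}

# On a cluster this would be used as:
#   qconf -sp orte | check_pe
```

The function only parses text, so it can also be fed a saved copy of the PE definition.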

On my side, ompi_info shows:
~# ompi_info | grep grid
      MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component v2.0.0)
There is no other Open MPI installation on the master or the slaves.

Regarding qconf -sconfl: I removed the entries for the slaves, so now it
only contains the master:
~# qconf -sconfl
karun

But none of this helped. Is there a way to force gridengine to use qrsh
(or whatever) instead of ssh?

With kind regards, ulrich




On 08/10/2016 09:48 PM, Reuti wrote:
> Hi,
> 
> Am 10.08.2016 um 21:15 schrieb Ulrich Hiller:
> 
>> Hello,
>>
>> My problem: How can I make gridengine not use ssh?
>>
>> Installed:
>> openmpi-2.0.0 - configured with sge support.
> 
> can you please execute:
> 
> $ ompi_info | grep grid
>                  MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component v2.0.0)
> 
> 
>> gridengine (son of gridengine) 8.1.9-1
>>
>> I have a simple openmpi program 'teste' which only gives "hello world"
>> output.
>> I start it with:
>> qsub -pe orte 160 -V -j yes -cwd -S /bin/bash <<< "mpiexec -n 160 teste"
> 
> For testing purposes you can just use:
> 
> $ qsub -pe orte 160 -j yes -cwd -S /bin/bash <<< "mpiexec hostname"
> 
> It will detect the number of granted slots automatically. I usually suggest 
> not using "-V", as one can never be sure what the actual environment will do 
> to the job. Even worse, if the job crashes one week after submission, one 
> won't recall the settings at submission time. I put all necessary definitions 
> of environment variables in the job script instead.
> 
> 
>> login_shells                 sh,bash,ksh,csh,tcsh
>> qlogin_command               builtin
>> qlogin_daemon                builtin
>> rlogin_command               builtin
>> rlogin_daemon                builtin
>> rsh_command                  builtin
>> rsh_daemon                   builtin
> 
> In principle this can be redefined on a per-node basis. Do you have custom 
> settings, i.e.:
> 
> $ qconf -sconfl
> 
> If yes, what's inside? In case all nodes are the same, you can even remove the 
> individual configurations with `qconf -dconf node001` and the like.
> 
> 
>> (As far as I read, this was the problem in a thread some time ago on this
>> list, but I seem to have the correct values.)
>>
>> So everything should be fine, or not?
>> Also with
>> qlogin -l 'h=exec01'
>> and
>> qrsh -l 'h=exec01'
>> I can reach the first node (called exec01) without problems, and I can
>> also log in to all other execution nodes.
>>
>> Is there another 'switch' anywhere that makes qsub _not_ run over ssh?
>>
>> If it is of interest, the output of 'qconf -sp orte' is:
>> pe_name            orte
>> slots              9999999
>> user_lists         NONE
>> xuser_lists        NONE
>> start_proc_args    NONE
>> stop_proc_args     NONE
>> allocation_rule    $round_robin
>> control_slaves     FALSE
> 
> The above entry must be set to TRUE to allow a `qrsh -inherit ...` to connect 
> to the slave nodes.
> 
> -- Reuti
> 
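
Reuti's suggestion above, to define the environment inside the job script rather than pass "-V", could look roughly like the following sketch. The function name `make_job_script` and the `/opt/openmpi` prefix are assumptions for illustration only; the `#$` lines carry the same qsub options used earlier in this thread:

```shell
# make_job_script: print a job script that sets its own environment
# instead of inheriting it via "qsub -V".
# (Hypothetical helper; /opt/openmpi is an assumed install prefix.)
make_job_script() {
    cat <<'EOF'
#!/bin/bash
#$ -pe orte 160
#$ -j yes
#$ -cwd
#$ -S /bin/bash
# Assumed Open MPI location; adjust to the local installation.
export PATH=/opt/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi/lib:${LD_LIBRARY_PATH:-}
mpiexec hostname
EOF
}

# On a cluster this would be used as:
#   make_job_script > job.sh && qsub job.sh
```

Keeping the settings in the script means they are still on record if the job fails long after submission.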