Hi,

On 10.08.2016 at 21:15, Ulrich Hiller wrote:

> Hello,
> 
> My problem: How can I make gridengine not use ssh?
> 
> Installed:
> openmpi-2.0.0 - configured with sge support.

Can you please execute the following and check that a gridengine line like 
this one shows up:

$ ompi_info | grep grid
                 MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component v2.0.0)


> gridengine (son of gridengine) 8.1.9-1
> 
> I have a simple openmpi program 'teste' which only gives "hello world"
> output.
> I start it with:
> qsub -pe orte 160 -V -j yes -cwd -S /bin/bash <<< "mpiexec -n 160 teste"

For testing purposes you can just use:

$ qsub -pe orte 160 -j yes -cwd -S /bin/bash <<< "mpiexec hostname"

It will detect the number of granted slots automatically. I usually suggest not 
to use "-V", as one can never be sure what the submission-time environment will 
do to the job. Even worse, in case the job crashes one week after submission, 
one won't recall the settings that were active at submission time. I put all 
necessary definitions of environment variables in the job script instead.
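
A job script along these lines illustrates the idea. It is only a sketch: the 
install prefix /opt/openmpi-2.0.0 is an assumption, not something taken from 
your setup, so adjust it to wherever Open MPI actually lives on the nodes.

```shell
#!/bin/bash
# Sketch of a job script that defines its own environment instead of
# relying on "qsub -V". The "#$" lines are read by qsub as options.
#$ -pe orte 160
#$ -j yes
#$ -cwd
#$ -S /bin/bash

# ASSUMPTION: hypothetical Open MPI install prefix, adjust as needed.
OMPI_PREFIX=/opt/openmpi-2.0.0
export PATH="$OMPI_PREFIX/bin:$PATH"
# Append the old value only if LD_LIBRARY_PATH was already set.
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Launch only if mpiexec can be found with the PATH set above.
if command -v mpiexec >/dev/null 2>&1; then
    mpiexec hostname
else
    echo "mpiexec not found in PATH=$PATH" >&2
fi
```

Saved under a name of your choice (say job.sh, a made-up name), it would be 
submitted with a plain `qsub job.sh`, without "-V".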


> login_shells                 sh,bash,ksh,csh,tcsh
> qlogin_command               builtin
> qlogin_daemon                builtin
> rlogin_command               builtin
> rlogin_daemon                builtin
> rsh_command                  builtin
> rsh_daemon                   builtin

In principle this can be redefined on a per-node basis. Do you have custom 
settings? Check with:

$ qconf -sconfl

If yes, what's inside (`qconf -sconf node001` shows it)? In case all nodes are 
the same you can even remove the individual configuration with 
`qconf -dconf node001` and the like.
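
To look at all local configurations in one go, something like the following 
could be used. It is a rough sketch: the helper name filter_remote_startup is 
made up for this example, and it assumes that `qconf -sconf <host>` prints one 
"name value" pair per line, as in the global configuration you posted.

```shell
#!/bin/bash
# Keep only the login/rsh related entries from a configuration listing
# read on stdin. (Helper name is hypothetical, chosen for this example.)
filter_remote_startup() {
    grep -E '^(qlogin|rlogin|rsh)_(command|daemon)[[:space:]]'
}

# Walk over every host that has a local configuration and show whether
# it overrides any of the builtin startup methods.
if command -v qconf >/dev/null 2>&1; then
    for host in $(qconf -sconfl); do
        echo "== $host"
        qconf -sconf "$host" | filter_remote_startup
    done
fi
```

A host that prints nothing below its "==" line has no override for these 
entries and uses the global values.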


> (as far as i read this was the problem of a thread some time ago in this
> list. But i seem to have the correct values)
> 
> So everything should be fine, or not?
> Also with
> qlogin -l 'h=exec01'
> and
> qrsh -l 'h=exec01'
> I can get to the first node (called exec01) without problems, and I can
> also log in to all other execute nodes as well.
> 
> Is there anywhere another 'switch' where i can let qsub run _not_ over ssh?
> 
> If it is of interest, the output of 'qconf -sp orte' is:
> pe_name            orte
> slots              9999999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    NONE
> stop_proc_args     NONE
> allocation_rule    $round_robin
> control_slaves     FALSE

The above entry (control_slaves) must be set to TRUE to allow a 
`qrsh -inherit ...` to connect to the slave nodes.
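
The interactive way is `qconf -mp orte`, which opens the PE definition in an 
editor. A non-interactive variant could look like the sketch below. The helper 
name set_control_slaves_true is made up for this example, and the sketch 
assumes the `qconf -sp` output has the same layout as the one you posted.

```shell
#!/bin/bash
# Rewrite the control_slaves line of a PE definition read on stdin to
# TRUE and leave all other lines untouched. (Hypothetical helper name.)
set_control_slaves_true() {
    sed -E 's/^(control_slaves[[:space:]]+).*/\1TRUE/'
}

# Dump the PE, patch it, and load it back from the file with -Mp.
if command -v qconf >/dev/null 2>&1; then
    tmp=$(mktemp)
    qconf -sp orte | set_control_slaves_true > "$tmp"
    qconf -Mp "$tmp"
    rm -f "$tmp"
fi
```

Afterwards `qconf -sp orte` should show "control_slaves     TRUE".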

-- Reuti
_______________________________________________
users mailing list
https://gridengine.org/mailman/listinfo/users
