> On 11.08.2016 at 12:14, Ulrich Hiller <[email protected]> wrote:
> 
> Dear Reuti and Hugh,
> 
> thank you for your quick replies.
> 
> I have changed control_slaves to TRUE and job_is_first_task to FALSE,
> but it did not help.
> 
> On my side:
> ~# ompi_info | grep grid
>                  MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component v2.0.0)
> There is no other Open MPI installation on the master or the slaves.
> 
> As for qconf -sconfl: I removed the slaves, so it now only contains the master:
> ~# qconf -sconfl
> karun
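One thing that might help to narrow this down: Open MPI can be asked to show which starter it actually uses for the remote daemons. A minimal check, assuming the PE is still called orte (the exact wording of the verbose output differs between Open MPI versions):

$ qsub -pe orte 4 -j yes -cwd -S /bin/bash <<< "mpiexec --mca plm_base_verbose 10 hostname"

The launch lines in the job output should reveal whether the daemons get started via `qrsh -inherit` (tight integration) or via ssh.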
Unless the head node is special, it can also be removed from that list; the global settings from `qconf -sconf` will then be used. For the head node the host-local configuration only matters in case it's also an exec host, AFAICS. The original idea was/is that in a cluster where machines run different operating systems, the paths to the mail application/prolog/epilog might differ, or a different load sensor must be used (this shows up in `man sge_conf` when searching for "host local").

-- Reuti

> But all of these did not help. Is there a way to hard-code gridengine not
> to use ssh and to use qrsh or whatever instead?
> 
> With kind regards, ulrich
> 
> On 08/10/2016 09:48 PM, Reuti wrote:
>> Hi,
>> 
>> On 10.08.2016 at 21:15, Ulrich Hiller wrote:
>> 
>>> Hello,
>>> 
>>> My problem: how can I make gridengine not use ssh?
>>> 
>>> Installed:
>>> openmpi-2.0.0 - configured with SGE support.
>> 
>> Can you please execute:
>> 
>> $ ompi_info | grep grid
>>                  MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component v2.0.0)
>> 
>>> gridengine (Son of Grid Engine) 8.1.9-1
>>> 
>>> I have a simple Open MPI program 'teste' which only gives "hello world"
>>> output.
>>> I start it with:
>>> qsub -pe orte 160 -V -j yes -cwd -S /bin/bash <<< "mpiexec -n 160 teste"
>> 
>> For testing purposes you can just use:
>> 
>> $ qsub -pe orte 160 -j yes -cwd -S /bin/bash <<< "mpiexec hostname"
>> 
>> It will detect the number of granted slots automatically. I usually suggest
>> not using "-V", as one can never be sure what the actual environment will do
>> to the job, and even worse, in case the job crashes one week after
>> submission, one won't recall the settings used at submission time. I put all
>> necessary definitions of environment variables in the jobscript instead.
>> 
>>> login_shells      sh,bash,ksh,csh,tcsh
>>> qlogin_command    builtin
>>> qlogin_daemon     builtin
>>> rlogin_command    builtin
>>> rlogin_daemon     builtin
>>> rsh_command       builtin
>>> rsh_daemon        builtin
>> 
>> In principle this can be redefined on a per-node basis. Do you have custom
>> settings, i.e.:
>> 
>> $ qconf -sconfl
>> 
>> If yes, what's inside? In case all nodes are the same you can even remove
>> the individual configurations with `qconf -dconf node001` and the like.
>> 
>>> (As far as I read, this was the problem in a thread on this list some time
>>> ago. But I seem to have the correct values.)
>>> 
>>> So everything should be fine - or not?
>>> Also with
>>> qlogin -l 'h=exec01'
>>> and
>>> qrsh -l 'h=exec01'
>>> I can get to the first node (called exec01) without problems, and I can
>>> also log in to all other execute nodes as well.
>>> 
>>> Is there another 'switch' anywhere with which I can make qsub run _not_
>>> over ssh?
>>> 
>>> If it is of interest, the output of 'qconf -sp orte' is:
>>> pe_name            orte
>>> slots              9999999
>>> user_lists         NONE
>>> xuser_lists        NONE
>>> start_proc_args    NONE
>>> stop_proc_args     NONE
>>> allocation_rule    $round_robin
>>> control_slaves     FALSE
>> 
>> The above entry must be set to TRUE to allow a `qrsh -inherit ...` to
>> connect to the slave nodes.
>> 
>> -- Reuti
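For the record, a minimal jobscript along the lines suggested above (environment defined in the script instead of passing "-V"). The Open MPI prefix /opt/openmpi-2.0.0 is only a placeholder and has to be adapted to the actual installation:

#!/bin/bash
#$ -pe orte 160
#$ -j yes
#$ -cwd
#$ -S /bin/bash
# Adjust the prefix to where openmpi-2.0.0 is really installed.
export PATH=/opt/openmpi-2.0.0/bin:$PATH
export LD_LIBRARY_PATH=/opt/openmpi-2.0.0/lib:$LD_LIBRARY_PATH
# No -n needed: mpiexec picks up the granted slot count from SGE.
mpiexec ./teste

Submit it with a plain `qsub jobscript.sh` (the filename is arbitrary).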

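And for reference, a PE definition as it is typically set up for tight integration with Open MPI, editable with `qconf -mp orte`. The urgency_slots and accounting_summary lines are just the usual defaults, not values taken from this thread:

pe_name            orte
slots              9999999
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE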