Dear rhc, dear Reuti, thanks for your valuable help!
Kind regards,
Ado Arnolds

On 22.03.2017 15:55, r...@open-mpi.org wrote:
> Sorry folks - for some reason (probably timing for getting 2.1.0 out), the
> fix for this got pushed to v2.1.1 - see the PR here:
> https://github.com/open-mpi/ompi/pull/3163
>
>
>> On Mar 22, 2017, at 7:49 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>>
>>>
>>> On 22.03.2017 at 15:31, Heinz-Ado Arnolds
>>> <arno...@mpa-garching.mpg.de> wrote:
>>>
>>> Dear Reuti,
>>>
>>> thanks a lot, you're right! But why did the default behavior change but not
>>> the value of this parameter:
>>>
>>> 2.1.0: MCA plm rsh: parameter "plm_rsh_agent" (current value: "ssh : rsh",
>>> data source: default, level: 2 user/detail, type: string, synonyms:
>>> pls_rsh_agent, orte_rsh_agent)
>>> The command used to launch executables on remote
>>> nodes (typically either "ssh" or "rsh")
>>>
>>> 1.10.6: MCA plm: parameter "plm_rsh_agent" (current value: "ssh : rsh",
>>> data source: default, level: 2 user/detail, type: string, synonyms:
>>> pls_rsh_agent, orte_rsh_agent)
>>> The command used to launch executables on remote
>>> nodes (typically either "ssh" or "rsh")
>>>
>>> That means there must have been changes in the code regarding that, perhaps
>>> for detecting SGE? Do you know of a way to revert to the old style (e.g.
>>> configure option)? Otherwise all my users have to add this option.
>>
>> There was a discussion in https://github.com/open-mpi/ompi/issues/2947
>>
>> For now you can make use of
>> https://www.open-mpi.org/faq/?category=tuning#setting-mca-params
>>
>> Essentially to have it set for all users automatically, put:
>>
>> plm_rsh_agent=foo
>>
>> in $prefix/etc/openmpi-mca-params.conf of your central Open MPI 2.1.0
>> installation.
>>
>> -- Reuti
>>
>>
>>> Thanks again, and have a nice day
>>>
>>> Ado Arnolds
>>>
>>> On 22.03.2017 13:58, Reuti wrote:
>>>> Hi,
>>>>
>>>>> On 22.03.2017 at 10:44, Heinz-Ado Arnolds
>>>>> <arno...@mpa-garching.mpg.de> wrote:
>>>>>
>>>>> Dear users and developers,
>>>>>
>>>>> first of all many thanks for all the great work you have done for OpenMPI!
>>>>>
>>>>> Up to OpenMPI-1.10.6 the mechanism for starting orted was to use SGE/qrsh:
>>>>> mpirun -np 8 --map-by ppr:4:node ./myid
>>>>> /opt/sge-8.1.8/bin/lx-amd64/qrsh -inherit -nostdin -V <DNS-Name of Remote
>>>>> Machine> orted --hnp-topo-sig 2N:2S:2L3:20L2:20L1:20C:40H:x86_64 -mca ess
>>>>> "env" -mca orte_ess_jobid "1621884928" -mca orte_ess_vpid 1 -mca
>>>>> orte_ess_num_procs "2" -mca orte_hnp_uri "1621884928.0;tcp://<IP-addr of
>>>>> Master>:41031" -mca plm "rsh" -mca rmaps_base_mapping_policy "ppr:4:node"
>>>>> --tree-spawn
>>>>>
>>>>> Now with OpenMPI-2.1.0 (and the release candidates) "ssh" is used to
>>>>> start orted:
>>>>> mpirun -np 8 --map-by ppr:4:node -mca mca_base_env_list OMP_NUM_THREADS=5
>>>>> ./myid
>>>>> /usr/bin/ssh -x <DNS-Name of Remote Machine>
>>>>> PATH=/afs/...../openmpi-2.1.0/bin:$PATH ; export PATH ;
>>>>> LD_LIBRARY_PATH=/afs/...../openmpi-2.1.0/lib:$LD_LIBRARY_PATH ; export
>>>>> LD_LIBRARY_PATH ;
>>>>> DYLD_LIBRARY_PATH=/afs/...../openmpi-2.1.0/lib:$DYLD_LIBRARY_PATH ;
>>>>> export DYLD_LIBRARY_PATH ; /afs/...../openmpi-2.1.0/bin/orted
>>>>> --hnp-topo-sig 2N:2S:2L3:20L2:20L1:20C:40H:x86_64 -mca ess "env" -mca
>>>>> ess_base_jobid "1626013696" -mca ess_base_vpid 1 -mca ess_base_num_procs
>>>>> "2" -mca orte_hnp_uri "1626013696.0;usock;tcp://<IP-addr of
>>>>> Master>:43019" -mca plm_rsh_args "-x" -mca plm "rsh" -mca
>>>>> rmaps_base_mapping_policy "ppr:4:node" -mca pmix "^s1,s2,cray"
>>>>>
>>>>> qrsh set the environment properly on the remote side, so that environment
>>>>> variables from job scripts are properly transferred. With the ssh variant
>>>>> the environment is not set properly on the remote side, and it seems that
>>>>> there are handling problems with Kerberos tickets and/or AFS tokens.
>>>>>
>>>>> Is there any way to revert the 2.1.0 behavior to the 1.10.6 (use
>>>>> SGE/qrsh) one? Are there mca params to set this?
>>>>>
>>>>> If you need more info, please let me know. (Job submitting machine and
>>>>> target cluster are the same with all tests. SW is residing in AFS
>>>>> directories visible on all machines. Parameter "plm_rsh_disable_qrsh"
>>>>> current value: "false")
>>>>
>>>> It looks like `mpirun` still needs:
>>>>
>>>> -mca plm_rsh_agent foo
>>>>
>>>> to allow SGE to be detected.
>>>>
>>>> -- Reuti
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users@lists.open-mpi.org
>>>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
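
For anyone applying the workaround quoted above before v2.1.1 is available, here is a minimal shell sketch of the two ways to set the parameter. It is only an illustration, not a tested recipe for any particular site: `OMPI_PREFIX` is a hypothetical variable name standing in for the real install prefix (it defaults to a scratch directory so the snippet can be tried without touching an installation), and the value `foo` is copied verbatim from Reuti's suggestion.

```shell
# OMPI_PREFIX is a hypothetical name for the Open MPI 2.1.0 install prefix.
# If unset, fall back to a scratch directory so nothing real is modified.
prefix="${OMPI_PREFIX:-$(mktemp -d)}"
conf="$prefix/etc/openmpi-mca-params.conf"

# Make sure the config file exists before inspecting it.
mkdir -p "$prefix/etc"
touch "$conf"

# Site-wide setting: append the line only if plm_rsh_agent is not set yet,
# so running the snippet twice does not create duplicate entries.
if ! grep -q '^[[:space:]]*plm_rsh_agent[[:space:]]*=' "$conf"; then
    printf 'plm_rsh_agent=foo\n' >> "$conf"
fi

# Per-user alternative: MCA parameters can also be given as OMPI_MCA_<name>
# environment variables, e.g. in a job script.
export OMPI_MCA_plm_rsh_agent=foo
```

The file variant affects every user of that installation, while the environment variable only affects the shell or job script that exports it; either should be equivalent to passing `-mca plm_rsh_agent foo` to `mpirun`.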