[OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread Heinz-Ado Arnolds
Dear users and developers,

first of all many thanks for all the great work you have done for OpenMPI!

Up to OpenMPI-1.10.6 the mechanism for starting orted was to use SGE/qrsh:
  mpirun -np 8 --map-by ppr:4:node ./myid
  /opt/sge-8.1.8/bin/lx-amd64/qrsh -inherit -nostdin -V  orted --hnp-topo-sig 2N:2S:2L3:20L2:20L1:20C:40H:x86_64 -mca ess "env" 
-mca orte_ess_jobid "1621884928" -mca orte_ess_vpid 1 -mca orte_ess_num_procs 
"2" -mca orte_hnp_uri "1621884928.0;tcp://:41031" -mca plm 
"rsh" -mca rmaps_base_mapping_policy "ppr:4:node" --tree-spawn

Now with OpenMPI-2.1.0 (and the release candidates) "ssh" is used to start 
orted:
  mpirun -np 8 --map-by ppr:4:node -mca mca_base_env_list OMP_NUM_THREADS=5 
./myid
  /usr/bin/ssh -x  
PATH=/afs/./openmpi-2.1.0/bin:$PATH ; export PATH ; 
LD_LIBRARY_PATH=/afs/./openmpi-2.1.0/lib:$LD_LIBRARY_PATH ; export 
LD_LIBRARY_PATH ; 
DYLD_LIBRARY_PATH=/afs/./openmpi-2.1.0/lib:$DYLD_LIBRARY_PATH ; export 
DYLD_LIBRARY_PATH ;   /afs/./openmpi-2.1.0/bin/orted --hnp-topo-sig 
2N:2S:2L3:20L2:20L1:20C:40H:x86_64 -mca ess "env" -mca ess_base_jobid 
"1626013696" -mca ess_base_vpid 1 -mca ess_base_num_procs "2" -mca orte_hnp_uri 
"1626013696.0;usock;tcp://:43019" -mca plm_rsh_args "-x" 
-mca plm "rsh" -mca rmaps_base_mapping_policy "ppr:4:node" -mca pmix 
"^s1,s2,cray"

qrsh sets up the environment properly on the remote side, so that environment 
variables from job scripts are transferred correctly. With the ssh variant the 
environment is not set up properly on the remote side, and there also seem to be 
problems handling Kerberos tickets and/or AFS tokens.

Is there any way to revert the 2.1.0 behavior to the 1.10.6 (use SGE/qrsh) one? 
Are there mca params to set this?

If you need more info, please let me know. (The job-submitting machine and the 
target cluster are the same in all tests. The software resides in AFS directories 
visible on all machines. The parameter "plm_rsh_disable_qrsh" has the current 
value "false".)

Kind regards,

Heinz-Ado Arnolds




___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread Reuti
Hi,

It looks like `mpirun` still needs:

-mca plm_rsh_agent foo

to allow SGE to be detected.
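
As a rough sketch of the suggestion above (not verified on an SGE cluster): the parameter can be passed on the mpirun command line as shown, or set through the OMPI_MCA_ environment prefix so that job scripts need no extra flag. The process count, mapping and ./myid come from the original post; "foo" is the placeholder value used above. The guard around mpirun is only an assumption to keep the snippet harmless outside a job.

```shell
# Environment form of "-mca plm_rsh_agent foo"; the value "foo" is the
# placeholder suggested in this thread, not a real remote-shell agent.
export OMPI_MCA_plm_rsh_agent=foo

# Launch only when mpirun and the test program from the original post exist.
if command -v mpirun >/dev/null 2>&1 && [ -x ./myid ]; then
    mpirun -np 8 --map-by ppr:4:node ./myid
fi
```

The equivalent one-off form is "mpirun -mca plm_rsh_agent foo -np 8 --map-by ppr:4:node ./myid".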

-- Reuti




Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread Heinz-Ado Arnolds
Dear Reuti,

thanks a lot, you're right! But why did the default behavior change while the 
value of this parameter stayed the same:

2.1.0: MCA plm rsh: parameter "plm_rsh_agent" (current value: "ssh : rsh", data 
source: default, level: 2 user/detail, type: string, synonyms: pls_rsh_agent, 
orte_rsh_agent)
  The command used to launch executables on remote 
nodes (typically either "ssh" or "rsh")

1.10.6:  MCA plm: parameter "plm_rsh_agent" (current value: "ssh : rsh", data 
source: default, level: 2 user/detail, type: string, synonyms: pls_rsh_agent, 
orte_rsh_agent)
  The command used to launch executables on remote 
nodes (typically either "ssh" or "rsh")

That means there must have been changes in the code regarding that, perhaps for 
detecting SGE? Do you know of a way to revert to the old style (e.g. configure 
option)? Otherwise all my users have to add this option.

Thanks again, and have a nice day

Ado Arnolds



Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread Reuti

There was a discussion in https://github.com/open-mpi/ompi/issues/2947

For now you can make use of 
https://www.open-mpi.org/faq/?category=tuning#setting-mca-params

Essentially to have it set for all users automatically, put:

plm_rsh_agent=foo

in $prefix/etc/openmpi-mca-params.conf of your central Open MPI 2.1.0 
installation.
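
A minimal sketch of that step, assuming the install prefix is held in a shell variable: PREFIX is a hypothetical name, and its default here is a scratch directory so the snippet can be tried without touching a real installation. Point PREFIX at the actual prefix of the central installation before using it.

```shell
# PREFIX is an assumption: set it to the install prefix of the central
# Open MPI 2.1.0 installation. The default is a scratch directory.
PREFIX="${PREFIX:-${TMPDIR:-/tmp}/openmpi-2.1.0-demo}"
CONF="$PREFIX/etc/openmpi-mca-params.conf"

mkdir -p "$PREFIX/etc"

# Append the setting once; do nothing if plm_rsh_agent is already set there.
if ! grep -q '^plm_rsh_agent[[:space:]]*=' "$CONF" 2>/dev/null; then
    echo 'plm_rsh_agent=foo' >> "$CONF"
fi

# Show the resulting file for a quick visual check.
cat "$CONF"
```

Running it a second time leaves the file unchanged, so it is safe to keep in an installation script.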

-- Reuti



Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-22 Thread r...@open-mpi.org
Sorry folks - for some reason (probably timing for getting 2.1.0 out), the fix 
for this got pushed to v2.1.1 - see the PR here: 
https://github.com/open-mpi/ompi/pull/3163 
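
Since the fix is expected in v2.1.1, a job script could apply the plm_rsh_agent workaround only on the affected version. The following is a sketch under assumptions: the helper name needs_qrsh_workaround is made up for illustration, and it treats only 2.1.0 and its release candidates as affected, which is all this thread reports.

```shell
# Return success (0) when the given version string is 2.1.0 or one of its
# release candidates, the versions reported as affected in this thread.
needs_qrsh_workaround() {
    case "$1" in
        2.1.0*) return 0 ;;
        *)      return 1 ;;
    esac
}

# "mpirun --version" prints e.g. "mpirun (Open MPI) 2.1.0" on its first line;
# the last field of that line is taken as the version.
if command -v mpirun >/dev/null 2>&1; then
    ompi_version=$(mpirun --version 2>&1 | awk 'NR==1 {print $NF}')
    if needs_qrsh_workaround "$ompi_version"; then
        export OMPI_MCA_plm_rsh_agent=foo
    fi
fi
```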




Re: [OMPI users] OpenMPI-2.1.0 problem with executing orted when using SGE

2017-03-27 Thread Heinz-Ado Arnolds
Dear rhc,
dear Reuti,

thanks for your valuable help!

Kind regards,

Ado Arnolds
