If you can launch that cmd line on each node, then it should work. I'm not sure what the "-x 10.0.35.43" at the beginning of the line is supposed to do, so that might not be useful.
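The "launch that command line on each node" step could be scripted: take the argument string the fake ssh recorded, drop the leading rsh-agent options, and run the remainder on each node. A minimal sketch (the recorded string below is truncated, and `echo` stands in for the real orted, which is not assumed to be installed locally):

```python
import shlex
import subprocess

def extract_orted_cmd(recorded: str) -> list[str]:
    """Drop any leading rsh-agent options (e.g. '-x 10.0.35.43') that the
    fake ssh captured; the real command starts at the 'orted' token."""
    args = shlex.split(recorded)
    if "orted" in args:
        return args[args.index("orted"):]
    return args

# Mimics what the fake ~/bin/ssh recorded (truncated; the real line
# carries many more -mca options).
recorded = '-x 10.0.35.43 orted -mca ess "env" -mca plm "rsh" --tree-spawn'
cmd = extract_orted_cmd(recorded)

# On each node you would exec the extracted command; here we substitute
# a harmless stand-in ('echo') since orted is not available locally.
out = subprocess.run(["echo", *cmd], capture_output=True, text=True)
print(out.stdout.strip())  # orted -mca ess env -mca plm rsh --tree-spawn
```

Whether the `-x 10.0.35.43` prefix can simply be dropped, as assumed here, depends on what the rsh agent actually does with it.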
On Mar 9, 2021, at 7:25 AM, Gabriel Tanase <gabrieltan...@gmail.com> wrote:

George, I started to dig more into option 2 as you described it. I believe I can make that work. For example, I created this fake ssh:

$ cat ~/bin/ssh
#!/bin/bash
fname=env.$$
echo ">>>>>>>>>>>>> ssh" >> $fname
env >> $fname
echo ">>>>>>>>>>>>>>>>>>>>>>>>>>>" >> $fname
echo $@ >> $fname

And this one prints all the args that the remote process will receive:

-x 10.0.35.43 orted -mca ess "env" -mca ess_base_jobid "2752512000" -mca ess_base_vpid 1 -mca ess_base_num_procs "3" -mca orte_node_regex "ip-[2:10]-0-16-120,[2:10].0.35.43,[2:10].0.35.42@0(3)" -mca orte_hnp_uri "2752512000.0;tcp://10.0.16.120:44789" -mca plm "rsh" --tree-spawn -mca routed "radix" -mca orte_parent_uri "2752512000.0;tcp://10.0.16.120:44789" -mca rmaps_base_mapping_policy "node" -mca pmix "^s1,s2,cray,isolated"

Now I am thinking that I probably don't even need to create all those Open MPI env variables, as I am hoping the orted started remotely will launch the final executable with the right environment set. Does this sound right?

Thx,
--Gabriel

On Fri, Mar 5, 2021 at 3:15 PM George Bosilca <bosi...@icl.utk.edu> wrote:

Gabriel,

You should be able to. Here are at least three different ways of doing this:

1. Purely MPI. Start singletons (or smaller groups) and connect via sockets using MPI_Comm_join. You can set up your own DNS-like service, with the goal of having the independent MPI jobs leave a trace there so that they can find each other and create the initial socket.

2. You could replace ssh/rsh with a no-op script (one that returns success, so that the mpirun process thinks it successfully started the processes), and then handcraft the environment as you did for GASNet.

3.
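Option 1 hinges on the independent jobs finding each other before MPI_Comm_join can be called on a connected socket. A minimal sketch of the "DNS-like service" idea as a plain TCP registry (the wire format and names here are invented for illustration; a real deployment would need authentication, error handling, and cleanup):

```python
import socket
import threading

registry = {}

def serve(sock):
    """Tiny line-oriented registry: 'REGISTER name host:port' or 'LOOKUP name'."""
    while True:
        conn, _ = sock.accept()
        with conn:
            parts = conn.recv(1024).decode().split()
            if parts and parts[0] == "REGISTER":
                registry[parts[1]] = parts[2]
                conn.sendall(b"OK")
            elif parts and parts[0] == "LOOKUP":
                conn.sendall(registry.get(parts[1], "?").encode())

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen()
port = server.getsockname()[1]
threading.Thread(target=serve, args=(server,), daemon=True).start()

def call(msg: str) -> str:
    c = socket.create_connection(("127.0.0.1", port))
    c.sendall(msg.encode())
    reply = c.recv(1024).decode()
    c.close()
    return reply

# Job A leaves a trace; job B finds it and can then open the socket
# whose fd both sides would hand to MPI_Comm_join.
print(call("REGISTER jobA 10.0.16.120:44789"))  # OK
print(call("LOOKUP jobA"))                      # 10.0.16.120:44789
```

Each singleton would register its listening endpoint at startup, poll for its peers, and then connect and call MPI_Comm_join/MPI_Intercomm_merge to stitch the jobs together.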
We have support for a DVM (Distributed Virtual Machine) that basically creates an independent service that different mpirun invocations can connect to in order to retrieve information. The mpirun processes use this DVM singleton, falling back to MPI_Comm_connect/accept to recreate an MPI world.

Good luck,
George.

On Fri, Mar 5, 2021 at 2:08 PM Ralph Castain via devel <devel@lists.open-mpi.org> wrote:

I'm afraid that won't work - there is no way for the job to "self-assemble". One could create a way to do it, but it would take some significant coding in the guts of OMPI to get there.

On Mar 5, 2021, at 9:40 AM, Gabriel Tanase via devel <devel@lists.open-mpi.org> wrote:

Hi all,

I decided to use MPI as the messaging layer for a multihost database. However, within my org I faced very strong opposition to allowing passwordless ssh or rsh. For security reasons we want to minimize the opportunities to execute arbitrary code on the db clusters, and I don't want to run other things like Slurm, etc.

My question would be: is there a way to start an MPI application by running certain binaries on each host? E.g., if my executable is "myapp", can I start a server (orted???) on host zero and then start myapp on each host with the right env variables set (specifying the rank, number of ranks, etc.)?

For example, when using another messaging API (GASNet) I was able to start a server on host zero and then manually start the application binary on each host (with some environment variables properly set) and all was good.

I tried to reverse engineer a little the env variables used by mpirun (mpirun -np 2 env), and then I copied those env variables into a shell script before invoking my hello-world binary, but I got an error message implying a server is not present:

PMIx_Init failed for the following reason:

NOT-SUPPORTED

Open MPI requires access to a local PMIx server to execute.
Please ensure that either you are operating in a PMIx-enabled environment, or use "mpirun" to execute the job.

Here is the shell script for host0:

$ cat env1.sh
#!/bin/bash
export OMPI_COMM_WORLD_RANK=0
export PMIX_NAMESPACE=mpirun-38f9d3525c2c-53291@1
export PRTE_MCA_prte_base_help_aggregate=0
export TERM_PROGRAM=Apple_Terminal
export OMPI_MCA_num_procs=2
export TERM=xterm-256color
export SHELL=/bin/bash
export PMIX_VERSION=4.1.0a1
export OPAL_USER_PARAMS_GIVEN=1
export TMPDIR=/var/folders/_k/c4_xr5vd14j97fw7j8vzmd45_9hjbq/T/
export Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.HCXmdRI1WL/Render
export PMIX_SERVER_URI41=mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093
export TERM_PROGRAM_VERSION=421.2
export PMIX_RANK=0
export TERM_SESSION_ID=18212D82-DEB2-4AE8-A271-FB47AC71337B
export OMPI_COMM_WORLD_LOCAL_RANK=0
export OMPI_ARGV=
export OMPI_MCA_initial_wdir=/Users/igtanase/ompi
export USER=igtanase
export OMPI_UNIVERSE_SIZE=2
export SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.PhcplcX3pC/Listeners
export OMPI_COMMAND=./exe
export __CF_USER_TEXT_ENCODING=0x54984577:0x0:0x0
export OMPI_FILE_LOCATION=/var/folders/_k/c4_xr5vd14j97fw7j8vzmd45_9hjbq/T//prte.38f9d3525c2c.1419265399/dvm.53291/1/0
export PMIX_SERVER_URI21=mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093
export PATH=/Users/igtanase/ompi/bin/:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
export OMPI_COMM_WORLD_LOCAL_SIZE=2
export PRTE_MCA_pmix_session_server=1
export PWD=/Users/igtanase/ompi
export OMPI_COMM_WORLD_SIZE=2
export OMPI_WORLD_SIZE=2
export LANG=en_US.UTF-8
export XPC_FLAGS=0x0
export PMIX_GDS_MODULE=hash
export XPC_SERVICE_NAME=0
export HOME=/Users/igtanase
export SHLVL=2
export PMIX_SECURITY_MODE=native
export PMIX_HOSTNAME=38f9d3525c2c
export LOGNAME=igtanase
export OMPI_WORLD_LOCAL_SIZE=2
export PMIX_BFROP_BUFFER_TYPE=PMIX_BFROP_BUFFER_NON_DESC
export PRTE_LAUNCHED=1
export PMIX_SERVER_TMPDIR=/var/folders/_k/c4_xr5vd14j97fw7j8vzmd45_9hjbq/T//prte.38f9d3525c2c.1419265399/dvm.53291
export OMPI_COMM_WORLD_NODE_RANK=0
export OMPI_MCA_cpu_type=x86_64
export PMIX_SYSTEM_TMPDIR=/var/folders/_k/c4_xr5vd14j97fw7j8vzmd45_9hjbq/T/
export PMIX_SERVER_URI4=mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093
export OMPI_NUM_APP_CTX=1
export SECURITYSESSIONID=186a9
export PMIX_SERVER_URI3=mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093
export PMIX_SERVER_URI2=mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093
export _=/usr/bin/env
./exe

Thx for your help,
--Gabriel
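The handcrafted-environment route can also be scripted rather than pasted into a shell file. A sketch that spawns one rank with per-rank variables set, GASNet-style (the variable names follow the OMPI_* convention from the dump above; as the PMIx_Init error shows, these alone are not sufficient for Open MPI, which additionally needs a reachable PMIx server):

```python
import os
import subprocess
import sys

def launch_rank(rank: int, world_size: int, argv: list[str]) -> str:
    """Spawn one rank with a handcrafted environment.
    Note: Open MPI additionally needs the PMIX_* server contact info,
    which only a live PMIx server (e.g. mpirun/prte) can provide."""
    env = dict(os.environ)
    env.update({
        "OMPI_COMM_WORLD_RANK": str(rank),
        "OMPI_COMM_WORLD_SIZE": str(world_size),
        "OMPI_COMM_WORLD_LOCAL_RANK": str(rank),
    })
    out = subprocess.run(argv, env=env, capture_output=True, text=True)
    return out.stdout.strip()

# Demo: a stand-in for './exe' that just reports the rank it was given.
probe = [sys.executable, "-c",
         "import os; print(os.environ['OMPI_COMM_WORLD_RANK'])"]
print([launch_rank(r, 2, probe) for r in range(2)])  # ['0', '1']
```

This illustrates the mechanics of option 2's "handcraft the environment" step only; as Ralph notes above, without a PMIx server the ranks still cannot self-assemble into one MPI world.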