Vincent,

Thanks for the details on the bug. Indeed this seems to have been a problem for a little while now when you use static ports with ORTE (the -mca oob_tcp_static_ipv4_ports option). It must have crept in when we refactored the internal regular expression mechanism for the v4 branches (and, now that I look, maybe as far back as v3.1). I just hit this same issue in the past day or so while working with a different user.
Though I do not have a suggestion for a workaround at this time (sorry), I did file a GitHub issue and am looking into it. With the holiday I don't know when I will have a fix, but you can watch the ticket for updates:

    https://github.com/open-mpi/ompi/issues/8304

In the meantime, you could try the v3.0 series release (which predates this change) or the current Open MPI master branch (which approaches this a little differently). The same command line should work in both. Both can be downloaded from the links below:

    https://www.open-mpi.org/software/ompi/v3.0/
    https://www.open-mpi.org/nightly/master/

Regarding your command line, it looks pretty good:

    orterun --launch-agent /home/boubliki/openmpi/bin/orted -mca btl tcp \
        --mca btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 \
        --mca oob_tcp_static_ipv4_ports 6705 \
        -host node2:1 -np 1 /path/to/some/program arg1 .. argn

I would suggest, while you are debugging this, using a program like /bin/hostname instead of a real MPI program. If /bin/hostname launches properly, then move on to an MPI program. That will assure you that the runtime wired up correctly (oob/tcp), and then we can focus on the MPI side of the communication (btl/tcp).

You will want to change "-mca btl tcp" to at least "-mca btl tcp,self" (or, better, "-mca btl tcp,vader,self" if you want shared memory). 'self' is the loopback interface in Open MPI.

Is there a reason that you are specifying the --launch-agent to the orted? Is it installed in a different path on the remote nodes? If Open MPI is installed in the same location on all nodes, then you shouldn't need that.

Thanks,
Josh

On Wed, Dec 16, 2020 at 9:23 AM Vincent Letocart via users <users@lists.open-mpi.org> wrote:

Good morning

I am facing a tuning problem while playing with the orterun command in order to set a TCP port within a specific range.
Part of this may be that I'm not very familiar with the architecture of the software, and I sometimes struggle through the documentation. Here is what I'm trying to do (the problem has been reduced here to launching a single task on *one* remote node):

    orterun --launch-agent /home/boubliki/openmpi/bin/orted -mca btl tcp \
        --mca btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 \
        --mca oob_tcp_static_ipv4_ports 6705 \
        -host node2:1 -np 1 /path/to/some/program arg1 .. argn

Those mca options are highlighted here and there in various mailing lists and archives on the net. The version is 4.0.5.

I tried different combinations, like only --mca btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 (then --report-uri shows a randomly picked TCP port number), or adding --mca oob_tcp_static_ipv4_ports 6705 (then --report-uri reports the TCP port I specified and everything crashes), and many others, but the result becomes:

    [node2:4050181] *** Process received signal ***
    [node2:4050181] Signal: Segmentation fault (11)
    [node2:4050181] Signal code: Address not mapped (1)
    [node2:4050181] Failing at address: (nil)
    [node2:4050181] [ 0] /lib64/libpthread.so.0(+0x12dd0)[0x7fdaf95a9dd0]
    [node2:4050181] *** End of error message ***
    bash: line 1: 4050181 Segmentation fault (core dumped) /home/boubliki/openmpi/bin/orted -mca ess "env" -mca ess_base_jobid "1254293504" -mca ess_base_vpid 1 -mca ess_base_num_procs "2" -mca orte_node_regex "node[1:1,2]@0(2)" -mca btl "tcp" --mca btl_tcp_port_min_v4 "6706" --mca btl_tcp_port_range_v4 "10" --mca oob_tcp_static_ipv4_ports "6705" -mca plm "rsh" --tree-spawn -mca routed "radix" -mca orte_parent_uri "1254293504.0;tcp://192.168.xxx.xxx:6705" -mca orte_launch_agent "/home/boubliki/openmpi/bin/orted" -mca pmix "^s1,s2,cray,isolated"

I tried on different machines, and also with different compilers (gcc 10.2 and Intel 19u1). Version 4.1.0rc5 did not improve the execution. Forcing no optimization with -O0 did not either.
I'm not familiar with debugging such software, but I could add a latency somewhere (a sleep()) and catch the orted process on the [single] remote node, reaching line 572 with gdb:

    boubliki@node1: ~/openmpi/src/openmpi-4.0.5> cat -n orte/mca/ess/base/ess_base_std_orted.c | sed -n -r -e '562,583p'
       562      if (orte_static_ports || orte_fwd_mpirun_port) {
       563          if (NULL == orte_node_regex) {
       564              /* we didn't get the node info */
       565              error = "cannot construct daemon map for static ports - no node map info";
       566              goto error;
       567          }
       568          /* extract the node info from the environment and
       569           * build a nidmap from it - this will update the
       570           * routing plan as well
       571           */
       572          if (ORTE_SUCCESS != (ret = orte_regx.build_daemon_nidmap())) {
       573              ORTE_ERROR_LOG(ret);
       574              error = "construct daemon map from static ports";
       575              goto error;
       576          }
       577          /* be sure to update the routing tree so the initial "phone home"
       578           * to mpirun goes through the tree if static ports were enabled
       579           */
       580          orte_routed.update_routing_plan(NULL);
       581          /* routing can be enabled */
       582          orte_routed_base.routing_enabled = true;
       583      }
    boubliki@node1: ~/openmpi/src/openmpi-4.0.5>

The debugger led me to print the element called orte_regx, which shows that the slot for the method build_daemon_nidmap contains a NULL value, while line 572 wants precisely to call this method:

    (gdb)
    Thread 1 "orted" received signal SIGSEGV, Segmentation fault.
    0x0000000000000000 in ?? ()
    (gdb) bt
    #0  0x0000000000000000 in ?? ()
    #1  0x00007f76ae3fa585 in orte_ess_base_orted_setup () at base/ess_base_std_orted.c:572
    #2  0x00007f76ae2662b4 in rte_init () at ess_env_module.c:149
    #3  0x00007f76ae432645 in orte_init (pargc=pargc@entry=0x7ffe1c87a81c, pargv=pargv@entry=0x7ffe1c87a810, flags=flags@entry=2) at runtime/orte_init.c:271
    #4  0x00007f76ae3e0bf0 in orte_daemon (argc=<optimized out>, argv=<optimized out>) at orted/orted_main.c:362
    #5  0x00007f76acc976a3 in __libc_start_main () from /lib64/libc.so.6
    #6  0x000000000040111e in _start ()
    (gdb) p orte_regx
    $1 = {init = 0x0, nidmap_create = 0x7f76ab46c230 <nidmap_create>,
      nidmap_parse = 0x7f76ae4180b0 <orte_regx_base_nidmap_parse>,
      extract_node_names = 0x7f76ae41bd20 <orte_regx_base_extract_node_names>,
      encode_nodemap = 0x7f76ae418730 <orte_regx_base_encode_nodemap>,
      decode_daemon_nodemap = 0x7f76ae41a190 <orte_regx_base_decode_daemon_nodemap>,
      build_daemon_nidmap = 0x0,
      generate_ppn = 0x7f76ae41b0f0 <orte_regx_base_generate_ppn>,
      parse_ppn = 0x7f76ae41b760 <orte_regx_base_parse_ppn>, finalize = 0x0}
    (gdb)

I suppose the orte_regx element has been initialized somewhere, through an inline function in [maybe] opal/class/opal_object.h, but I'm lost in the code (and probably in some concurrency/multi-threading aspects), and in the end I can't even figure out whether I'm using the mca option correctly or facing a bug in the core application:

    static inline opal_object_t *opal_obj_new(opal_class_t * cls)
    {
        opal_object_t *object;
        assert(cls->cls_sizeof >= sizeof(opal_object_t));

    #if OPAL_WANT_MEMCHECKER
        object = (opal_object_t *) calloc(1, cls->cls_sizeof);
    #else
        object = (opal_object_t *) malloc(cls->cls_sizeof);
    #endif
        if (opal_class_init_epoch != cls->cls_initialized) {
            opal_class_initialize(cls);
        }
        if (NULL != object) {
            object->obj_class = cls;
            object->obj_reference_count = 1;
            opal_obj_run_constructors(object);
        }
        return object;
    }

Could you maybe (firstly) correct my understanding of which mca option I could use so that orted on the remote node connects back to the TCP port I specify? Or (worse) browse the code for a potential bug related to this functionality?

Thank you
Vincent

-- 
Josh Hursey
IBM Spectrum MPI Developer
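For reference, the sanity check Josh suggests above — launching /bin/hostname with 'self' added to the btl list, keeping the same hosts and ports — would look roughly like this (a sketch assembled from the thread, not a verified command line):

```shell
# Sanity-check the runtime wire-up (oob/tcp) before involving MPI itself:
# if this prints the remote node's hostname, the daemons connected correctly,
# and any remaining failure is on the MPI side (btl/tcp).
orterun -mca btl tcp,self \
        --mca btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 \
        --mca oob_tcp_static_ipv4_ports 6705 \
        -host node2:1 -np 1 /bin/hostname
```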