Vincent,

Thanks for the details on the bug. This does indeed look like a problem that has 
been around for a little while when static ports are used with ORTE (the -mca 
oob_tcp_static_ipv4_ports option). It must have crept in when we refactored the 
internal regular expression mechanism for the v4 branches (and, now that I look, 
maybe as far back as v3.1). I hit this same issue in the past day or so while 
working with a different user.

I do not have a workaround to suggest at this time (sorry), but I did file a 
GitHub Issue and am looking into it. With the holiday I don't know when I will 
have a fix, but you can watch the ticket for updates:
  https://github.com/open-mpi/ompi/issues/8304

In the meantime, you could try the v3.0 series release (which predates this 
change) or the current Open MPI master branch (which approaches this a little 
differently). The same command line should work in both. Both can be downloaded 
from the links below:
  https://www.open-mpi.org/software/ompi/v3.0/
  https://www.open-mpi.org/nightly/master/


Regarding your command line, it looks pretty good:
  orterun --launch-agent /home/boubliki/openmpi/bin/orted -mca btl tcp --mca 
btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 --mca 
oob_tcp_static_ipv4_ports 6705 -host node2:1 -np 1 /path/to/some/program arg1 
.. argn

I would suggest, while you are debugging this, that you launch a program like 
/bin/hostname instead of a real MPI program. If /bin/hostname launches properly, 
then move on to an MPI program. That confirms that the runtime wired up 
correctly (oob/tcp), and then we can focus on the MPI side of the communication 
(btl/tcp). You will also want to change "-mca btl tcp" to at least "-mca btl 
tcp,self" (or better, "-mca btl tcp,vader,self" if you want shared memory); 
'self' is Open MPI's loopback component for a process sending to itself.
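For example, the first smoke test could be your same command line with only the 
btl list and the program swapped, something like:
  orterun --launch-agent /home/boubliki/openmpi/bin/orted \
      -mca btl tcp,vader,self \
      --mca btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 \
      --mca oob_tcp_static_ipv4_ports 6705 \
      -host node2:1 -np 1 /bin/hostname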

Is there a reason you are specifying --launch-agent with the full path to orted? 
Is Open MPI installed in a different path on the remote nodes? If it is 
installed in the same location on all nodes then you shouldn't need that option.
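A quick way to check (just a sketch; adjust the node name) is to compare the 
orted that each node resolves by default:
  which orted
  ssh node2 which orted
If the remote shell does not find orted, or finds a different installation, that 
would explain the need for --launch-agent.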


Thanks,
Josh



On Wed, Dec 16, 2020 at 9:23 AM Vincent Letocart via users 
<users@lists.open-mpi.org> wrote:
 
 
 Good morning
 
 I am facing a tuning problem while playing with the orterun command in order to set a tcp port within a specific range.
 Part of this may be that I'm not very familiar with the architecture of the software, and I sometimes struggle with the documentation.
 
 Here is what I'm trying to do (the problem has been reduced here to launching a single task on *one* remote node):
 orterun --launch-agent /home/boubliki/openmpi/bin/orted -mca btl tcp --mca 
btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 --mca 
oob_tcp_static_ipv4_ports 6705 -host node2:1 -np 1 /path/to/some/program arg1 
.. argn
 Those mca options are mentioned here and there in various mailing lists and archives on the net. The Open MPI version is 4.0.5.
 
 I tried different combinations, such as:
 only --mca btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 (then --report-uri shows a randomly picked tcp port number),
 or adding --mca oob_tcp_static_ipv4_ports 6705 (then --report-uri reports the tcp port I specified and everything crashes),
 or many others,
 but the result ends up being:
 [node2:4050181] *** Process received signal ***
 [node2:4050181] Signal: Segmentation fault (11)
 [node2:4050181] Signal code: Address not mapped (1)
 [node2:4050181] Failing at address: (nil)
 [node2:4050181] [ 0] /lib64/libpthread.so.0(+0x12dd0)[0x7fdaf95a9dd0]
 [node2:4050181] *** End of error message ***
 bash: line 1: 4050181 Segmentation fault      (core dumped) 
/home/boubliki/openmpi/bin/orted -mca ess "env" -mca ess_base_jobid 
"1254293504" -mca ess_base_vpid 1 -mca ess_base_num_procs "2" -mca 
orte_node_regex "node[1:1,2]@0(2)" -mca btl "tcp" --mca btl_tcp_port_min_v4 
"6706" --mca btl_tcp_port_range_v4 "10" --mca oob_tcp_static_ipv4_ports "6705" 
-mca plm "rsh" --tree-spawn -mca routed "radix" -mca orte_parent_uri 
"1254293504.0;tcp://192.168.xxx.xxx:6705" -mca orte_launch_agent 
"/home/boubliki/openmpi/bin/orted" -mca pmix "^s1,s2,cray,isolated"
 I tried on different machines, and also with different compilers (gcc 10.2 and intel 19u1). Version 4.1.0rc5 did not improve things, and neither did disabling optimization with -O0.
 
 I am not familiar with debugging such software, but I could add a delay somewhere (a sleep()), attach to the orted process on the [single] remote node, and reach line 572 with gdb.
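 Roughly, something like this on the remote node (the pgrep filter is only illustrative):
   ssh node2
   gdb -p $(pgrep -n orted)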
 
 boubliki@node1: ~/openmpi/src/openmpi-4.0.5> cat -n orte/mca/ess/base/ess_base_std_orted.c | sed -n -r -e '562,583p'
    562      if (orte_static_ports || orte_fwd_mpirun_port) {
    563          if (NULL == orte_node_regex) {
    564              /* we didn't get the node info */
    565              error = "cannot construct daemon map for static ports - no node map info";
    566              goto error;
    567          }
    568          /* extract the node info from the environment and
    569           * build a nidmap from it - this will update the
    570           * routing plan as well
    571           */
    572          if (ORTE_SUCCESS != (ret = orte_regx.build_daemon_nidmap())) {
    573              ORTE_ERROR_LOG(ret);
    574              error = "construct daemon map from static ports";
    575              goto error;
    576          }
    577          /* be sure to update the routing tree so the initial "phone home"
    578           * to mpirun goes through the tree if static ports were enabled
    579           */
    580          orte_routed.update_routing_plan(NULL);
    581          /* routing can be enabled */
    582          orte_routed_base.routing_enabled = true;
    583      }
 boubliki@node1: ~/openmpi/src/openmpi-4.0.5>
 
 The debugger led me to print the orte_regx element, which shows that the build_daemon_nidmap member holds a NULL pointer, while line 572 is precisely trying to call that function.
 (gdb) 
 Thread 1 "orted" received signal SIGSEGV, Segmentation fault.
 0x0000000000000000 in ?? ()
 (gdb) bt
 #0  0x0000000000000000 in ?? ()
 #1  0x00007f76ae3fa585 in orte_ess_base_orted_setup () at base/ess_base_std_orted.c:572
 #2  0x00007f76ae2662b4 in rte_init () at ess_env_module.c:149
 #3  0x00007f76ae432645 in orte_init (pargc=pargc@entry=0x7ffe1c87a81c, pargv=pargv@entry=0x7ffe1c87a810, flags=flags@entry=2) at runtime/orte_init.c:271
 #4  0x00007f76ae3e0bf0 in orte_daemon (argc=<optimized out>, argv=<optimized out>) at orted/orted_main.c:362
 #5  0x00007f76acc976a3 in __libc_start_main () from /lib64/libc.so.6
 #6  0x000000000040111e in _start ()
 (gdb) p orte_regx
 $1 = {init = 0x0,
   nidmap_create = 0x7f76ab46c230 <nidmap_create>,
   nidmap_parse = 0x7f76ae4180b0 <orte_regx_base_nidmap_parse>,
   extract_node_names = 0x7f76ae41bd20 <orte_regx_base_extract_node_names>,
   encode_nodemap = 0x7f76ae418730 <orte_regx_base_encode_nodemap>,
   decode_daemon_nodemap = 0x7f76ae41a190 <orte_regx_base_decode_daemon_nodemap>,
   build_daemon_nidmap = 0x0,
   generate_ppn = 0x7f76ae41b0f0 <orte_regx_base_generate_ppn>,
   parse_ppn = 0x7f76ae41b760 <orte_regx_base_parse_ppn>,
   finalize = 0x0}

 (gdb)
 I suppose the orte_regx element gets initialized somewhere through an inline function in [maybe] opal/class/opal_object.h, but I am lost in the code (and probably in some concurrency/multi-threading aspects), and in the end I can't even figure out whether I am using the mca options correctly or whether I am facing a bug in the core application.
 
 static inline opal_object_t *opal_obj_new(opal_class_t * cls)
 {
     opal_object_t *object;
     assert(cls->cls_sizeof >= sizeof(opal_object_t));

 #if OPAL_WANT_MEMCHECKER
     object = (opal_object_t *) calloc(1, cls->cls_sizeof);
 #else
     object = (opal_object_t *) malloc(cls->cls_sizeof);
 #endif
     if (opal_class_init_epoch != cls->cls_initialized) {
         opal_class_initialize(cls);
     }
     if (NULL != object) {
         object->obj_class = cls;
         object->obj_reference_count = 1;
         opal_obj_run_constructors(object);
     }
     return object;
 }
 
 Could you perhaps (first) correct my understanding of which mca options I should use so that orted on the remote node connects back to the tcp port I specify?
 Or (worse) browse the code for a potential bug related to this functionality?
 
 Thank you
 
 Vincent
 
 
 


-- 
Josh Hursey
IBM Spectrum MPI Developer
