Hi folks,

There are a few things I’d like to cover on Tuesday’s call:
* review of detailed launch timings - I’m seeing linear scaling vs ppn for the initialization code at the very beginning of MPI_Init. This consists of the following calls:

      ompi_hook_base_mpi_init_top
      ompi_mpi_thread_level
      opal_init_util
      ompi_register_mca_variables
      opal_arch_set_fortran_logical_size
      ompi_hook_base_mpi_init_top_post_opal

  This now turns out to be the single largest time component in our startup, so I’d like to understand what in that list is scale-dependent, and why.

* when I disable all but the shared-memory BTLs, MPI_Init errors out with the message that procs on different nodes have no way to connect to each other. However, we supposedly do not retrieve modex information until the first message, and the app I’m running is a simple MPI_Init/MPI_Finalize - so there is no communication. Why then do I error out during init? How does the system “know” that the procs cannot communicate? (A sketch of the test and the invocation I mean is in the P.S. below.)

* discuss Artem’s question about the behavior in https://github.com/open-mpi/ompi/issues/3269

Thanks
Ralph
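P.S. For anyone who wants to poke at this before the call: the test I’m describing is essentially just an MPI_Init/MPI_Finalize pair. A minimal sketch of that kind of reproducer is below; the timing approach (plain gettimeofday around MPI_Init) and the program name are my own choices, not anything special in the tree:

    /* init_timing.c - minimal MPI_Init/MPI_Finalize reproducer that
     * reports a coarse wall-clock time for MPI_Init, to compare runs
     * at different ppn values. */
    #include <mpi.h>
    #include <stdio.h>
    #include <sys/time.h>

    static double now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int main(int argc, char **argv)
    {
        double t0 = now();
        MPI_Init(&argc, &argv);
        double t1 = now();

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            printf("MPI_Init took %.3f s\n", t1 - t0);
        }

        MPI_Finalize();
        return 0;
    }

To restrict the run to the shared-memory BTLs I mean something along the lines of the following (the exact component names - sm vs. vader - depend on the Open MPI version, so treat them as placeholders):

    mpicc -o init_timing init_timing.c
    mpirun --mca btl self,vader -np 16 ./init_timing

Run on a single node that should just print the init time; spread across two nodes it is the case that hits the “no way to connect” error during MPI_Init.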