Hi folks

There are a few things I’d like to cover on Tuesday’s call:

* review of detailed launch timings - I’m seeing linear scaling with ppn (procs per node) for the initialization code at the very beginning of MPI_Init. That code consists of the following calls:
        ompi_hook_base_mpi_init_top
        ompi_mpi_thread_level
        opal_init_util
        ompi_register_mca_variables
        opal_arch_set_fortran_logical_size
        ompi_hook_base_mpi_init_top_post_opal

This now turns out to be the single largest time component in our startup, so I’d like to understand what in that list is scale-dependent, and why (a minimal timing reproducer is sketched below).
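
For context, a do-nothing reproducer along these lines is enough to show the measurement I mean - just a minimal sketch, the file name is made up; it simply wall-clocks MPI_Init:

        /* init_timing.c - hypothetical reproducer, not part of the OMPI tree:
         * wall-clocks MPI_Init for a do-nothing MPI_Init/MPI_Finalize app. */
        #include <mpi.h>
        #include <stdio.h>
        #include <time.h>

        int main(int argc, char **argv)
        {
            struct timespec t0, t1;

            clock_gettime(CLOCK_MONOTONIC, &t0);
            MPI_Init(&argc, &argv);
            clock_gettime(CLOCK_MONOTONIC, &t1);

            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            if (0 == rank) {
                printf("MPI_Init took %.6f s\n",
                       (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
            }

            MPI_Finalize();
            return 0;
        }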

* when I disable all but the shared-memory BTLs, MPI_Init errors out with a message saying that procs on different nodes have no way to connect to each other. However, we supposedly don’t retrieve modex information until the first message, and the app I’m running is a simple MPI_Init/MPI_Finalize - so there is no communication at all. Why then do I error out during init? How does the system “know” that the procs cannot communicate? (An example of the kind of invocation I mean is sketched below.)
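
For reference, the kind of invocation I mean looks something like this - the BTL names are an assumption (sm vs. vader depends on the release), and init_timing is just the hypothetical reproducer sketched above:

        # restrict to loopback + shared memory only, and force the two ranks
        # onto different nodes so they have no BTL in common
        mpirun -np 2 --map-by node --mca btl self,vader ./init_timing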

* discuss Artem’s question of behavior: https://github.com/open-mpi/ompi/issues/3269

Thanks
Ralph
