Hello all,

It has been a long time since I asked for hints on implementing dynamic BTL control. I have now managed to switch the MPI communication path from one BTL module (e.g. openib) to another BTL module (e.g. tcp) at runtime of a distributed application.

For this I've developed a so-called BTL Control Client (orte-btlctl) that sends control messages to all processes through the ORTE RML. These messages are received and processed in the OMPI BML. In the BML I've implemented one function to stop the MPI communication and another to change the BTL exclusivity and recalculate the btl_{send,eager,rdma} lists. All of this is done at runtime, so the computation of a distributed application running with Open MPI is not affected.
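To illustrate the exclusivity idea, here is a minimal sketch in plain C. It is a toy model and not the actual patch or any Open MPI code: struct toy_btl, toy_set_exclusivity() and toy_recompute_send() are hypothetical names, and the selection rule (only the usable modules sharing the highest exclusivity end up in the send list) is a simplified reading of how r2 picks BTLs per proc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a BTL module; NOT the real mca_btl_base_module_t. */
struct toy_btl {
    const char *name;
    unsigned exclusivity;  /* higher value wins the selection */
    int usable;            /* 0 = module must not carry traffic */
};

/* Change the exclusivity of the module called `name`.
 * Returns 0 on success, -1 if no such module exists. */
int toy_set_exclusivity(struct toy_btl *btls, size_t n,
                        const char *name, unsigned value)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(btls[i].name, name) == 0) {
            btls[i].exclusivity = value;
            return 0;
        }
    }
    return -1;
}

/* Rebuild the send list: keep only the usable modules that share the
 * highest exclusivity. `out` must have room for n entries.
 * Returns the number of entries written. */
size_t toy_recompute_send(const struct toy_btl *btls, size_t n,
                          const struct toy_btl **out)
{
    unsigned best = 0;
    int found = 0;
    size_t count = 0;

    assert(out != NULL);

    /* First pass: find the highest exclusivity among usable modules. */
    for (size_t i = 0; i < n; i++) {
        if (btls[i].usable && (!found || btls[i].exclusivity > best)) {
            best = btls[i].exclusivity;
            found = 1;
        }
    }
    if (!found) {
        return 0;
    }

    /* Second pass: collect every usable module at that exclusivity. */
    for (size_t i = 0; i < n; i++) {
        if (btls[i].usable && btls[i].exclusivity == best) {
            out[count++] = &btls[i];
        }
    }
    return count;
}
```

In this model, raising the exclusivity of "tcp" above that of "openib" and calling toy_recompute_send() again moves the send list from openib to tcp. The eager and rdma lists would be rebuilt in the same pass in a real implementation; they are left out here to keep the sketch short.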

I also managed to unload a module that is no longer used, e.g. openib after switching the MPI communication to tcp, through the already implemented function mca_bml_r2_del_btl(mca_btl_base_module_t* btl).

The Question:
The function to (re)initialise a BTL module "mca_bml_r2_add_btl(mca_btl_base_module_t* btl)" is currently not implemented. Why is it not implemented? And what has to be done if I want to implement it?

As far as I understand the internals of the OMPI layer, adding a BTL module requires the following steps:
1. Find the corresponding component in mca_btl_base_components_opened
2. Call component->btl_init to get an array of BTL modules
3. Add those modules to mca_btl_base_modules_initialized
4. Iterate through mca_btl_base_modules_initialized and add each BTL module to mca_bml_r2.btl_modules in bml_r2
5. Add the BTL module to btl_{send,eager,rdma} (if applicable) for all reachable procs
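
The five steps above can be sketched as a toy model in plain C. None of this is Open MPI code: every toy_* type and function is a hypothetical stand-in, the arrays only mimic the role of mca_btl_base_modules_initialized and mca_bml_r2.btl_modules, and reachability is reduced to a single flag per proc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical stand-ins; none of these are real Open MPI types. */
struct toy_module { const char *component; int id; };

struct toy_component {
    const char *name;
    /* Models btl_init: fills mods[] and returns the number created. */
    size_t (*init)(struct toy_module *mods, size_t cap);
};

struct toy_proc {
    int reachable;                           /* can the new module reach it? */
    const struct toy_module *send[TOY_MAX];  /* models the btl_send list */
    size_t n_send;
};

struct toy_state {
    struct toy_module initialized[TOY_MAX];  /* models modules_initialized */
    size_t n_initialized;
    const struct toy_module *bml[TOY_MAX];   /* models bml_r2 btl_modules */
    size_t n_bml;
};

/* Example init function for a hypothetical "tcp" component. */
size_t toy_tcp_init(struct toy_module *mods, size_t cap)
{
    if (cap == 0) return 0;
    mods[0].component = "tcp";
    mods[0].id = 0;
    return 1;
}

/* Returns the number of modules added, or -1 on error. */
int toy_add_btl(struct toy_state *st,
                const struct toy_component *comps, size_t n_comps,
                const char *name, struct toy_proc *procs, size_t n_procs)
{
    const struct toy_component *comp = NULL;
    struct toy_module fresh[TOY_MAX];

    assert(st != NULL);

    /* Step 1: find the component among the opened ones. */
    for (size_t i = 0; i < n_comps; i++) {
        if (strcmp(comps[i].name, name) == 0) comp = &comps[i];
    }
    if (comp == NULL || comp->init == NULL) return -1;

    /* Step 2: let the component create its modules. */
    size_t n = comp->init(fresh, TOY_MAX);

    for (size_t i = 0; i < n; i++) {
        if (st->n_initialized == TOY_MAX || st->n_bml == TOY_MAX) return -1;

        /* Step 3: record the module as initialized. */
        st->initialized[st->n_initialized] = fresh[i];
        const struct toy_module *mod = &st->initialized[st->n_initialized++];

        /* Step 4: register it with the BML. */
        st->bml[st->n_bml++] = mod;

        /* Step 5: append it to the send list of every reachable proc. */
        for (size_t p = 0; p < n_procs; p++) {
            if (procs[p].reachable && procs[p].n_send < TOY_MAX) {
                procs[p].send[procs[p].n_send++] = mod;
            }
        }
    }
    return (int)n;
}
```

A real implementation would additionally have to exchange endpoint information with the peers and rebuild the eager and rdma lists, which this sketch leaves out.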

Am I missing something?

The Background:
I should give some background on why I'm implementing this. Switching the MPI communication from a high-speed network to a network with flow control (openib->tcp) is necessary for checkpointing distributed applications in virtual machines. Admittedly, you can already checkpoint through the FT framework and BLCR in Open MPI, but virtual machines already provide trivial functions for checkpointing. Since you cannot checkpoint the hardware information of e.g. openib, you have to get rid of it before a checkpoint and switch back again on resume/continue.

Would such a feature be of general interest to you? The implementation will be made publicly available on Bitbucket by the end of March.

Thoughts? Suggestions? Or hints? :)
Thanks a lot,

Christoph Konersmann

--
Paderborn Center for Parallel Computing - PC2
University of Paderborn - Germany
http://www.pc2.de

Christoph Konersmann <c...@upb.de>
