Hello, I am trying to replicate a simple client/server MPI application using MPI_Comm_accept and MPI_Comm_connect. Before version 5.0.x, I used the ompi-server command to allow communication between the two processes, but I no longer see this command in the new 5.0.x release. Without running ompi-server, I can no longer publish the port on which the server accepts connections; a minimal example is below.
Moreover, even if I communicate the server port to the client by other means (such as writing it to a file), the two processes hang; in previous versions, I would get an error asking me to run ompi-server and pass its address through the environment.

server.c

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm client;
    char port_name[MPI_MAX_PORT_NAME];
    int size;
    MPI_Info info;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Open a port and publish it under the service name "name" */
    MPI_Open_port(MPI_INFO_NULL, port_name);
    printf("Server available at %s\n", port_name);
    MPI_Info_create(&info);
    MPI_Publish_name("name", info, port_name);

    /* Block until a client connects to the port */
    printf("Wait for client connection\n");
    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client);
    printf("Client connected\n");

    MPI_Unpublish_name("name", MPI_INFO_NULL, port_name);
    MPI_Comm_free(&client);
    MPI_Close_port(port_name);
    MPI_Finalize();
    return 0;
}

client.c

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Comm server;
    char port_name[MPI_MAX_PORT_NAME];

    MPI_Init(&argc, &argv);

    /* Look up the port published by the server under "name" */
    printf("Looking for server\n");
    MPI_Lookup_name("name", MPI_INFO_NULL, port_name);
    printf("server found at %s\n", port_name);

    /* Connect to the server on that port */
    printf("Wait for server connection\n");
    MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server);
    printf("Server connected\n");

    MPI_Comm_disconnect(&server);
    MPI_Finalize();
    return 0;
}

Error message caused by the lack of an ompi-server on which to publish the port name:

[parallels-Parallels-Virtual-Platform:61301] mca_base_component_repository_open: unable to open mca_reachable_netlink: libopen-pal.so.40: cannot open shared object file: No such file or directory (ignored)
[parallels-Parallels-Virtual-Platform:61301] mca_base_component_repository_open: unable to open mca_btl_openib: libopen-pal.so.40: cannot open shared object file: No such file or directory (ignored)
Looking for server
[parallels-Parallels-Virtual-Platform:00000] *** An error occurred in MPI_Lookup_name
[parallels-Parallels-Virtual-Platform:00000] *** reported by process [611254273,0]
[parallels-Parallels-Virtual-Platform:00000] *** on communicator MPI_COMM_SELF
[parallels-Parallels-Virtual-Platform:00000] *** MPI_ERR_NAME: invalid name argument
[parallels-Parallels-Virtual-Platform:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[parallels-Parallels-Virtual-Platform:00000] ***    and MPI will try to terminate your MPI job as well)

Thank you in advance for any pointers or documentation I could use. For additional context, I'd like to use 5.0.0rc3 since it's the latest version with ULFM, and version 4.0.3 with ULFM is broken due to a host recognition issue with ompi-server (related GitHub issue: https://github.com/open-mpi/ompi/issues/9396).

Luca Repetti