Hello,

While working in an HPC support role, I was asked to resolve an apparent discrepancy between OpenMPI 'mpi_cart_rank' behavior and the MPI spec [1, 2], which says "[out]-of-range coordinates are erroneous for non-periodic dimensions." In our environment [3], mpi_cart_rank on a topology with non-periodic dimensions returned an implicitly shifted value when asked to look up an invalid coordinate ('-1', for example). The cause was the configure flag "--with-mpi-param-check=no", which is set in the contrib/platform/mellanox/optimized file [4] and which ultimately seems to disable the coordinate bounds checking at ompi/mpi/c/cart_rank.c#L85-L91 [5]. We initially thought this could be a bug, especially after reading 'MPI_Cart_rank: Out-of-range coordinates are erroneous for non-periodic dimensions' [6], but now that I realize our build disables all parameter checking, I am reluctant to call it a 'bug'.
I'm relatively new to the MPI world and have searched this list's archives for answers, but found nothing specific to my question. This is a general question for other OpenMPI users and cluster admins about the build option --with-mpi-param-check=no. I'm looking for opinions based on experience supporting diverse user code in shared OpenMPI installations.

In the context of a shared cluster deployment in a high-performance environment, are there good arguments for permanently disabling MPI parameter checking (--with-mpi-param-check=no)? Is the goal to eliminate runtime overhead in the functions that would otherwise validate their parameters? Is that overhead substantial? I haven't found any recommendation to use the configure flag '--with-mpi-param-check=no', other than indirectly through the Mellanox optimized platform file [4].

Are any other site installers here intentionally (and permanently) disabling parameter checking in shared installations? Is anyone disabling parameter checking at runtime by default? Are there other considerations?

My impression is that it is safe to compile out parameter checking if you know your MPI code passes only legal parameter values to all MPI functions. Otherwise it seems prudent to leave parameter checking enabled (or at least leave it possible to disable at runtime).

1. MPI 4, 8.5.5, p406
2. MPI 3.1, 7.5.5, p305
3. OpenMPI 4.1.5 and 4.0.3 configured with "--with-platform=contrib/platform/mellanox/optimized", as found in https://linux.mellanox.com/public/repo/mlnx_ofed/5.8-3.0.7.0/rhel9.2/x86_64/openmpi-4.1.5a1-1.58307.x86_64.rpm (/usr/mpi/gcc/openmpi-4.1.5a1/bin/ompi_info | grep "Configure command")
4. https://github.com/open-mpi/ompi/blob/42b829b3b3190dd1987d113fd8c2810eb8584007/contrib/platform/mellanox/optimized#L55
5. https://github.com/open-mpi/ompi/blob/42b829b3b3190dd1987d113fd8c2810eb8584007/ompi/mpi/c/cart_rank.c#L85-L91
6. https://www.mail-archive.com/users@lists.open-mpi.org/msg07705.html

Eli