Package: libopenmpi3
Version: 4.1.0-3
Severity: serious

The revert of pmix has fixed some issues, but Python packages still show
autopkgtest regressions in dolfin[1], gpaw[2], gyoto[3] and mshr[4].
The error is always like this:

|A process has executed an operation involving a call
|to the fork() system call to create a child process.
|
|As a result, the libfabric EFA provider is operating in
|a condition that could result in memory corruption or
|other system errors.
|
|For the libfabric EFA provider to work safely when fork()
|is called, you will need to set the following environment
|variable:
|          RDMAV_FORK_SAFE
|
|However, setting this environment variable can result in
|signficant performance impact to your application due to
|increased cost of memory registration.
|
|You may want to check with your application vendor to see
|if an application-level alternative (of not using fork)
|exists.
|
|Your job will now abort.

If I export RDMAV_FORK_SAFE=1, the tests run fine, but (i) it seems
something in OpenMPI has changed so that those programs no longer run
without it and (ii) the warning about the performance impact needs to be
taken into account.
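
For reference, the workaround can be applied to a single test command
instead of the whole shell. The following is only a minimal sketch in
Python: the helper name run_fork_safe is made up for illustration, and
the probe command merely prints the variable rather than running one of
the affected test suites.

```python
import os
import subprocess
import sys


def run_fork_safe(cmd, fork_safe=True):
    """Run cmd in a copy of the environment with RDMAV_FORK_SAFE=1 added.

    Hypothetical helper for illustration; not part of any package here.
    """
    env = dict(os.environ)  # copy, so the parent environment is untouched
    if fork_safe:
        env["RDMAV_FORK_SAFE"] = "1"
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in for a real test command: just show what the child sees.
    probe = [sys.executable, "-c",
             "import os; print(os.environ.get('RDMAV_FORK_SAFE'))"]
    result = run_fork_safe(probe)
    print(result.stdout.strip())
```

Setting the variable per command keeps the performance cost mentioned in
the message limited to the tests that actually need it.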

Also note that these errors seem to happen only on amd64/i386; the ARM
ports run fine, maybe because libfabric-related features/packages are
missing there?
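
One rough way to compare the architectures is to check whether a
libfabric shared library is visible to the loader at all. This is only a
sketch under the assumption that the library is installed under the name
"fabric"; it says nothing about whether OpenMPI actually uses it or
whether the EFA provider is enabled.

```python
import ctypes.util
import platform


def libfabric_visible():
    """Return True if a shared library named 'fabric' can be located."""
    # find_library returns a file name such as 'libfabric.so.1', or None.
    return ctypes.util.find_library("fabric") is not None


if __name__ == "__main__":
    print(platform.machine(), libfabric_visible())
```

Running this on an amd64 and an ARM host would show whether the
difference is simply a missing library.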


Michael

[1] https://ci.debian.net/data/autopkgtest/testing/amd64/d/dolfin/9184050/log.gz
[2] https://ci.debian.net/data/autopkgtest/testing/amd64/g/gpaw/9302177/log.gz
[3] https://ci.debian.net/data/autopkgtest/testing/amd64/g/gyoto/9303088/log.gz
[4] https://ci.debian.net/data/autopkgtest/testing/amd64/m/mshr/9300183/log.gz
