On 02/09/21 at 00:16 +0200, Lucas Nussbaum wrote:
> reopen 979041
> notfixed 979041 4.1.0-7
> thanks
>
> Hi,
>
> I ran into this problem with 4.1.0-7. The ofi BTL was disabled, but not
> the ofi MTL. In some cases, both need to be disabled.
>
> You need something like:
> mtl = ^ofi
> in
reopen 979041
notfixed 979041 4.1.0-7
thanks
Hi,
I ran into this problem with 4.1.0-7. The ofi BTL was disabled, but not
the ofi MTL. In some cases, both need to be disabled.
You need something like:
mtl = ^ofi
in addition to:
btl = ^ofi
(I ran into this with https://github.com/LLNL/mpiGraph
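
For a single run, the same two exclusions can also be passed through the environment, since Open MPI reads MCA parameters from OMPI_MCA_<name> variables as well as from the params file. The following is only a sketch under that assumption; the helper name `ofi_disabled_env` is hypothetical and not part of Open MPI or the Debian packaging.

```python
import os


def ofi_disabled_env(base_env=None):
    """Return a copy of the environment with ofi excluded from BTL and MTL.

    Hypothetical helper for illustration. It overwrites any existing
    OMPI_MCA_btl / OMPI_MCA_mtl value in the copied environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    for framework in ("btl", "mtl"):
        # Equivalent to "btl = ^ofi" and "mtl = ^ofi" in the params file.
        env["OMPI_MCA_" + framework] = "^ofi"
    return env


if __name__ == "__main__":
    env = ofi_disabled_env()
    print(env["OMPI_MCA_btl"], env["OMPI_MCA_mtl"])
    # The result could then be handed to a launcher, for example
    # subprocess.run([...], env=env); the command line is left out here
    # because it depends on the local setup.
```

Setting both variables mirrors the point above: excluding only the BTL leaves the ofi MTL eligible for selection.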
red, and you may need to re-add them.
Bug reopened
No longer marked as fixed in versions openmpi/4.1.0-7.
> notfixed 979041 4.1.0-7
Bug #979041 [libopenmpi3] libopempi3: aborts python code due to libfabric
fork() issues
Ignoring request to alter fixed versions of bug #979041 to the same values
previous
On 2021-01-15 23:14, Alastair McKinstry wrote:
> Ugh. Thanks Drew.
> What are the contents of /etc/openmpi/openmpi-mca-params.conf on the
> node?
> Does a simple hello world (see Debian/tests/hello*) work without
> errors in the environment?
Hi Alastair, sorry for the delay replying to these
Package: libopenmpi3
Version: 4.1.0-6
Followup-For: Bug #979041
Control: reopen 979041
We need to reopen this bug unfortunately. The libfabric
(RDMAV_FORK_SAFE) issue is still live in python MPI applications.
You can see it in pytest-mpi tests as reported previously,
or in a rebuild of mpi4py,
Ugh. Thanks Drew.
What are the contents of /etc/openmpi/openmpi-mca-params.conf on the node?
Does a simple hello world (see Debian/tests/hello*) work without errors in the
environment?
Regards
Alastair
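
One way to answer the question about the params file is to check whether ofi is excluded for both frameworks. This is a rough sketch, not an official tool: it assumes the file holds simple `name = value` lines with `#` comments, that a leading `^` marks an exclusion list, and that a list without `^` is an inclusion list. The function names are hypothetical.

```python
def parse_mca_params(text):
    """Parse 'name = value' lines, ignoring blank lines and '#' comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def component_disabled(params, framework, component):
    """Return True if `component` cannot be selected for `framework`."""
    value = params.get(framework)
    if not value:
        return False  # no setting: every component stays eligible
    if value.startswith("^"):
        # Exclusion list: disabled only if the component is named.
        return component in [c.strip() for c in value[1:].split(",")]
    # Inclusion list: anything not named is unavailable.
    return component not in [c.strip() for c in value.split(",")]


def ofi_fully_disabled(text):
    """True only when ofi is disabled for both the btl and mtl frameworks."""
    params = parse_mca_params(text)
    return all(component_disabled(params, fw, "ofi") for fw in ("btl", "mtl"))


if __name__ == "__main__":
    # Path taken from the message above; adjust if the node differs.
    with open("/etc/openmpi/openmpi-mca-params.conf") as handle:
        print(ofi_fully_disabled(handle.read()))
```

A file containing only `btl = ^ofi` makes `ofi_fully_disabled` return False, which corresponds to the state reported for 4.1.0-7 in this bug.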
On 15/01/2021, 08:39, "Drew Parsons" wrote:
Package: libopenmpi3
Version:
Package: libopenmpi3
Version: 4.1.0-5
Followup-For: Bug #979041
There's evidence this libfabric bug is not fully fixed.
pytest-mpi (0.4-3) is failing tests:
A process has executed an operation involving a call
to the fork() system call to create a child process.
As a result, the
Package: libopenmpi3
Version: 4.1.0-3
Severity: serious
The revert of pmix has fixed some issues, but Python packages still show
autopkgtest regressions in dolfin[1], gpaw[2], gyoto[3] and mshr[4].
The error is always like this:
|A process has executed an operation involving a call
|to the