Your message dated Fri, 29 Nov 2024 08:36:18 +0000
with message-id <[email protected]>
and subject line Bug#1087800: fixed in openmpi 5.0.6-2
has caused the Debian Bug report #1087800,
regarding pmix: apply upstream patch for bigendian alignment
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)
--
1087800: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1087800
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Source: pmix
Version: 5.0.3-2
Severity: normal
Control: affects -1 src:mpi4py
Control: forwarded -1 https://github.com/openpmix/prrte/pull/2075
Open MPI occasionally crashes in PMIx, especially on the less common
architectures, in ways that are hard to debug.
For instance, mpi4py currently FTBFS on s390x (and ppc64) with a test
failure in DPM:
[sbuild:00340] PMIX ERROR: PMIX_ERR_BAD_PARAM in file ../../../../../3rd-party/prrte/src/runtime/prte_data_server.c at line 270
[sbuild:00340] PMIX ERROR: PMIX_ERR_BAD_PARAM in file ../../../../../3rd-party/prrte/src/runtime/prte_data_server.c at line 270
[sbuild:00340] PRTE ERROR: Bad parameter in file ../../../../../3rd-party/prrte/src/runtime/prte_data_server.c at line 487
[sbuild:00340] PRTE ERROR: Bad parameter in file ../../../../../3rd-party/prrte/src/runtime/prte_data_server.c at line 487
testJoin (test_dynproc.TestDPM.testJoin) ... [sbuild:00344] *** Process received signal ***
[sbuild:00343] *** Process received signal ***
[sbuild:00343] Signal: Segmentation fault (11)
[sbuild:00343] Signal code: Address not mapped (1)
[sbuild:00343] Failing at address: (nil)
[sbuild:00344] Signal: Segmentation fault (11)
[sbuild:00344] Signal code: Address not mapped (1)
[sbuild:00344] Failing at address: (nil)
[sbuild:00343] [ 0] linux-vdso64.so.1(__kernel_rt_sigreturn+0x0) [0x3fffbbfe480]
[sbuild:00343] [ 1] /lib/s390x-linux-gnu/libmpi.so.40(ompi_dpm_connect_accept+0x4d4) [0x3ffba1f431c]
[sbuild:00343] [ 2] /lib/s390x-linux-gnu/libmpi.so.40(PMPI_Comm_join+0x270) [0x3ffba22cfd0]
[sbuild:00343] [ 3] /<<PKGBUILDDIR>>/.pybuild/cpython3_3.13/build/mpi4py/MPI.cpython-313-s390x-linux-gnu.so(+0x176990) [0x3ffba6f6990]
see
https://buildd.debian.org/status/fetch.php?pkg=mpi4py&arch=s390x&ver=4.0.1-3&stamp=1731894519&raw=0
https://github.com/mpi4py/mpi4py/issues/586
https://github.com/openpmix/openpmix/issues/3447
It seems to be a big-endian issue, and not many upstream developers are
able to test on such architectures.
For this issue, however, upstream suspects that prrte PR#2075 might fix
the problem. It is a small patch and should be safe to apply in any case.
Can we apply it and see if it fixes some of the s390x PMIx problems?
https://github.com/openpmix/prrte/pull/2075
https://patch-diff.githubusercontent.com/raw/openpmix/prrte/pull/2075.patch
--- a/src/runtime/prte_data_server.c
+++ b/src/runtime/prte_data_server.c
@@ -15,7 +15,7 @@
* Copyright (c) 2015-2020 Intel, Inc. All rights reserved.
* Copyright (c) 2017-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
- * Copyright (c) 2021-2022 Nanook Consulting. All rights reserved.
+ * Copyright (c) 2021-2024 Nanook Consulting All rights reserved.
* $COPYRIGHT$
*
* Additional copyrights may follow
@@ -182,7 +182,8 @@ void prte_data_server(int status, pmix_proc_t *sender,
prte_data_object_t *data;
pmix_data_buffer_t *answer, *reply;
int rc, k;
- uint32_t ninfo, i;
+ size_t ninfo;
+ uint32_t i;
char **keys = NULL, *str;
bool wait = false;
int room_number;
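
To illustrate why the width of ninfo could matter only on big-endian
machines, here is a minimal sketch. It assumes (this is not confirmed in
the report) that the count is written as a 64-bit size_t through a
pointer to the old 32-bit variable. The helper names below are
hypothetical and are not part of prrte or PMIx; the byte layouts are
simulated explicitly so the result is the same on any build host.

```c
#include <assert.h>
#include <stdint.h>

/* Lay out a 64-bit count the way a big-endian machine such as s390x
 * does: most significant byte at the lowest address. */
void store_be64(unsigned char out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(v >> (56 - 8 * i));
    }
}

/* Same value in little-endian layout: least significant byte first. */
void store_le64(unsigned char out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (unsigned char)(v >> (8 * i));
    }
}

/* Read the first four bytes as a big-endian 32-bit value. */
uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/* Read the first four bytes as a little-endian 32-bit value. */
uint32_t load_le32(const unsigned char in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Hypothetical helper: the value a 32-bit variable would hold if a
 * 64-bit count were stored starting at its address. */
uint32_t narrow_view(uint64_t count, int big_endian)
{
    unsigned char bytes[8];

    if (big_endian) {
        store_be64(bytes, count);
        return load_be32(bytes);
    }
    store_le64(bytes, count);
    return load_le32(bytes);
}

/* Small self-check of the two cases discussed in the text. */
void narrow_view_self_check(void)
{
    assert(narrow_view(3, 0) == 3); /* little-endian: low half comes first */
    assert(narrow_view(3, 1) == 0); /* big-endian: high half comes first */
}
```

Under that assumption, a little-endian host would still see the right
small count in the narrow variable (while the extra four bytes land in
whatever is stored next to it), whereas a big-endian host would see 0,
which may be what leads to the PMIX_ERR_BAD_PARAM above. Declaring ninfo
as size_t, as the patch does, avoids the mismatch on both.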
--- End Message ---
--- Begin Message ---
Source: openmpi
Source-Version: 5.0.6-2
Done: Alastair McKinstry <[email protected]>
We believe that the bug you reported is fixed in the latest version of
openmpi, which is due to be installed in the Debian FTP archive.
A summary of the changes between this version and the previous one is
attached.
Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to [email protected],
and the maintainer will reopen the bug report if appropriate.
Debian distribution maintenance software
pp.
Alastair McKinstry <[email protected]> (supplier of updated openmpi package)
(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing [email protected])
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Format: 1.8
Date: Wed, 20 Nov 2024 11:33:09 +0000
Source: openmpi
Architecture: source
Version: 5.0.6-2
Distribution: unstable
Urgency: medium
Maintainer: Debian Science Maintainers
<[email protected]>
Changed-By: Alastair McKinstry <[email protected]>
Closes: 1087800
Changes:
openmpi (5.0.6-2) unstable; urgency=medium
.
* Typo in libprrte.so link
* Upstream patch for prte type error. Closes: #1087800
Checksums-Sha1:
b3bcf2eb8f527021767fd2dc10ab216ef46e3b53 2820 openmpi_5.0.6-2.dsc
03f75d70eee8eeecbb3ea7831b72892f63cddce3 68808 openmpi_5.0.6-2.debian.tar.xz
Checksums-Sha256:
12e2e26be7e33e675eb2a069ac3dc6dbbebc5be70f8aa0562f8ca44a7bcd9636 2820
openmpi_5.0.6-2.dsc
a89628d13269dbdf78e77ebd62f74131fa0d9a9d7bf4febd55f5e118ba9f9b09 68808
openmpi_5.0.6-2.debian.tar.xz
Files:
e1f63ba9a263d29f1ee3905263656198 2820 net optional openmpi_5.0.6-2.dsc
84450cc3432614d3d400e1f32c82d351 68808 net optional
openmpi_5.0.6-2.debian.tar.xz
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEEgjg86RZbNHx4cIGiy+a7Tl2a06UFAmdJdVwACgkQy+a7Tl2a
06X93Q//XL0Tn6fzZThDuCH+8r2y5DEhf6kx5SlovCwGKWUr5R6JmJVIEskt8GDi
U58vyFzLrZ5vVa5TQNarhjsq/1d/xlR7C41sQg98pfOeIcXhoCttURyfrPvZehDj
vZbTt2bJJ/gdXU87myv4nQEH61B5TlcS5dzdaG97AKzM3MCfKFeFohIRrQLdRxj2
gpJRjcHkM8xv5tjJyO0xAmrmfFtnhGxuqOtVcE0WjvPBE1SM1FGLc+cpQHx1ferk
+xTY7DIMI4bFua1TxzEAk6Gjt7ii0faCzBhW97uwjg0qlv6jtkwlGHxWL1ElB6VH
qOQaNohfbgIhGNtPUVsU5WLNXosvmTarZW4IHN4cuNE7AifFjuMa4/KmHbXFXG4u
K8liIpW+MDNpTVyO6W7+f20lXuQACtBJRI5WKxqiGrk3wu6rXsxe/UntRdS2mvrb
BNrd4sw3rLKrzebNh0ArnVUdp4LsSUod0xLpQ/P/7tnbKzvimgNT8DlPP+8PNu73
ZYSGXKlU6QiCE6BSoVYtuNZ1pG9KDRwaMJBuq9Lh9AsIecfI+9bd2PCcVhPNA2VC
JVlx71TYZKt6CKQ4sP1/cjNT9aU6YXhNtLdnVRCqmB87Gxrl0KaIweMyP+rvRVN5
4Cp7ojQCsFP0YbCUbjTW9i6g2NxGpLWbGNSjW2qTHrdIDbpfidk=
=TX86
-----END PGP SIGNATURE-----
--- End Message ---
--
debian-science-maintainers mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/debian-science-maintainers