Bug#1066735: mpich: fails to connect processes and report ranks with trivial mpi test

2024-03-26 Thread Samuel Thibault
Samuel Thibault, on Tue 26 Mar 2024 18:38:22 +0100, wrote:
> Samuel Thibault, on Fri 15 Mar 2024 10:31:54 +0100, wrote:
> > Lucas Nussbaum, on Wed 13 Mar 2024 15:56:40 +0100, wrote:
> > I'm 0/1
> > I'm 0/1
> > 
> > and the same with a hosts file containing localhost twice.
> 
> I tried disabling PMIX (commenting out the PMIX:=
> --with-pmix=/usr/lib/$(DEB_HOST_MULTIARCH)/pmix2 assignments), and that
> fixed it.
> 
> Unless somebody complains, I will NMU that change, to get mpich
> working again in unstable.

I have uploaded the attached change to DELAYED/2.

Samuel
diff -Nru mpich-4.2.0/debian/changelog mpich-4.2.0/debian/changelog
--- mpich-4.2.0/debian/changelog	2024-02-27 09:59:43.0 +0100
+++ mpich-4.2.0/debian/changelog	2024-03-26 22:40:26.0 +0100
@@ -1,3 +1,10 @@
+mpich (4.2.0-5.1) unstable; urgency=medium
+
+  * Non-maintainer upload.
+  * rules: Re-disable pmix: Closes: #1066735
+
+ -- Samuel Thibault   Tue, 26 Mar 2024 22:40:26 +0100
+
 mpich (4.2.0-5) unstable; urgency=medium
 
   * Install mod files in include dir until all deps updated
diff -Nru mpich-4.2.0/debian/rules mpich-4.2.0/debian/rules
--- mpich-4.2.0/debian/rules	2024-02-27 09:59:43.0 +0100
+++ mpich-4.2.0/debian/rules	2024-03-26 22:40:26.0 +0100
@@ -54,12 +54,12 @@
 PMIX:=
 ifeq (,$(findstring  $(DEB_HOST_ARCH),$(NO_CH4_ARCH)))
 DEVICE:= --with-device=ch4:ofi 
-   PMIX:=  --with-pmix=/usr/lib/$(DEB_HOST_MULTIARCH)/pmix2
+   #PMIX:=  --with-pmix=/usr/lib/$(DEB_HOST_MULTIARCH)/pmix2
 endif
 ifneq (,$(filter  $(DEB_HOST_ARCH),$(UCX_ARCH)))
 DEVICE:= --with-device=ch4:ucx
 UCX:= --with-ucx=/usr
-   PMIX:=  --with-pmix=/usr/lib/$(DEB_HOST_MULTIARCH)/pmix2
+   #PMIX:=  --with-pmix=/usr/lib/$(DEB_HOST_MULTIARCH)/pmix2
 endif
 
 extra_flags += \


Bug#1066735: mpich: fails to connect processes and report ranks with trivial mpi test

2024-03-26 Thread Samuel Thibault
Hello,

Samuel Thibault, on Fri 15 Mar 2024 10:31:54 +0100, wrote:
> Lucas Nussbaum, on Wed 13 Mar 2024 15:56:40 +0100, wrote:
> > > [P0T0] Starting EZTrace (pid: 878489)...
> > > [P0T0] MPI mode selected
> > > This program requires 2 MPI processes, aborting...
> > > dir: mpi_ping_trace
> > > /bin/rm: cannot remove 'mpi_ping_trace': Directory not empty
> > > [P0T0] Stopping EZTrace (pid:878489)...
> > > [P0T0] Starting EZTrace (pid: 878488)...
> > > [P0T0] MPI mode selected
> > > This program requires 2 MPI processes, aborting...
> > > [P0T0] Stopping EZTrace (pid:878488)...
> > >  [OK] 
> 
> The test does run 2 processes. I tried this:
> 
> $ cat test.c
> #include <stdio.h>
> #include <mpi.h>
> int main(int argc, char *argv[]) {
>   int rank, size;
>   MPI_Init(&argc, &argv);
>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>   MPI_Comm_size(MPI_COMM_WORLD, &size);
>   printf("I'm %d/%d\n", rank, size);
>   return 0;
> }
> 
> And it reports:
> 
> $ mpirun -np 2 ./test
> Authorization required, but no authorization protocol specified
> 
> Authorization required, but no authorization protocol specified
> 
> Authorization required, but no authorization protocol specified
> 
> Authorization required, but no authorization protocol specified
> 
> I'm 0/1
> I'm 0/1
> 
> and the same with a hosts file containing localhost twice.

I tried disabling PMIX (commenting out the PMIX:=
--with-pmix=/usr/lib/$(DEB_HOST_MULTIARCH)/pmix2 assignments), and that
fixed it.

Unless somebody complains, I will NMU that change, to get mpich
working again in unstable.

Samuel



Processed: Re: Bug#1066735: mpich: fails to connect processes and report ranks with trivial mpi test

2024-03-15 Thread Debian Bug Tracking System
Processing control commands:

> notfound -1 4.1.2-3
Bug #1066735 [mpich] mpich: fails to connect processes and report ranks
Ignoring request to alter found versions of bug #1066735 to the same values 
previously set

-- 
1066735: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1066735
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1066735: mpich: fails to connect processes and report ranks with trivial mpi test

2024-03-15 Thread Samuel Thibault
Control: notfound -1 4.1.2-3

Samuel Thibault, on Fri 15 Mar 2024 10:31:54 +0100, wrote:
> $ mpirun -np 2 ./test
> Authorization required, but no authorization protocol specified
> 
> Authorization required, but no authorization protocol specified
> 
> Authorization required, but no authorization protocol specified
> 
> Authorization required, but no authorization protocol specified
> 
> I'm 0/1
> I'm 0/1

Note: this is new with mpich 4.2.0; 4.1.2-3 is fine.

Samuel



Processed: Re: Bug#1066735: mpich: fails to connect processes and report ranks with trivial mpi test

2024-03-15 Thread Debian Bug Tracking System
Processing control commands:

> reassign -1 mpich
Bug #1066735 [src:eztrace] eztrace: FTBFS: dh_auto_test: error: cd build-mpich 
&& make -j1 test ARGS\+=--verbose ARGS\+=-j1 -k ARGS\+=--extra-verbose returned 
exit code 2
Bug reassigned from package 'src:eztrace' to 'mpich'.
No longer marked as found in versions eztrace/2.1-6.
Ignoring request to alter fixed versions of bug #1066735 to the same values 
previously set
> retitle -1 mpich: fails to connect processes and report ranks
Bug #1066735 [mpich] eztrace: FTBFS: dh_auto_test: error: cd build-mpich && 
make -j1 test ARGS\+=--verbose ARGS\+=-j1 -k ARGS\+=--extra-verbose returned 
exit code 2
Changed Bug title to 'mpich: fails to connect processes and report ranks' from 
'eztrace: FTBFS: dh_auto_test: error: cd build-mpich && make -j1 test 
ARGS\+=--verbose ARGS\+=-j1 -k ARGS\+=--extra-verbose returned exit code 2'.
> affects -1 + eztrace
Bug #1066735 [mpich] mpich: fails to connect processes and report ranks
Added indication that 1066735 affects eztrace

-- 
1066735: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1066735
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1066735: mpich: fails to connect processes and report ranks with trivial mpi test

2024-03-15 Thread Samuel Thibault
Control: reassign -1 mpich
Control: retitle -1 mpich: fails to connect processes and report ranks
Control: affects -1 + eztrace

Hello,

Lucas Nussbaum, on Wed 13 Mar 2024 15:56:40 +0100, wrote:
> > [P0T0] Starting EZTrace (pid: 878489)...
> > [P0T0] MPI mode selected
> > This program requires 2 MPI processes, aborting...
> > dir: mpi_ping_trace
> > /bin/rm: cannot remove 'mpi_ping_trace': Directory not empty
> > [P0T0] Stopping EZTrace (pid:878489)...
> > [P0T0] Starting EZTrace (pid: 878488)...
> > [P0T0] MPI mode selected
> > This program requires 2 MPI processes, aborting...
> > [P0T0] Stopping EZTrace (pid:878488)...
> >  [OK] 

The test does run 2 processes. I tried this:

$ cat test.c
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
  int rank, size;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  printf("I'm %d/%d\n", rank, size);
  return 0;
}

And it reports:

$ mpirun -np 2 ./test
Authorization required, but no authorization protocol specified

Authorization required, but no authorization protocol specified

Authorization required, but no authorization protocol specified

Authorization required, but no authorization protocol specified

I'm 0/1
I'm 0/1

and the same with a hosts file containing localhost twice.

Samuel
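
For anyone wanting to check this symptom automatically, here is a minimal
sketch in plain C (no MPI needed). It takes the lines printed by test.c and
decides whether they describe a single MPI world. In the failing case both
processes print "I'm 0/1", i.e. each one believes it is alone, whereas a
working launch of two processes should print ranks 0 and 1 with size 2. The
helper name ranks_form_one_world and the MAX_RANKS limit are hypothetical,
made up for this illustration; they are not part of mpich, eztrace, or the
Debian packaging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Arbitrary upper bound for this sketch (an assumption, not an MPI limit). */
#define MAX_RANKS 64

/*
 * Hypothetical helper: return true only if the `count` output lines have the
 * form "I'm <rank>/<size>", every size equals `count`, and the ranks are
 * exactly 0..count-1 with no repeats.
 */
bool ranks_form_one_world(const char *const lines[], size_t count)
{
    bool seen[MAX_RANKS] = { false };

    assert(lines != NULL);
    if (count == 0 || count > MAX_RANKS)
        return false;

    for (size_t i = 0; i < count; i++) {
        int rank, size;

        /* Each line must match the printf format used by test.c. */
        if (sscanf(lines[i], "I'm %d/%d", &rank, &size) != 2)
            return false;

        /* Every process must see the full world size. */
        if (size < 0 || (size_t)size != count)
            return false;

        /* Ranks must be in range and must not repeat. */
        if (rank < 0 || (size_t)rank >= count || seen[rank])
            return false;
        seen[rank] = true;
    }
    return true;
}
```

Fed the two lines from this report ("I'm 0/1" twice) the helper returns
false; fed "I'm 0/2" and "I'm 1/2" it returns true. It only inspects the
printed text, so it says nothing about why the processes failed to connect.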