Re: [OMPI devel] Announcing Open MPI v5.0.0rc2

2022-01-10 Thread Ralph Castain via devel
Hi Marco

I added the libtool tweak to PMIx and changed the "interface" variable in PRRTE 
to "intf" - hopefully, the offending header didn't define that one too!

I'm not sure of the problem you're encountering, but I do note the PMIx error 
message:

> [116] 
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/openpmix/src/mca/ptl/base/ptl_base_listener.c:498
>  bind() failed for socket 13 storage size 16: Cannot assign requested address

IIRC, we may have had problems with sockets in Cygwin before, yes? You might 
need to look at the referenced code area to see if there needs to be some 
Cygwin-related tweak.

Ralph

> On Jan 9, 2022, at 11:09 PM, Marco Atzeri via devel 
>  wrote:
> 
> On 10.01.2022 06:50, Marco Atzeri wrote:
>> On 09.01.2022 15:54, Ralph Castain via devel wrote:
>>> Hi Marco
>>> 
>>> Try the patch here (for the prrte 3rd-party subdirectory): 
>>> https://github.com/openpmix/prrte/pull/1173
>>> 
>>> 
>>> Ralph
>>> 
>> Thanks Ralph,
>> I will do so on the next build,
>> as I still need to test the current build.
> 
> The tests are not satisfactory
> 
> I have only one test fail
>  FAIL: dlopen_test.exe
> 
> that I suspect is due to a wrong name in the test
> 
> but a simple run fails
> 
> $ mpirun -n 4 ./hello_c.exe
> [116] 
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/openpmix/src/mca/ptl/base/ptl_base_listener.c:498
>  bind() failed for socket 13 storage size 16: Cannot assign requested address
> Hello, world, I am 0 of 1, (Open MPI v5.0.0rc2, package: Open MPI 
> Marco@LAPTOP-82F08ILC Distribution, ident: 5.0.0rc2, repo rev: v5.0.0rc2, Oct 
> 18, 2021, 125)
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
> 
>  PML add procs failed
>  --> Returned "Not found" (-13) instead of "Success" (0)
> --
> [LAPTOP-82F08ILC:0] *** An error occurred in MPI_Init
> [LAPTOP-82F08ILC:0] *** reported by process [36547002369,1]
> [LAPTOP-82F08ILC:0] *** on a NULL communicator
> [LAPTOP-82F08ILC:0] *** Unknown error
> [LAPTOP-82F08ILC:0] *** MPI_ERRORS_ARE_FATAL (processes in this 
> communicator will now abort,
> [LAPTOP-82F08ILC:0] *** and MPI will try to terminate your MPI job as 
> well)
> --
> 
> Any suggestions on what to look for?
> 
> Regards
> Marco
> 
> 




Re: [OMPI devel] Announcing Open MPI v5.0.0rc2

2022-01-09 Thread Marco Atzeri via devel

On 09.01.2022 15:54, Ralph Castain via devel wrote:

Hi Marco

Try the patch here (for the prrte 3rd-party subdirectory): 
https://github.com/openpmix/prrte/pull/1173


Ralph



Thanks Ralph,

I will do so on the next build,
as I still need to test the current build.


To complete the build I also needed the attached:


5.0.0rc2-amend-DEF.patch
to correct a missing def in OPAL

CYGWIN-undefined.patch
to pass "-no-undefined" to pmix and romio, to allow building shared libs

CYGWIN-interface-workaround.patch
temporary workaround to avoid the "interface" collision with 
Windows headers

I will look into providing a better solution for this point.

Regards
Marco
--- origsrc/openmpi-5.0.0rc2/opal/util/minmax.h 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/opal/util/minmax.h 2022-01-09 12:46:07.148969800 +0100
@@ -40,7 +40,7 @@ OPAL_DEFINE_MINMAX(float, float)
 OPAL_DEFINE_MINMAX(double, double)
 OPAL_DEFINE_MINMAX(void *, ptr)
 
-#if OPAL_C_HAVE__GENERIC
+#ifdef OPAL_C_HAVE__GENERIC
 #define opal_min(a, b) \
 (_Generic((a) + (b),\
  int8_t: opal_min_8,\
--- origsrc/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c 2021-10-18 17:28:05.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c 2022-01-09 11:13:29.687212300 +0100
@@ -160,6 +160,9 @@ void prte_oob_tcp_peer_try_connect(int f
 prte_oob_tcp_peer_t *peer;
 prte_oob_tcp_addr_t *addr;
 bool connected = false;
+#if defined interface
+#  undef interface
+#endif
 prte_if_t *interface;
 char *host;
 
--- origsrc/openmpi-5.0.0rc2/opal/mca/btl/tcp/btl_tcp_proc.c 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/opal/mca/btl/tcp/btl_tcp_proc.c 2022-01-09 12:48:27.075225600 +0100
@@ -160,6 +160,9 @@ static int mca_btl_tcp_proc_create_inter
fields needed in the proc version */
 for (i = 0; i < btl_proc->proc_addr_count; i++) {
 /* Construct opal_if_t objects for the remote interfaces */
+#ifdef interface
+#  undef interface
+#endif
 opal_if_t *interface = OBJ_NEW(opal_if_t);
 if (NULL == interface) {
 rc = OPAL_ERR_OUT_OF_RESOURCE;
--- origsrc/openmpi-5.0.0rc2/3rd-party/openpmix/configure.ac 2021-10-18 17:28:09.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/openpmix/configure.ac 2022-01-09 04:51:14.271907200 +0100
@@ -337,6 +337,23 @@ AC_ARG_ENABLE(werror,
 ])
 
 
+# no-undefined needed on some platform for shared lib
+
+
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+## Add in the -no-undefined flag to LDFLAGS for libtool.
+AC_MSG_RESULT([yes])
+LDFLAGS="$LDFLAGS -no-undefined"
+;;
+  *)
+## Don't add in anything.
+AC_MSG_RESULT([no])
+;;
+esac
+
+
 # Version information
 
 
--- origsrc/openmpi-5.0.0rc2/3rd-party/romio341/configure.ac 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/romio341/configure.ac 2022-01-09 04:58:32.150499100 +0100
@@ -1784,6 +1784,19 @@ AM_PROG_LIBTOOL
 # support gcov test coverage information
 PAC_ENABLE_COVERAGE
 
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+## Add in the -no-undefined flag to LDFLAGS for libtool.
+AC_MSG_RESULT([yes])
+LDFLAGS="$LDFLAGS -no-undefined"
+;;
+  *)
+## Don't add in anything.
+AC_MSG_RESULT([no])
+;;
+esac
+
 AC_MSG_NOTICE([setting CC to $CC])
 AC_MSG_NOTICE([setting F77 to $F77])
 AC_MSG_NOTICE([setting TEST_CC to $TEST_CC])
--- origsrc/openmpi-5.0.0rc2/3rd-party/romio341/mpl/configure.ac 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/romio341/mpl/configure.ac 2022-01-09 05:01:27.462723200 +0100
@@ -1077,6 +1077,19 @@ CFLAGS=""
 AX_GCC_FUNC_ATTRIBUTE(fallthrough)
 PAC_POP_ALL_FLAGS
 
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+## Add in the -no-undefined flag to LDFLAGS for libtool.
+AC_MSG_RESULT([yes])
+LDFLAGS="$LDFLAGS -no-undefined"
+;;
+  *)
+## Don't add in anything.
+AC_MSG_RESULT([no])
+;;
+esac
+
 dnl Final output
 AC_CONFIG_FILES([Makefile localdefs include/mpl_timer.h])
 AC_OUTPUT


Re: [OMPI devel] Announcing Open MPI v5.0.0rc2

2022-01-09 Thread Ralph Castain via devel
Hi Marco

Try the patch here (for the prrte 3rd-party subdirectory): 
https://github.com/openpmix/prrte/pull/1173


Ralph

> On Jan 9, 2022, at 12:29 AM, Marco Atzeri via devel 
>  wrote:
> 
> On 01.01.2022 20:07, Barrett, Brian wrote:
>> Marco -
>> There are some patches that haven't made it to the 5.0 branch to make this 
>> behavior better.  I didn't get a chance to back port them before the holiday 
>> break, but they will be in the next RC.  That said, the issue below is a 
>> warning, not an error, so you should still end up with a build that works 
>> (with an included PMIx).  The issue is that pkg-config can't be found, so we 
>> have trouble guessing what libraries are dependencies of PMIx, which is a 
>> potential problem in complicated builds with static libraries.
>> Brian
> 
> Thanks Brian,
> 
> the build error was actually in the thread settings.
> 
> I was using up to v4.1
> 
>  --with-threads=posix
> 
> which is no longer accepted, but no error is reported,
> causing a different setting that does not work on Cygwin.
> Removing the option seems to work
> 
> 
> I have however found a logic error in prrte that
> probably needs a verification of all the HAVE_*_H macros
> between configuration and code
> 
> 
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/odls/default/odls_default_module.c:114:14:
>  fatal error: sys/ptrace.h: No such file or directory
>  114 | #include <sys/ptrace.h>
>  |  ^~
> 
> caused by
> 
> $ grep -rH HAVE_SYS_PTRACE_H .
> ./3rd-party/prrte/config.log:| #define HAVE_SYS_PTRACE_H 0
> ./3rd-party/prrte/config.log:| #define HAVE_SYS_PTRACE_H 0
> ./3rd-party/prrte/config.log:#define HAVE_SYS_PTRACE_H 0
> ./3rd-party/prrte/config.status:D["HAVE_SYS_PTRACE_H"]=" 0"
> ./3rd-party/prrte/src/include/prte_config.h:#define HAVE_SYS_PTRACE_H 0
> 
> while the code in
>3rd-party/prrte/src/mca/odls/default/odls_default_module.c
> has
> 
> #ifdef HAVE_SYS_PTRACE_H
> #include <sys/ptrace.h>
> #endif
> 
> 
> currently I am stuck at
> 
> 0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:61:
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c: In function 
> ‘prte_oob_tcp_peer_try_connect’:
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:163:16:
>  error: expected identifier or ‘(’ before ‘struct’
>  163 | prte_if_t *interface;
>  |^
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:180:19:
>  error: expected ‘{’ before ‘=’ token
>  180 | interface = PRTE_NEW(prte_if_t);
>  |   ^
> 
> 
> not sure if it is caused by the new GCC 11 requirement or by wrong headers
> being pulled in.
> 
> Has anyone built with GCC 11 ?
> 
> Regards
> Marco
> 





Re: [OMPI devel] Announcing Open MPI v5.0.0rc2

2022-01-01 Thread Barrett, Brian via devel
Marco -

There are some patches that haven't made it to the 5.0 branch to make this 
behavior better.  I didn't get a chance to back port them before the holiday 
break, but they will be in the next RC.  That said, the issue below is a 
warning, not an error, so you should still end up with a build that works (with 
an included PMIx).  The issue is that pkg-config can't be found, so we have 
trouble guessing what libraries are dependencies of PMIx, which is a potential 
problem in complicated builds with static libraries.

Brian


From: devel  on behalf of Marco Atzeri via 
devel 
Sent: Wednesday, December 22, 2021 9:09 AM
To: devel@lists.open-mpi.org
Cc: Marco Atzeri
Subject: RE: [EXTERNAL] [OMPI devel] Announcing Open MPI v5.0.0rc2




On 18.10.2021 20:39, Austen W Lauria via devel wrote:
> The second release candidate for the Open MPI v5.0.0 release is posted
> at: https://www.open-mpi.org/software/ompi/v5.0/
> 


Question:
is there an easy way to configure and build the 3rd-party packages included
in the package source for a simple build?


configure: = done with 3rd-party/openpmix configure =
checking for pmix.h... no
configure: Looking for pc file for pmix
Package pmix was not found in the pkg-config search path.
Perhaps you should add the directory containing `pmix.pc'
to the PKG_CONFIG_PATH environment variable
Package 'pmix', required by 'virtual:world', not found
configure: WARNING: Could not find viable pmix.pc


Regards
Marco

Cygwin package maintainer

